# Convention in MMPretrain
## Model Naming Convention
We follow the convention below to name models, and contributors are advised to follow the same style. A model name is divided into five parts: algorithm information, module information, pretrain information, training information and data information. Different parts are concatenated by underscores `'_'`, and words within the same part are concatenated by dashes `'-'`.
```text
{algorithm info}_{module info}_{pretrain info}_{training info}_{data info}
```
- `algorithm info` (optional): The main algorithm information; it includes the main training algorithms like MAE, BEiT, etc.
- `module info`: The module information; it usually includes the backbone name, such as resnet, vit, etc.
- `pretrain info` (optional): The pre-trained model information, for example, that the pre-trained model was trained on ImageNet-21k.
- `training info`: The training information, i.e. the training schedule, including batch size, learning rate schedule, data augmentation and the like.
- `data info`: The data information; it usually includes the dataset name, input size and so on, such as imagenet, cifar, etc.
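
The snippet below is an illustration only, not part of MMPretrain: a minimal sketch (with a made-up helper name) of how the five parts are combined into a full model name, where optional parts are simply dropped.

```python
# Hypothetical helper, for illustration only; not part of MMPretrain.
def build_model_name(algorithm=None, module=None, pretrain=None,
                     training=None, data=None):
    """Join the non-empty name parts with underscores."""
    parts = [algorithm, module, pretrain, training, data]
    return "_".join(p for p in parts if p)


# Words inside one part are joined by dashes, parts by underscores.
print(build_model_name(module="vit-base-p32",
                       pretrain="clip-openai-pre",
                       training="3rdparty",
                       data="in1k"))
# -> vit-base-p32_clip-openai-pre_3rdparty_in1k
```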
### Algorithm information
The name of the main algorithm used to train the model. For example:
- `simclr`
- `mocov2`
- `eva-mae-style`
Models trained by supervised image classification can omit this field.
### Module information
The modules of the model. Usually, the backbone must be included in this field, while the neck and head information can be omitted. For example:
- `resnet50`
- `vit-base-p16`
- `swin-base`
### Pretrain information
If the model is fine-tuned from a pre-trained model, we need to record some information about the pre-trained model. For example:
- The source of the pre-trained model: `fb`, `openai`, etc.
- The method used to train the pre-trained model: `clip`, `mae`, `distill`, etc.
- The dataset used for pre-training: `in21k`, `laion2b`, etc. (`in1k` can be omitted.)
- The training duration: `300e`, `1600e`, etc.
Not all of this information is necessary; select only the information needed to distinguish different pre-trained models.
At the end of this field, append `-pre` as an identifier, like `mae-in21k-pre`.
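
As a non-authoritative sketch (the helper name below is made up and not part of MMPretrain), the pretrain information field can be thought of as the selected fields joined by dashes, with `pre` appended at the end:

```python
# Hypothetical helper, for illustration only; not part of MMPretrain.
def build_pretrain_info(*fields):
    """Join the chosen pre-training fields with dashes and append the 'pre' identifier."""
    return "-".join([*fields, "pre"])


print(build_pretrain_info("mae", "in21k"))    # -> mae-in21k-pre
print(build_pretrain_info("clip", "openai"))  # -> clip-openai-pre
```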
### Training information
The training schedule, including training type, batch size, learning rate schedule, data augmentation, special loss functions and so on:
- Batch size: the format is `{gpu x batch_per_gpu}`, such as `8xb32` (8 GPUs, 32 samples per GPU).
Training type (mainly seen in transformer networks, such as the `ViT` algorithms, whose training is usually divided into two types: pre-training and fine-tuning):
- `ft` : configuration file for fine-tuning
- `pt` : configuration file for pretraining
Training recipe. Usually, only the parts that differ from the original paper are marked. These methods are arranged in the order `{pipeline aug}-{train aug}-{loss trick}-{scheduler}-{epochs}`.
- `coslr-200e` : use a cosine scheduler to train for 200 epochs
- `autoaug-mixup-lbs-coslr-50e` : use `autoaug`, `mixup`, `label smooth` and a `cosine scheduler` to train for 50 epochs
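
As an illustrative sketch only (the helper below is hypothetical, not part of MMPretrain), the training information is the `{gpu x batch_per_gpu}` field followed by the recipe parts, all joined by dashes:

```python
# Hypothetical helper, for illustration only; not part of MMPretrain.
def build_training_info(gpus, batch_per_gpu, *recipe):
    """'{gpus}xb{batch_per_gpu}' followed by the recipe parts, joined by dashes."""
    return "-".join([f"{gpus}xb{batch_per_gpu}", *recipe])


print(build_training_info(8, 32, "coslr", "200e"))          # -> 8xb32-coslr-200e
print(build_training_info(8, 256, "amp", "coslr", "300e"))  # -> 8xb256-amp-coslr-300e
```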
If the model is converted from a third-party repository, such as the official repository, the training information can be omitted and `3rdparty` is used as an identifier.
### Data information
- `in1k` : `ImageNet1k` dataset, the default input image size is 224x224;
- `in21k` : `ImageNet21k` dataset, also called `ImageNet22k` dataset, the default input image size is 224x224;
- `in1k-384px` : Indicates that the input image size is 384x384;
- `cifar100`
### Model Name Example
```text
vit-base-p32_clip-openai-pre_3rdparty_in1k
```
- `vit-base-p32`: The module information
- `clip-openai-pre`: The pre-train information.
  - `clip`: The pre-train method is clip.
  - `openai`: The pre-trained model comes from OpenAI.
  - `pre`: The pre-train identifier.
- `3rdparty`: The model is converted from a third-party repository.
- `in1k`: Dataset information. The model is trained on the ImageNet-1k dataset and the input size is `224x224`.
```text
beit_beit-base-p16_8xb256-amp-coslr-300e_in1k
```
- `beit`: The algorithm information
- `beit-base-p16`: The module information; since the backbone is a modified ViT from BEiT, the backbone name is also `beit`.
- `8xb256-amp-coslr-300e`: The training information.
  - `8xb256`: Use 8 GPUs and the batch size on each GPU is 256.
  - `amp`: Use automatic-mixed-precision training.
  - `coslr`: Use cosine annealing learning rate scheduler.
  - `300e`: Train for 300 epochs.
- `in1k`: Dataset information. The model is trained on the ImageNet-1k dataset and the input size is `224x224`.
## Config File Naming Convention
The naming of config files is almost the same as the model name, with several differences:
- The training information is necessary, and cannot be `3rdparty`.
- If the config file only includes backbone settings, with neither head settings nor dataset settings, we will name it `{module info}_headless.py`. This kind of config file is usually used for third-party pre-trained models on large datasets.
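
For example, under this rule a backbone-only config for a `vit-base-p16` backbone would be named as follows (a hypothetical example, assuming such a config exists):

```text
vit-base-p16_headless.py
```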
## Checkpoint Naming Convention
The naming of the weight files mainly includes the model name, date and hash value.
```text
{model_name}_{date}-{hash}.pth
```
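
For example, a checkpoint of the `vit-base-p32_clip-openai-pre_3rdparty_in1k` model above would look like the following, where the date and hash are made-up placeholders:

```text
vit-base-p32_clip-openai-pre_3rdparty_in1k_20230101-0123abcd.pth
```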