# Convention in MMPretrain

## Model Naming Convention

We follow the convention below to name models, and contributors are advised to follow the same style. A model name is divided into five parts: algorithm info, module info, pretrain info, training info and data info. Logically, different parts are concatenated by underscores `'_'`, and words within the same part are concatenated by dashes `'-'`.

```text
{algorithm info}_{module info}_{pretrain info}_{training info}_{data info}
```

- `algorithm info` (optional): The main algorithm information, i.e. the main training algorithm, such as MAE, BEiT, etc.
- `module info`: The module information, which usually includes the backbone name, such as resnet, vit, etc.
- `pretrain info` (optional): The pre-trained model information, for example, that the pre-trained model was trained on ImageNet-21k.
- `training info`: The training information, i.e. the training schedule, including batch size, lr schedule, data augmentation and the like.
- `data info`: The data information, which usually includes the dataset name, input size and so on, such as imagenet, cifar, etc.

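For illustration, the second example name discussed later in this document can be split back into these parts with plain Python. This is only a sketch of how the fields line up, not an MMPretrain API:

```python
# Illustrative only: parts are separated by underscores, and words inside
# each part are separated by dashes.
name = 'beit_beit-base-p16_8xb256-amp-coslr-300e_in1k'

parts = name.split('_')
print(parts)
# ['beit', 'beit-base-p16', '8xb256-amp-coslr-300e', 'in1k']
# Here the optional pretrain info is omitted, so only the algorithm,
# module, training and data fields appear.
```
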
### Algorithm information

The name of the main algorithm used to train the model. For example:

- `simclr`
- `mocov2`
- `eva-mae-style`

Models trained by supervised image classification can omit this field.

### Module information

The modules of the model. Usually the backbone must be included in this field, while the neck and head information can be omitted. For example:

- `resnet50`
- `vit-base-p16`
- `swin-base`

### Pretrain information

If the model is fine-tuned from a pre-trained model, we need to record some information about the pre-trained model. For example:

- The source of the pre-trained model: `fb`, `openai`, etc.
- The method used to train the pre-trained model: `clip`, `mae`, `distill`, etc.
- The dataset used for pre-training: `in21k`, `laion2b`, etc. (`in1k` can be omitted.)
- The training duration: `300e`, `1600e`, etc.

Not all of this information is necessary; select only what is needed to distinguish different pre-trained models.

At the end of this field, add `-pre` as an identifier, like `mae-in21k-pre`.

### Training information

The training schedule, including training type, batch size, lr schedule, data augmentation, special loss functions and so on:

- The batch size format is `{gpu x batch_per_gpu}`, such as `8xb32`.
- Training type (mainly seen in transformer networks, such as the `ViT` algorithm, which is usually divided into two training types: pre-training and fine-tuning):
  - `ft`: configuration file for fine-tuning
  - `pt`: configuration file for pre-training

- Training recipe. Usually, only the parts that differ from the original paper are marked. These methods are arranged in the order `{pipeline aug}-{train aug}-{loss trick}-{scheduler}-{epochs}`.
  - `coslr-200e`: use a cosine scheduler to train for 200 epochs
  - `autoaug-mixup-lbs-coslr-50e`: use `autoaug`, `mixup`, `label smooth` and `cosine scheduler` to train for 50 epochs

If the model is converted from a third-party repository, such as the official repository, the training information can be omitted and `3rdparty` is used as an identifier.

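As a sketch of how a training information field breaks down, the `{gpu x batch_per_gpu}` and epoch tokens can be picked out with a small regular expression. The helper below is hypothetical, not part of MMPretrain:

```python
import re

def parse_training_info(field: str) -> dict:
    """Hypothetical helper: split a field such as '8xb256-amp-coslr-300e'."""
    info = {'tokens': field.split('-')}
    batch = re.search(r'(\d+)xb(\d+)', field)   # '{gpu}xb{batch_per_gpu}'
    if batch:
        info['num_gpus'] = int(batch.group(1))
        info['batch_per_gpu'] = int(batch.group(2))
    epochs = re.search(r'(\d+)e\b', field)      # training duration, e.g. '300e'
    if epochs:
        info['epochs'] = int(epochs.group(1))
    return info

print(parse_training_info('8xb256-amp-coslr-300e'))
# {'tokens': ['8xb256', 'amp', 'coslr', '300e'],
#  'num_gpus': 8, 'batch_per_gpu': 256, 'epochs': 300}
```
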
### Data information

- `in1k`: `ImageNet1k` dataset, with a default input image size of 224x224;
- `in21k`: `ImageNet21k` dataset, also called the `ImageNet22k` dataset, with a default input image size of 224x224;
- `in1k-384px`: indicates that the input image size is 384x384;
- `cifar100`: `CIFAR-100` dataset.

### Model Name Example

```text
vit-base-p32_clip-openai-pre_3rdparty_in1k
```

- `vit-base-p32`: The module information.
- `clip-openai-pre`: The pre-train information.
  - `clip`: The pre-train method is CLIP.
  - `openai`: The pre-trained model comes from OpenAI.
  - `pre`: The pre-train identifier.
- `3rdparty`: The model is converted from a third-party repository.
- `in1k`: Dataset information. The model is trained on the ImageNet-1k dataset and the input size is `224x224`.

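Since this full name is how the model is indexed in the model zoo, it can also be looked up programmatically. Assuming MMPretrain is installed, a search along these lines is possible (the exact set of returned names depends on your MMPretrain version):

```python
from mmpretrain import list_models

# Find every registered model name containing this backbone/pretrain
# combination; which names appear depends on the installed version.
matches = [name for name in list_models() if 'vit-base-p32_clip' in name]
print(matches)
```
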
```text
beit_beit-base-p16_8xb256-amp-coslr-300e_in1k
```

- `beit`: The algorithm information.
- `beit-base-p16`: The module information; since the backbone is a modified ViT from BEiT, the backbone name is also `beit`.
- `8xb256-amp-coslr-300e`: The training information.
  - `8xb256`: Use 8 GPUs with a batch size of 256 on each GPU.
  - `amp`: Use automatic mixed-precision training.
  - `coslr`: Use a cosine annealing learning rate scheduler.
  - `300e`: Train for 300 epochs.
- `in1k`: Dataset information. The model is trained on the ImageNet-1k dataset and the input size is `224x224`.

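These names are also what the high-level APIs accept. Assuming MMPretrain is installed and this name exists in your version's model zoo, the model can be built directly from it:

```python
from mmpretrain import get_model

# Build the model from its zoo name; pass pretrained=True to also download
# and load the released checkpoint, if one is available for this name.
model = get_model('beit_beit-base-p16_8xb256-amp-coslr-300e_in1k',
                  pretrained=False)
print(type(model))
```
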
## Config File Naming Convention

The naming of config files is almost the same as that of model names, with several differences:

- The training information is necessary and cannot be `3rdparty`.
- If the config file only includes backbone settings, with neither head settings nor dataset settings, we name it `{module info}_headless.py`. This kind of config file is usually used for third-party pre-trained models on large datasets.

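For example, a config following this convention for the BEiT model above could be loaded with MMEngine. The path below is hypothetical and depends on where the config actually lives in your checkout:

```python
from mmengine.config import Config

# Hypothetical path following the naming convention above; adjust it to the
# actual location of the config in your MMPretrain checkout.
cfg = Config.fromfile(
    'configs/beit/beit_beit-base-p16_8xb256-amp-coslr-300e_in1k.py')
print(cfg.filename)
```
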
## Checkpoint Naming Convention

The naming of the weight file mainly includes the model name, date and hash value.

```text
{model_name}_{date}-{hash}.pth
```

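A minimal sketch of how such a filename breaks down, assuming an 8-digit date stamp and a short hexadecimal hash (the filename below is hypothetical and the pattern is not part of MMPretrain):

```python
import re

# Illustrative pattern for '{model_name}_{date}-{hash}.pth'.
pattern = re.compile(
    r'^(?P<model_name>.+)_(?P<date>\d{8})-(?P<hash>[0-9a-f]+)\.pth$')

match = pattern.match('resnet50_8xb32_in1k_20210831-ea4938fc.pth')  # hypothetical file
if match:
    print(match.groupdict())
# {'model_name': 'resnet50_8xb32_in1k', 'date': '20210831', 'hash': 'ea4938fc'}
```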