# Convention in MMPretrain

## Model Naming Convention

We follow the convention below to name models, and contributors are advised to follow the same style. A model name is divided into five parts: algorithm information, module information, pretrain information, training information and data information. Logically, different parts are concatenated by underscores `'_'`, and words within the same part are concatenated by dashes `'-'`.

```text
{algorithm info}_{module info}_{pretrain info}_{training info}_{data info}
```

- `algorithm info` (optional): The main algorithm information, which includes the main training algorithm, such as MAE, BEiT, etc.
- `module info`: The module information, which usually includes the backbone name, such as resnet, vit, etc.
- `pretrain info` (optional): The pre-trained model information, e.g. whether the pre-trained model was trained on ImageNet-21k.
- `training info`: The training information, i.e. the training schedule, including batch size, learning rate schedule, data augmentation and the like.
- `data info`: The data information, which usually includes the dataset name, input size and so on, such as imagenet, cifar, etc.

### Algorithm information

The name of the main algorithm used to train the model. For example:

- `simclr`
- `mocov2`
- `eva-mae-style`

Models trained by supervised image classification can omit this field.

### Module information

The modules of the model. The backbone must be included in this field, while the neck and head information can be omitted. For example:

- `resnet50`
- `vit-base-p16`
- `swin-base`

### Pretrain information

If the model is fine-tuned from a pre-trained model, we need to record some information about the pre-trained model. For example:

- The source of the pre-trained model: `fb`, `openai`, etc.
- The method used to train the pre-trained model: `clip`, `mae`, `distill`, etc.
- The dataset used for pre-training: `in21k`, `laion2b`, etc. (`in1k` can be omitted.)
- The training duration: `300e`, `1600e`, etc.

Not all of this information is necessary; select only what is needed to distinguish different pre-trained models. At the end of this field, append `-pre` as an identifier, like `mae-in21k-pre`.

### Training information

The training schedule, including the training type, batch size, learning rate schedule, data augmentation, special loss functions and so on.

- Batch size: the format is `{gpu x batch_per_gpu}`, such as `8xb32`.

Training type (mainly seen in transformer networks, such as the `ViT` algorithm, which is usually divided into two training types: pre-training and fine-tuning):

- `ft` : configuration file for fine-tuning
- `pt` : configuration file for pre-training

Training recipe: usually, only the parts that differ from the original paper are marked. These items are arranged in the order `{pipeline aug}-{train aug}-{loss trick}-{scheduler}-{epochs}`.

- `coslr-200e` : use a cosine scheduler to train for 200 epochs
- `autoaug-mixup-lbs-coslr-50e` : use `autoaug`, `mixup`, `label smooth` and a `cosine scheduler` to train for 50 epochs

If the model is converted from a third-party repository, such as the official repository, the training information can be omitted and `3rdparty` is used as an identifier.
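As a rough illustration of how a training-info string is assembled, here is a minimal sketch. The helper name `format_training_info` and its parameters are hypothetical and not part of the MMPretrain API; it simply mirrors the format described above.

```python
def format_training_info(num_gpus, batch_per_gpu, recipe=(), epochs=None):
    """Hypothetical helper: build a training-info string such as '8xb256-amp-coslr-300e'.

    ``recipe`` lists the markers that differ from the original paper, already
    ordered as ``{pipeline aug}-{train aug}-{loss trick}-{scheduler}``.
    """
    parts = [f'{num_gpus}xb{batch_per_gpu}', *recipe]
    if epochs is not None:
        parts.append(f'{epochs}e')
    return '-'.join(parts)


print(format_training_info(8, 256, recipe=('amp', 'coslr'), epochs=300))
# -> 8xb256-amp-coslr-300e
```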
### Data information

- `in1k` : The `ImageNet-1k` dataset; the input image size defaults to 224x224.
- `in21k` : The `ImageNet-21k` dataset, also called the `ImageNet-22k` dataset; the input image size defaults to 224x224.
- `in1k-384px` : Indicates that the input image size is 384x384.
- `cifar100`

### Model Name Example

```text
vit-base-p32_clip-openai-pre_3rdparty_in1k
```

- `vit-base-p32`: The module information.
- `clip-openai-pre`: The pretrain information.
  - `clip`: The pre-training method is CLIP.
  - `openai`: The pre-trained model comes from OpenAI.
  - `pre`: The pretrain identifier.
- `3rdparty`: The model is converted from a third-party repository.
- `in1k`: Dataset information. The model is trained on the ImageNet-1k dataset and the input size is `224x224`.

```text
beit_beit-base-p16_8xb256-amp-coslr-300e_in1k
```

- `beit`: The algorithm information.
- `beit-base-p16`: The module information; since the backbone is a modified ViT from BEiT, the backbone name is also `beit`.
- `8xb256-amp-coslr-300e`: The training information.
  - `8xb256`: Use 8 GPUs, and the batch size on each GPU is 256.
  - `amp`: Use automatic mixed-precision training.
  - `coslr`: Use a cosine annealing learning rate scheduler.
  - `300e`: Train for 300 epochs.
- `in1k`: Dataset information. The model is trained on the ImageNet-1k dataset and the input size is `224x224`.

## Config File Naming Convention

The naming of config files is almost the same as the model names, with several differences:

- The training information is necessary and cannot be `3rdparty`.
- If the config file only includes backbone settings, with neither head settings nor dataset settings, we name it `{module info}_headless.py`. This kind of config file is usually used for third-party pre-trained models on large datasets.

## Checkpoint Naming Convention

The name of a checkpoint file mainly includes the model name, date and hash value.

```text
{model_name}_{date}-{hash}.pth
```
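The sketch below shows one way to split such a filename back into its three parts. It assumes an 8-digit date and a hexadecimal hash, and the helper itself is hypothetical rather than part of MMPretrain; the filename in the usage example reuses the model name from above with a made-up date and hash.

```python
import re

# Hypothetical parser for the `{model_name}_{date}-{hash}.pth` pattern above.
# It assumes an 8-digit date (e.g. YYYYMMDD) and a hexadecimal hash suffix.
CHECKPOINT_RE = re.compile(
    r'^(?P<model_name>.+)_(?P<date>\d{8})-(?P<hash>[0-9a-f]+)\.pth$')


def parse_checkpoint_name(filename):
    """Split a checkpoint filename into model name, date and hash."""
    match = CHECKPOINT_RE.match(filename)
    if match is None:
        raise ValueError(f'{filename!r} does not follow the checkpoint naming convention')
    return match.groupdict()


# Illustrative filename (the date and hash here are made up):
print(parse_checkpoint_name('vit-base-p32_clip-openai-pre_3rdparty_in1k_20221220-abcd1234.pth'))
# -> {'model_name': 'vit-base-p32_clip-openai-pre_3rdparty_in1k',
#     'date': '20221220', 'hash': 'abcd1234'}
```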