mirror of
https://github.com/open-mmlab/mmselfsup.git
synced 2025-06-03 14:59:38 +08:00
Model Zoo
All models and some of the benchmark results are listed below.
Benchmarks
ImageNet
ImageNet has multiple versions, but the most commonly used one is ILSVRC 2012. The classification results below are obtained by linear evaluation or fine-tuning, starting from the pre-trained weights produced by each algorithm. A `/` marks a setting for which no result or link is provided.
Algorithm | Backbone | Epoch | Batch Size | Linear Eval (Top-1 %) | Fine-tuning (Top-1 %) | Pretrain Links | Linear Eval Links | Fine-tuning Links
---|---|---|---|---|---|---|---|---
Relative-Loc | ResNet50 | 70 | 512 | 40.4 | / | config, model, log | config, model, log | /
Rotation-Pred | ResNet50 | 70 | 128 | 47.0 | / | config, model, log | config, model, log | /
NPID | ResNet50 | 200 | 256 | 58.3 | / | config, model, log | config, model, log | /
SimCLR | ResNet50 | 200 | 256 | 62.7 | / | config, model, log | config, model, log | /
SimCLR | ResNet50 | 200 | 4096 | 66.9 | / | config, model, log | config, model, log | /
SimCLR | ResNet50 | 800 | 4096 | 69.2 | / | config, model, log | config, model, log | /
MoCo v2 | ResNet50 | 200 | 256 | 67.5 | / | config, model, log | config, model, log | /
BYOL | ResNet50 | 200 | 4096 | 71.8 | / | config, model, log | config, model, log | /
SwAV | ResNet50 | 200 | 256 | 70.5 | / | config, model, log | config, model, log | /
DenseCL | ResNet50 | 200 | 256 | 63.5 | / | config, model, log | config, model, log | /
SimSiam | ResNet50 | 100 | 256 | 68.3 | / | config, model, log | config, model, log | /
SimSiam | ResNet50 | 200 | 256 | 69.8 | / | config, model, log | config, model, log | /
BarlowTwins | ResNet50 | 300 | 2048 | 71.8 | / | config, model, log | config, model, log | /
MoCo v3 | ResNet50 | 100 | 4096 | 69.6 | / | config, model, log | config, model, log | /
MoCo v3 | ResNet50 | 300 | 4096 | 72.8 | / | config, model, log | config, model, log | /
MoCo v3 | ResNet50 | 800 | 4096 | 74.4 | / | config, model, log | config, model, log | /
MoCo v3 | ViT-small | 300 | 4096 | 73.6 | / | config, model, log | config, model, log | /
MoCo v3 | ViT-base | 300 | 4096 | 76.9 | 83.0 | config, model, log | config, model, log | config, model, log
MoCo v3 | ViT-large | 300 | 4096 | / | 83.7 | config, model, log | / | config, model, log
MAE | ViT-base | 300 | 4096 | 60.8 | 83.1 | config, model, log | config, model, log | config, model, log
MAE | ViT-base | 400 | 4096 | 62.5 | 83.3 | config, model, log | config, model, log | config, model, log
MAE | ViT-base | 800 | 4096 | 65.1 | 83.3 | config, model, log | config, model, log | config, model, log
MAE | ViT-base | 1600 | 4096 | 67.1 | 83.5 | config, model, log | config, model, log | config, model, log
MAE | ViT-large | 400 | 4096 | 70.7 | 85.2 | config, model, log | config, model, log | config, model, log
MAE | ViT-large | 800 | 4096 | 73.7 | 85.4 | config, model, log | config, model, log | config, model, log
MAE | ViT-large | 1600 | 4096 | 75.5 | 85.7 | config, model, log | config, model, log | config, model, log
MAE | ViT-huge-FT-224 | 1600 | 4096 | / | 86.9 | config, model, log | / | config, model, log
MAE | ViT-huge-FT-448 | 1600 | 4096 | / | 87.3 | config, model, log | / | config, model, log
CAE | ViT-base | 300 | 2048 | / | 83.3 | config, model, log | / | config, model, log
SimMIM | Swin-base-FT192 | 100 | 2048 | / | 82.7 | config, model, log | / | config, model, log
SimMIM | Swin-base-FT224 | 100 | 2048 | / | 83.5 | config, model, log | / | config, model, log
SimMIM | Swin-base-FT224 | 800 | 2048 | / | 83.7 | config, model, log | / | config, model, log
SimMIM | Swin-large-FT224 | 800 | 2048 | / | 84.8 | config, model, log | / | config, model, log
MaskFeat | ViT-base | 300 | 2048 | / | 83.4 | config, model, log | / | config, model, log
BEiT | ViT-base | 300 | 2048 | / | 83.1 | config, model, log | / | config, model, log
MILAN | ViT-base | 400 | 4096 | 78.9 | 85.3 | config, model, log | config, model, log | config, model, log
BEiT v2 | ViT-base | 300 | 2048 | / | 85.0 | config, model, log | / | config, model, log
EVA | ViT-base | 400 | 4096 | 69.0 | 83.7 | config, model, log | config, model, log | config, model, log
MixMIM | MixMIM-Base | 400 | 2048 | / | 84.6 | config, model, log | / | config, model, log
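
For readers unfamiliar with the metric, the Top-1 % figures above count a prediction as correct only when the highest-scoring class matches the ground-truth label. The snippet below is a minimal, framework-free sketch of that computation. The function name `top1_accuracy` and the toy scores are illustrative assumptions, not part of the MMSelfSup API and not real model output.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]],
                  labels: Sequence[int]) -> float:
    """Return top-1 accuracy in percent (hypothetical helper).

    scores[i][c] is the classifier score for sample i and class c;
    labels[i] is the ground-truth class index of sample i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the highest score
        # (the first such index if several scores tie).
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)


# Toy example: 4 samples, 3 classes (made-up scores).
toy_scores = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.6, 0.3, 0.1],  # predicts class 0
]
toy_labels = [1, 0, 1, 0]
print(top1_accuracy(toy_scores, toy_labels))  # 75.0
```

In linear evaluation the scores would come from a linear classifier trained on top of the frozen pre-trained backbone. In fine-tuning they would come from the whole network after its weights have been updated. The metric itself is the same in both cases.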