Model Zoo

All models and a subset of the benchmark results are listed below.

Statistics

  • Number of papers: 17

  • Number of checkpoints: 60

Benchmarks

ImageNet

ImageNet has multiple versions; the most commonly used one is ILSVRC 2012. The classification results below are obtained by linear evaluation or fine-tuning, starting from the pre-trained weights produced by each algorithm.
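
To make the linear evaluation protocol concrete, here is a minimal, dependency-free sketch. It is not the MMSelfSup implementation: `frozen_backbone`, `train_linear_head`, and `predict` are hypothetical names, the "backbone" is a hand-written fixed feature map standing in for a pre-trained encoder, and the data is a toy two-class problem. The point it illustrates is that only the linear head receives gradient updates.

```python
import math


def frozen_backbone(x):
    """Hypothetical stand-in for a pre-trained encoder: a fixed feature map.

    Nothing in this function is ever updated during training.
    """
    return [x[0] + x[1], x[0] - x[1]]


def train_linear_head(samples, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression head on frozen features with gradient descent."""
    # Features are computed once because the backbone is frozen.
    feats = [frozen_backbone(x) for x in samples]
    w = [0.0, 0.0]
    b = 0.0
    n = len(feats)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for f, y in zip(feats, labels):
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid probability of class 1
            err = p - y                     # gradient of the log loss w.r.t. z
            grad_w[0] += err * f[0]
            grad_w[1] += err * f[1]
            grad_b += err
        # Only the head parameters (w, b) are updated.
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b):
    """Classify a raw input by passing it through the frozen backbone and head."""
    f = frozen_backbone(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0


# Toy data (an assumption for illustration): class 1 when x[0] + x[1] > 0.
samples = [(1, 1), (2, 0.5), (-1, -1), (-2, -0.5), (0.5, 1), (-0.5, -1)]
labels = [1, 1, 0, 0, 1, 0]
w, b = train_linear_head(samples, labels)
predictions = [predict(x, w, b) for x in samples]
```

Fine-tuning differs in that the backbone weights are also updated on the labelled data, which is why the two protocols are reported in separate columns.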

| Algorithm | Backbone | Epoch | Batch Size | Linear Eval Top-1 (%) | Fine-tuning Top-1 (%) | Pretrain Links | Linear Eval Links | Fine-tuning Links |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Relative-Loc | ResNet50 | 70 | 512 | 40.4 | / | config, model, log | config, model, log | / |
| Rotation-Pred | ResNet50 | 70 | 128 | 47.0 | / | config, model, log | config, model, log | / |
| NPID | ResNet50 | 200 | 256 | 58.3 | / | config, model, log | config, model, log | / |
| SimCLR | ResNet50 | 200 | 256 | 62.7 | / | config, model, log | config, model, log | / |
| SimCLR | ResNet50 | 200 | 4096 | 66.9 | / | config, model, log | config, model, log | / |
| SimCLR | ResNet50 | 800 | 4096 | 69.2 | / | config, model, log | config, model, log | / |
| MoCo v2 | ResNet50 | 200 | 256 | 67.5 | / | config, model, log | config, model, log | / |
| BYOL | ResNet50 | 200 | 4096 | 71.8 | / | config, model, log | config, model, log | / |
| SwAV | ResNet50 | 200 | 256 | 70.5 | / | config, model, log | config, model, log | / |
| DenseCL | ResNet50 | 200 | 256 | 63.5 | / | config, model, log | config, model, log | / |
| SimSiam | ResNet50 | 100 | 256 | 68.3 | / | config, model, log | config, model, log | / |
| SimSiam | ResNet50 | 200 | 256 | 69.8 | / | config, model, log | config, model, log | / |
| BarlowTwins | ResNet50 | 300 | 2048 | 71.8 | / | config, model, log | config, model, log | / |
| MoCo v3 | ResNet50 | 100 | 4096 | 69.6 | / | config, model, log | config, model, log | / |
| MoCo v3 | ResNet50 | 300 | 4096 | 72.8 | / | config, model, log | config, model, log | / |
| MoCo v3 | ResNet50 | 800 | 4096 | 74.4 | / | config, model, log | config, model, log | / |
| MoCo v3 | ViT-small | 300 | 4096 | 73.6 | / | config, model, log | config, model, log | / |
| MoCo v3 | ViT-base | 300 | 4096 | 76.9 | 83.0 | config, model, log | config, model, log | config, model, log |
| MoCo v3 | ViT-large | 300 | 4096 | / | 83.7 | config, model, log | / | config, model, log |
| MAE | ViT-base | 300 | 4096 | 60.8 | 83.1 | config, model, log | config, model, log | config, model, log |
| MAE | ViT-base | 400 | 4096 | 62.5 | 83.3 | config, model, log | config, model, log | config, model, log |
| MAE | ViT-base | 800 | 4096 | 65.1 | 83.3 | config, model, log | config, model, log | config, model, log |
| MAE | ViT-base | 1600 | 4096 | 67.1 | 83.5 | config, model, log | config, model, log | config, model, log |
| MAE | ViT-large | 400 | 4096 | 70.7 | 85.2 | config, model, log | config, model, log | config, model, log |
| MAE | ViT-large | 800 | 4096 | 73.7 | 85.4 | config, model, log | config, model, log | config, model, log |
| MAE | ViT-large | 1600 | 4096 | 75.5 | 85.7 | config, model, log | config, model, log | config, model, log |
| MAE | ViT-huge-FT-224 | 1600 | 4096 | / | 86.9 | config, model, log | / | config, model, log |
| MAE | ViT-huge-FT-448 | 1600 | 4096 | / | 87.3 | config, model, log | / | config, model, log |
| CAE | ViT-base | 300 | 2048 | / | 83.3 | config, model, log | / | config, model, log |
| SimMIM | Swin-base-FT192 | 100 | 2048 | / | 82.7 | config, model, log | / | config, model, log |
| SimMIM | Swin-base-FT224 | 100 | 2048 | / | 83.5 | config, model, log | / | config, model, log |
| SimMIM | Swin-base-FT224 | 800 | 2048 | / | 83.8 | config, model, log | / | config, model, log |
| SimMIM | Swin-large-FT224 | 800 | 2048 | / | 84.8 | config, model, log | / | config, model, log |
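
The Top-1 figures reported above are the percentage of evaluation images whose highest-scoring predicted class matches the ground-truth label. As a minimal sketch of that metric (the function name `top1_accuracy` and the example scores are illustrative assumptions, not part of MMSelfSup):

```python
def top1_accuracy(scores, labels):
    """Return top-1 accuracy in percent.

    scores: list of per-class score lists, one per sample.
    labels: list of integer class indices, one per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score; the first index wins on ties.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Hypothetical scores for four samples over three classes.
example_scores = [
    [0.1, 0.7, 0.2],  # predicts class 1, label 1: correct
    [0.8, 0.1, 0.1],  # predicts class 0, label 0: correct
    [0.3, 0.3, 0.4],  # predicts class 2, label 2: correct
    [0.2, 0.5, 0.3],  # predicts class 1, label 2: wrong
]
example_labels = [1, 0, 2, 2]
accuracy = top1_accuracy(example_scores, example_labels)  # 75.0
```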