diff --git a/configs/selfsup/mae/README.md b/configs/selfsup/mae/README.md
index 41599b7a..caeec51c 100644
--- a/configs/selfsup/mae/README.md
+++ b/configs/selfsup/mae/README.md
@@ -29,12 +29,118 @@ methods that use only ImageNet-1K data. Transfer performance in downstream tasks
 
 ## Models and Benchmarks
 
-Here, we report the results of the model, which is pre-trained on ImageNet-1k
-for 400 epochs, the details are below:
-
-| Backbone | Pre-train epoch | Fine-tuning Top-1 | Pre-train Config | Fine-tuning Config | Download |
-| :------: | :-------------: | :---------------: | :--------------: | :----------------: | :------: |
-| ViT-B/16 | 400 | 83.1 | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/selfsup/mae/mae_vit-b-p16_8xb512-coslr-400e_in1k.py) | [config](https://github.com/open-mmlab/mmselfsup/blob/master/configs/benchmarks/classification/imagenet/vit-base-p16_ft-8xb128-coslr-100e_in1k.py) | [model](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-400e_in1k-224_20220223-85be947b.pth) \| [log](https://download.openmmlab.com/mmselfsup/mae/mae_vit-base-p16_8xb512-coslr-300e_in1k-224_20220210_140925.log.json) |
+<table>
+  <thead>
+    <tr>
+      <th rowspan="2">Algorithm</th>
+      <th rowspan="2">Backbone</th>
+      <th rowspan="2">Epoch</th>
+      <th rowspan="2">Batch Size</th>
+      <th colspan="2">Results (Top-1 %)</th>
+      <th colspan="3">Links</th>
+    </tr>
+    <tr>
+      <th>Linear Eval</th>
+      <th>Fine-tuning</th>
+      <th>Pretrain</th>
+      <th>Linear Eval</th>
+      <th>Fine-tuning</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td rowspan="9">MAE</td>
+      <td>ViT-base</td><td>300</td><td>4096</td><td>60.8</td><td>83.1</td>
+      <td>config | model | log</td><td>config | model | log</td><td>config | model | log</td>
+    </tr>
+    <tr>
+      <td>ViT-base</td><td>400</td><td>4096</td><td>62.5</td><td>83.3</td>
+      <td>config | model | log</td><td>config | model | log</td><td>config | model | log</td>
+    </tr>
+    <tr>
+      <td>ViT-base</td><td>800</td><td>4096</td><td>65.1</td><td>83.3</td>
+      <td>config | model | log</td><td>config | model | log</td><td>config | model | log</td>
+    </tr>
+    <tr>
+      <td>ViT-base</td><td>1600</td><td>4096</td><td>67.1</td><td>83.5</td>
+      <td>config | model | log</td><td>config | model | log</td><td>config | model | log</td>
+    </tr>
+    <tr>
+      <td>ViT-large</td><td>400</td><td>4096</td><td>70.7</td><td>85.2</td>
+      <td>config | model | log</td><td>config | model | log</td><td>config | model | log</td>
+    </tr>
+    <tr>
+      <td>ViT-large</td><td>800</td><td>4096</td><td>73.7</td><td>85.4</td>
+      <td>config | model | log</td><td>config | model | log</td><td>config | model | log</td>
+    </tr>
+    <tr>
+      <td>ViT-large</td><td>1600</td><td>4096</td><td>75.5</td><td>85.7</td>
+      <td>config | model | log</td><td>config | model | log</td><td>config | model | log</td>
+    </tr>
+    <tr>
+      <td>ViT-huge-FT-224</td><td>1600</td><td>4096</td><td>/</td><td>86.9</td>
+      <td>config | model | log</td><td>/</td><td>config | model | log</td>
+    </tr>
+    <tr>
+      <td>ViT-huge-FT-448</td><td>1600</td><td>4096</td><td>/</td><td>87.3</td>
+      <td>config | model | log</td><td>/</td><td>config | model | log</td>
+    </tr>
+  </tbody>
+</table>
 
 ## Citation
diff --git a/docs/en/model_zoo.md b/docs/en/model_zoo.md
index d8d0b29e..dd7f3130 100644
--- a/docs/en/model_zoo.md
+++ b/docs/en/model_zoo.md
@@ -230,7 +230,7 @@ ImageNet has multiple versions, but the most commonly used one is ILSVRC 2012. T
       <td>config | model | log</td>
     </tr>
     <tr>
-      <td rowspan="7">MAE</td>
+      <td rowspan="9">MAE</td>
       <td>ViT-base</td>
       <td>300</td>
       <td>4096</td>
@@ -300,6 +300,26 @@ ImageNet has multiple versions, but the most commonly used one is ILSVRC 2012. T
       <td>config | model | log</td>
       <td>config | model | log</td>
     </tr>
+    <tr>
+      <td>ViT-huge-FT-224</td>
+      <td>1600</td>
+      <td>4096</td>
+      <td>/</td>
+      <td>86.9</td>
+      <td>config | model | log</td>
+      <td>/</td>
+      <td>config | model | log</td>
+    </tr>
+    <tr>
+      <td>ViT-huge-FT-448</td>
+      <td>1600</td>
+      <td>4096</td>
+      <td>/</td>
+      <td>87.3</td>
+      <td>config | model | log</td>
+      <td>/</td>
+      <td>config | model | log</td>
+    </tr>
     <tr>
       <td>CAE</td>
       <td>ViT-base</td>
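In the two new ViT-huge rows, the `-FT-224` / `-FT-448` suffixes denote the fine-tuning input resolution; both rows share the same 1600-epoch pre-training. The patch-count consequence of that resolution change can be sketched in a few lines (a minimal illustration assuming the MAE paper defaults of 16×16 patches and a 75% pre-training mask ratio, neither of which is stated in this diff; `mae_patch_stats` is a hypothetical helper, not part of mmselfsup):

```python
# Patch arithmetic behind the ViT-huge-FT-224 / ViT-huge-FT-448 rows.
# Assumptions (MAE paper defaults, not stated in this diff): 16x16 patches
# and a 75% mask ratio during pre-training; fine-tuning uses all patches.
def mae_patch_stats(image_size: int, patch_size: int = 16,
                    mask_ratio: float = 0.75) -> dict:
    """Return patch counts for a square input of the given resolution."""
    side = image_size // patch_size      # patches per side
    num_patches = side * side            # total non-overlapping patches
    num_masked = int(num_patches * mask_ratio)
    return {
        "num_patches": num_patches,
        "num_masked": num_masked,        # hidden from the encoder in pre-training
        "num_visible": num_patches - num_masked,
    }

# 224px input (pre-training and the -FT-224 fine-tuning resolution).
print(mae_patch_stats(224))
# 448px input (-FT-448): 4x the patches, hence the costlier fine-tuning.
print(mae_patch_stats(448))
```

Quadrupling the patch count at 448px is what makes the higher-resolution fine-tuning run substantially more expensive for the same backbone.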