[Reproduction] Reproduce training results of T2T-ViT (#610)

* Add cosine cool down lr updater

* Use ema hook

* Update decay mult

* Update configs.

* Update T2T-ViT readme and format all readme

* Update swin readme

* Update tnt readme

* Add docstring for `CosineAnnealingCooldownLrUpdaterHook`.

* Update t2t readme and metafile
Ma Zerun 2021-12-28 15:09:40 +08:00 committed by GitHub
parent d1d9ebfc67
commit 159b38d276
27 changed files with 434 additions and 222 deletions


@@ -68,4 +68,4 @@ data = dict(
ann_file='data/imagenet/meta/val.txt',
pipeline=test_pipeline))
evaluation = dict(interval=10, metric='accuracy')
evaluation = dict(interval=1, metric='accuracy', save_best='auto')
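This change switches from evaluating every 10 epochs to every epoch, and `save_best='auto'` keeps the checkpoint with the best score (`'auto'` selects the first returned metric). A minimal sketch of the bookkeeping this implies; the `metrics` dict and `save_checkpoint` callback are hypothetical stand-ins, not MMClassification's actual EvalHook API:

```python
# Hypothetical sketch of what `save_best='auto'` amounts to after each
# validation epoch; not the real EvalHook implementation.
best_score = float('-inf')

def after_val_epoch(epoch, metrics, save_checkpoint):
    global best_score
    # 'auto' picks the first evaluation metric, e.g. top-1 accuracy
    key, score = next(iter(metrics.items()))
    if score > best_score:
        best_score = score
        save_checkpoint(f'best_{key}_epoch_{epoch}.pth')
```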


@@ -25,15 +25,13 @@ Within Convolutional Neural Network (CNN), the convolution operations are good a
## Results and models
Some pre-trained models are converted from [official repo](https://github.com/pengzhiliang/Conformer).
## ImageNet-1k
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:------:|:--------:|
| Conformer-tiny-p16\* | 23.52 | 4.90 | 81.31 | 95.60 | [config](configs/conformer/conformer-tiny-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-tiny-p16_3rdparty_8xb128_in1k_20211206-f6860372.pth) |
| Conformer-small-p32 | 38.85 | 7.09 | 81.96 | 96.02 | [config](configs/conformer/conformer-small-p32_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p32_8xb128_in1k_20211206-947a0816.pth) |
| Conformer-small-p16\* | 37.67 | 10.31 | 83.32 | 96.46 | [config](configs/conformer/conformer-small-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p16_3rdparty_8xb128_in1k_20211206-3065dcf5.pth) |
| Conformer-base-p16\* | 83.29 | 22.89 | 83.82 | 96.59 | [config](configs/conformer/conformer-base-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-base-p16_3rdparty_8xb128_in1k_20211206-bfdf8637.pth) |
| Conformer-tiny-p16\* | 23.52 | 4.90 | 81.31 | 95.60 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-tiny-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-tiny-p16_3rdparty_8xb128_in1k_20211206-f6860372.pth) |
| Conformer-small-p32\* | 38.85 | 7.09 | 81.96 | 96.02 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-small-p32_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p32_8xb128_in1k_20211206-947a0816.pth) |
| Conformer-small-p16\* | 37.67 | 10.31 | 83.32 | 96.46 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-small-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p16_3rdparty_8xb128_in1k_20211206-3065dcf5.pth) |
| Conformer-base-p16\* | 83.29 | 22.89 | 83.82 | 96.59 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-base-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-base-p16_3rdparty_8xb128_in1k_20211206-bfdf8637.pth) |
*Models with \* are converted from other repos.*
*Models with \* are converted from the [official repo](https://github.com/pengzhiliang/Conformer). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*


@@ -25,35 +25,24 @@ Recently, neural networks purely based on attention were shown to address image
}
```
## Pretrained models
The pre-trained models are converted from the [official repo](https://github.com/facebookresearch/deit). And the teacher of the distilled version DeiT is RegNetY-16GF.
## Results and models
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:------:|:--------:|
| DeiT-tiny\* | 5.72 | 1.08 | 72.13 | 91.13 | [config](configs/deit/deit-tiny_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny_3rdparty_pt-4xb256_in1k_20211124-e930093b.pth) |
| DeiT-tiny distilled\* | 5.72 | 1.08 | 74.51 | 91.90 | [config](configs/deit/deit-tiny-distilled_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny-distilled_3rdparty_pt-4xb256_in1k_20211216-c429839a.pth) |
| DeiT-small\* | 22.05 | 4.24 | 79.83 | 94.95 | [config](configs/deit/deit-small_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small_3rdparty_pt-4xb256_in1k_20211124-ffe94edd.pth) |
| DeiT-small distilled\* | 22.05 | 4.24 | 81.17 | 95.40 | [config](configs/deit/deit-small-distilled_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small-distilled_3rdparty_pt-4xb256_in1k_20211216-4de1d725.pth) |
| DeiT-base\* | 86.57 | 16.86 | 81.79 | 95.59 | [config](configs/deit/deit-base_pt-16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_3rdparty_pt-16xb64_in1k_20211124-6f40c188.pth) |
| DeiT-base distilled\* | 86.57 | 16.86 | 83.33 | 96.49 | [config](configs/deit/deit-base-distilled_pt-16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base-distilled_3rdparty_pt-16xb64_in1k_20211216-42891296.pth) |
The teacher of the distilled version DeiT is RegNetY-16GF.
*Models with \* are converted from other repos.*
| Model | Pretrain | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:------------:|:---------:|:--------:|:---------:|:---------:|:------:|:--------:|
| DeiT-tiny\* | From scratch | 5.72 | 1.08 | 72.13 | 91.13 | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/deit/deit-tiny_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny_3rdparty_pt-4xb256_in1k_20211124-e930093b.pth) |
| DeiT-tiny distilled\* | From scratch | 5.72 | 1.08 | 74.51 | 91.90 | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/deit/deit-tiny-distilled_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny-distilled_3rdparty_pt-4xb256_in1k_20211216-c429839a.pth) |
| DeiT-small\* | From scratch | 22.05 | 4.24 | 79.83 | 94.95 | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/deit/deit-small_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small_3rdparty_pt-4xb256_in1k_20211124-ffe94edd.pth) |
| DeiT-small distilled\*| From scratch | 22.05 | 4.24 | 81.17 | 95.40 | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/deit/deit-small-distilled_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small-distilled_3rdparty_pt-4xb256_in1k_20211216-4de1d725.pth) |
| DeiT-base\* | From scratch | 86.57 | 16.86 | 81.79 | 95.59 | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/deit/deit-base_pt-16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_3rdparty_pt-16xb64_in1k_20211124-6f40c188.pth) |
| DeiT-base distilled\* | From scratch | 86.57 | 16.86 | 83.33 | 96.49 | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/deit/deit-base-distilled_pt-16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base-distilled_3rdparty_pt-16xb64_in1k_20211216-42891296.pth) |
| DeiT-base 384px\* | ImageNet-1k | 86.86 | 49.37 | 83.04 | 96.31 | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/deit/deit-base_ft-16xb32_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_3rdparty_ft-16xb32_in1k-384px_20211124-822d02f2.pth) |
| DeiT-base distilled 384px\* | ImageNet-1k | 86.86 | 49.37 | 85.55 | 97.35 | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/deit/deit-base-distilled_ft-16xb32_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base-distilled_3rdparty_ft-16xb32_in1k-384px_20211216-e48d6000.pth) |
## Fine-tuned models
The fine-tuned models are converted from the [official repo](https://github.com/facebookresearch/deit).
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:------:|:--------:|
| DeiT-base 384px\* | 86.86 | 49.37 | 83.04 | 96.31 | [config](configs/deit/deit-base_ft-16xb32_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_3rdparty_ft-16xb32_in1k-384px_20211124-822d02f2.pth) |
| DeiT-base distilled 384px\* | 86.86 | 49.37 | 85.55 | 97.35 | [config](configs/deit/deit-base-distilled_ft-16xb32_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base-distilled_3rdparty_ft-16xb32_in1k-384px_20211216-e48d6000.pth) |
*Models with \* are converted from other repos.*
*Models with \* are converted from the [official repo](https://github.com/facebookresearch/deit). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*
```{warning}
MMClassification doesn't support training the distilled version DeiT.


@@ -23,15 +23,13 @@ Convolutional Neural Networks (CNNs) are the go-to model for computer vision. Re
}
```
## Pretrain model
The pre-trained models are converted from [timm](https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/mlp_mixer.py).
## Results and models
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:--------------:|:---------:|:--------:|:---------:|:---------:|:------:|:--------:|
| Mixer-B/16\* | 59.88 | 12.61 | 76.68 | 92.25 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mlp_mixer/mlp-mixer-base-p16_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mlp-mixer/mixer-base-p16_3rdparty_64xb64_in1k_20211124-1377e3e0.pth)|
| Mixer-L/16\* | 208.2 | 44.57 | 72.34 | 88.02 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mlp_mixer/mlp-mixer-large-p16_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mlp-mixer/mixer-large-p16_3rdparty_64xb64_in1k_20211124-5a2519d2.pth)|
| Mixer-B/16\* | 59.88 | 12.61 | 76.68 | 92.25 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mlp_mixer/mlp-mixer-base-p16_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mlp-mixer/mixer-base-p16_3rdparty_64xb64_in1k_20211124-1377e3e0.pth) |
| Mixer-L/16\* | 208.2 | 44.57 | 72.34 | 88.02 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mlp_mixer/mlp-mixer-large-p16_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mlp-mixer/mixer-large-p16_3rdparty_64xb64_in1k_20211124-5a2519d2.pth) |
*Models with \* are converted from other repos.*
*Models with \* are converted from [timm](https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/mlp_mixer.py). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*


@@ -29,7 +29,7 @@ The MobileNetV2 architecture is based on an inverted residual structure where th
## Results and models
### ImageNet
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:---------:|:--------:|


@@ -22,17 +22,13 @@ We present the next generation of MobileNets based on a combination of complemen
}
```
## Pretrain model
The pre-trained models are converted from [torchvision](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv3.html).
### ImageNet
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:--------:|
| MobileNetV3-Small | 2.54 | 0.06 | 67.66 | 87.41 | [model](https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_small-8427ecf0.pth)|
| MobileNetV3-Large | 5.48 | 0.23 | 74.04 | 91.34 | [model](https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_large-3ea3c186.pth)|
## Results and models
Waiting for adding.
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:------:|:--------:|
| MobileNetV3-Small\* | 2.54 | 0.06 | 67.66 | 87.41 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mobilenet_v3/mobilenet-v3-small_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_small-8427ecf0.pth) |
| MobileNetV3-Large\* | 5.48 | 0.23 | 74.04 | 91.34 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mobilenet_v3/mobilenet-v3-large_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_large-3ea3c186.pth) |
*Models with \* are converted from [torchvision](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv3.html). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*


@@ -23,23 +23,19 @@ In this work, we present a new network design paradigm. Our goal is to help adva
}
```
## Pretrain model
The pre-trained models are converted from [model zoo of pycls](https://github.com/facebookresearch/pycls/blob/master/MODEL_ZOO.md).
### ImageNet
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:--------:|
| RegNetX-400MF | 5.16 | 0.41 | 72.55 | 90.91 | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-400MF-0db9f35c.pth)|
| RegNetX-800MF | 7.26 | 0.81 | 75.21 | 92.37 | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-800MF-4f9d1e8a.pth)|
| RegNetX-1.6GF | 9.19 | 1.63 | 77.04 | 93.51 | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-1.6GF-cfb32375.pth)|
| RegNetX-3.2GF | 15.3 | 3.21 | 78.26 | 94.20 | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-3.2GF-82c43fd5.pth)|
| RegNetX-4.0GF | 22.12 | 4.0 | 78.72 | 94.22 | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-4.0GF-ef8bb32c.pth)|
| RegNetX-6.4GF | 26.21 | 6.51 | 79.22 | 94.61 | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-6.4GF-6888c0ea.pth)|
| RegNetX-8.0GF | 39.57 | 8.03 | 79.31 | 94.57 | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-8.0GF-cb4c77ec.pth)|
| RegNetX-12GF | 46.11 | 12.15 | 79.91 | 94.78 | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-12GF-0574538f.pth)|
## Results and models
Waiting for adding.
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:------:|:--------:|
| RegNetX-400MF\* | 5.16 | 0.41 | 72.55 | 90.91 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/regnet/regnetx-400mf_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-400MF-0db9f35c.pth) |
| RegNetX-800MF\* | 7.26 | 0.81 | 75.21 | 92.37 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/regnet/regnetx-800mf_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-800MF-4f9d1e8a.pth) |
| RegNetX-1.6GF\* | 9.19 | 1.63 | 77.04 | 93.51 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/regnet/regnetx-1.6gf_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-1.6GF-cfb32375.pth) |
| RegNetX-3.2GF\* | 15.3 | 3.21 | 78.26 | 94.20 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/regnet/regnetx-3.2gf_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-3.2GF-82c43fd5.pth) |
| RegNetX-4.0GF\* | 22.12 | 4.0 | 78.72 | 94.22 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/regnet/regnetx-4.0gf_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-4.0GF-ef8bb32c.pth) |
| RegNetX-6.4GF\* | 26.21 | 6.51 | 79.22 | 94.61 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/regnet/regnetx-6.4gf_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-6.4GF-6888c0ea.pth) |
| RegNetX-8.0GF\* | 39.57 | 8.03 | 79.31 | 94.57 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/regnet/regnetx-8.0gf_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-8.0GF-cb4c77ec.pth) |
| RegNetX-12GF\* | 46.11 | 12.15 | 79.91 | 94.78 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/regnet/regnetx-12gf_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/regnet/convert/RegNetX-12GF-0574538f.pth) |
*Models with \* are converted from [pycls](https://github.com/facebookresearch/pycls/blob/master/MODEL_ZOO.md). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*


@@ -22,7 +22,9 @@ We present a simple but powerful architecture of convolutional neural network, w
}
```
## Pretrain model
## Results and models
### ImageNet-1k
| Model | Epochs | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
| :---------: | :----: | :-------------------------------: | :-----------------------------: | :-------: | :-------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
@@ -39,7 +41,7 @@ We present a simple but powerful architecture of convolutional neural network, w
| RepVGG-B3g4\* | 200 | 83.83 (train) \| 75.63 (deploy) | 17.9 (train) \| 16.08 (deploy) | 80.22 | 95.10 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B3g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) \|[config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B3g4_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B3g4_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-4e54846a.pth) |
| RepVGG-D2se\* | 200 | 133.33 (train) \| 120.39 (deploy) | 36.56 (train) \| 32.85 (deploy) | 81.81 | 95.94 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-D2se_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) \|[config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-D2se_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-D2se_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-cf3139b7.pth) |
*Models with \* are converted from other repos.*
*Models with \* are converted from the [official repo](https://github.com/DingXiaoH/RepVGG). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*
## Reparameterize RepVGG
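The tables above list separate train/deploy parameter and FLOPs counts because each training-time RepVGG block (a 3x3 conv, a 1x1 conv and an identity shortcut, each with BatchNorm) can be fused into a single 3x3 conv for deployment. Below is a minimal sketch of the fusion arithmetic, assuming `groups=1`, equal in/out channels, and hypothetical attribute names (`conv3x3`, `bn3x3`, `conv1x1`, `bn1x1`, `bn_id`); it is not MMClassification's actual conversion script.

```python
import torch
import torch.nn.functional as F

def fuse_conv_bn(weight, bn):
    """Fold a BatchNorm into the preceding conv's kernel and bias."""
    std = (bn.running_var + bn.eps).sqrt()
    scale = (bn.weight / std).reshape(-1, 1, 1, 1)
    return weight * scale, bn.bias - bn.running_mean * bn.weight / std

def reparameterize(block):
    """Sum the 3x3, padded 1x1 and identity branches into one 3x3 kernel.
    Assumes in_channels == out_channels and groups == 1."""
    k3, b3 = fuse_conv_bn(block.conv3x3.weight, block.bn3x3)
    k1, b1 = fuse_conv_bn(block.conv1x1.weight, block.bn1x1)
    k1 = F.pad(k1, [1, 1, 1, 1])  # embed the 1x1 kernel in a 3x3 kernel
    channels = k3.shape[0]
    kid = torch.zeros_like(k3)    # identity branch as a centered dirac conv
    kid[range(channels), range(channels), 1, 1] = 1.0
    kid, bid = fuse_conv_bn(kid, block.bn_id)
    return k3 + k1 + kid, b3 + b1 + bid  # weight and bias of the fused conv
```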


@@ -22,16 +22,14 @@ Representing features at multiple scales is of great importance for numerous vis
}
```
## Pretrain model
## Results and models
The pre-trained models are converted from [official repo](https://github.com/Res2Net/Res2Net-PretrainedModels).
### ImageNet-1k
### ImageNet 1k
| Model | resolution | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:-----------:|:---------:|:---------:|:---------:|:---------:|:------:|:--------:|
| Res2Net-50-14w-8s\* | 224x224 | 25.06 | 4.22 | 78.14 | 93.85 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/res2net/res2net50-w14-s8_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w14-s8_3rdparty_8xb32_in1k_20210927-bc967bf1.pth) | [log]()|
| Res2Net-50-26w-8s\* | 224x224 | 48.40 | 8.39 | 79.20 | 94.36 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/res2net/res2net50-w26-s8_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w26-s8_3rdparty_8xb32_in1k_20210927-f547a94b.pth) | [log]()|
| Res2Net-101-26w-4s\* | 224x224 | 45.21 | 8.12 | 79.19 | 94.44 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/res2net/res2net101-w26-s4_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net101-w26-s4_3rdparty_8xb32_in1k_20210927-870b6c36.pth) | [log]()|
| Model | resolution | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Download |
|:---------------------:|:-----------:|:---------:|:---------:|:---------:|:---------:|:--------:|
| Res2Net-50-14w-8s\* | 224x224 | 25.06 | 4.22 | 78.14 | 93.85 | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w14-s8_3rdparty_8xb32_in1k_20210927-bc967bf1.pth)|
| Res2Net-50-26w-8s\* | 224x224 | 48.40 | 8.39 | 79.20 | 94.36 | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w26-s8_3rdparty_8xb32_in1k_20210927-f547a94b.pth)|
| Res2Net-101-26w-4s\* | 224x224 | 45.21 | 8.12 | 79.19 | 94.44 | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net101-w26-s4_3rdparty_8xb32_in1k_20210927-870b6c36.pth)|
*Models with \* are converted from other repos.*
*Models with \* are converted from the [official repo](https://github.com/Res2Net/Res2Net-PretrainedModels). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*


@@ -42,7 +42,7 @@ The depth of representations is of central importance for many visual recognitio
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:---------:|:--------:|
| ResNet-50-b16x8 | 23.71 | 1.31 | 79.90 | 95.19 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet50_8xb16_cifar100.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_b16x8_cifar100_20210528-67b58a1b.pth) | [log](https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_b16x8_cifar100_20210528-67b58a1b.log.json) |
### ImageNet
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:---------:|:--------:|


@@ -24,7 +24,7 @@ We present a simple, highly modularized network architecture for image classific
## Results and models
### ImageNet
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:---------:|:--------:|


@@ -24,7 +24,7 @@ The central building block of convolutional neural networks (CNNs) is the convol
## Results and models
### ImageNet
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:---------:|:--------:|


@@ -24,7 +24,7 @@ We introduce an extremely computation-efficient CNN architecture named ShuffleNe
## Results and models
### ImageNet
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:---------:|:--------:|


@@ -24,7 +24,7 @@ Currently, the neural network architecture design is mostly guided by the *indir
## Results and models
### ImageNet
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:---------:|:--------:|


@@ -12,6 +12,7 @@ This paper presents a new vision Transformer, called Swin Transformer, that capa
</div>
## Citation
```latex
@article{liu2021Swin,
title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
@@ -21,29 +22,32 @@ This paper presents a new vision Transformer, called Swin Transformer, that capa
}
```
## Pretrain model
The pre-trained models are converted from [model zoo of Swin Transformer](https://github.com/microsoft/Swin-Transformer#main-results-on-imagenet-with-pretrained-models).
### ImageNet 1k
| Model | Pretrain | resolution | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Download |
|:---------:|:------------:|:-----------:|:---------:|:---------:|:---------:|:---------:|:--------:|
| Swin-T | ImageNet-1k | 224x224 | 28.29 | 4.36 | 81.18 | 95.52 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_tiny_patch4_window7_224-160bb0a5.pth)|
| Swin-S | ImageNet-1k | 224x224 | 49.61 | 8.52 | 83.21 | 96.25 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_small_patch4_window7_224-cc7a01c9.pth)|
| Swin-B | ImageNet-1k | 224x224 | 87.77 | 15.14 | 83.42 | 96.44 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_base_patch4_window7_224-4670dd19.pth)|
| Swin-B | ImageNet-1k | 384x384 | 87.90 | 44.49 | 84.49 | 96.95 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_base_patch4_window12_384-02c598a4.pth)|
| Swin-B | ImageNet-22k | 224x224 | 87.77 | 15.14 | 85.16 | 97.50 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_base_patch4_window7_224_22kto1k-f967f799.pth)|
| Swin-B | ImageNet-22k | 384x384 | 87.90 | 44.49 | 86.44 | 98.05 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_base_patch4_window12_384_22kto1k-d59b0d1d.pth)|
| Swin-L | ImageNet-22k | 224x224 | 196.53 | 34.04 | 86.24 | 97.88 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_large_patch4_window7_224_22kto1k-5f0996db.pth)|
| Swin-L | ImageNet-22k | 384x384 | 196.74 | 100.04 | 87.25 | 98.25 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_large_patch4_window12_384_22kto1k-0a40944b.pth)|
## Results and models
### ImageNet 1k
| Model | Pretrain | resolution | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------:|:------------:|:-----------:|:---------:|:---------:|:---------:|:---------:|:----------:|:--------:|
| Swin-T | ImageNet-1k | 224x224 | 28.29 | 4.36 | 81.18 | 95.61 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-tiny_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925-66df6be6.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925.log.json)|
| Swin-S | ImageNet-1k | 224x224 | 49.61 | 8.52 | 83.02 | 96.29 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-small_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_small_224_b16x64_300e_imagenet_20210615_110219-7f9d988b.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_small_224_b16x64_300e_imagenet_20210615_110219.log.json)|
| Swin-B | ImageNet-1k | 224x224 | 87.77 | 15.14 | 83.36 | 96.44 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin_base_224_b16x64_300e_imagenet.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_base_224_b16x64_300e_imagenet_20210616_190742-93230b0d.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_base_224_b16x64_300e_imagenet_20210616_190742.log.json)|
### ImageNet-21k
The pre-trained models on ImageNet-21k are used for fine-tuning, and therefore don't have evaluation results.
| Model | resolution | Params(M) | Flops(G) | Download |
|:---------:|:-----------:|:---------:|:---------:|:--------:|
| Swin-B | 224x224 | 86.74 | 15.14 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin-base_3rdparty_in21k.pth)|
| Swin-B | 384x384 | 86.88 | 44.49 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin-base_3rdparty_in21k-384px.pth)|
| Swin-L | 224x224 | 195.00 | 34.04 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin-large_3rdparty_in21k.pth)|
| Swin-L | 384x384 | 195.20 | 100.04 | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin-base_3rdparty_in21k-384px.pth)|
### ImageNet-1k
| Model | Pretrain | resolution | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------:|:------------:|:-----------:|:---------:|:---------:|:---------:|:---------:|:------:|:--------:|
| Swin-T | From scratch | 224x224 | 28.29 | 4.36 | 81.18 | 95.61 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-tiny_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925-66df6be6.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925.log.json)|
| Swin-S | From scratch | 224x224 | 49.61 | 8.52 | 83.02 | 96.29 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-small_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_small_224_b16x64_300e_imagenet_20210615_110219-7f9d988b.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_small_224_b16x64_300e_imagenet_20210615_110219.log.json)|
| Swin-B | From scratch | 224x224 | 87.77 | 15.14 | 83.36 | 96.44 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin_base_224_b16x64_300e_imagenet.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_base_224_b16x64_300e_imagenet_20210616_190742-93230b0d.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_base_224_b16x64_300e_imagenet_20210616_190742.log.json)|
| Swin-S\* | From scratch | 224x224 | 49.61 | 8.52 | 83.21 | 96.25 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-small_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_small_patch4_window7_224-cc7a01c9.pth) |
| Swin-B\* | From scratch | 224x224 | 87.77 | 15.14 | 83.42 | 96.44 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-base_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_base_patch4_window7_224-4670dd19.pth)|
| Swin-B\* | From scratch | 384x384 | 87.90 | 44.49 | 84.49 | 96.95 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-base_16xb64_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_base_patch4_window12_384-02c598a4.pth)|
| Swin-B\* | ImageNet-21k | 224x224 | 87.77 | 15.14 | 85.16 | 97.50 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-base_16xb64_in1k.py)| [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_base_patch4_window7_224_22kto1k-f967f799.pth)|
| Swin-B\* | ImageNet-21k | 384x384 | 87.90 | 44.49 | 86.44 | 98.05 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-base_16xb64_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_base_patch4_window12_384_22kto1k-d59b0d1d.pth)|
| Swin-L\* | ImageNet-21k | 224x224 | 196.53 | 34.04 | 86.24 | 97.88 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-large_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_large_patch4_window7_224_22kto1k-5f0996db.pth)|
| Swin-L\* | ImageNet-21k | 384x384 | 196.74 | 100.04 | 87.25 | 98.25 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-large_16xb64_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_large_patch4_window12_384_22kto1k-0a40944b.pth)|
*Models with \* are converted from the [official repo](https://github.com/microsoft/Swin-Transformer#main-results-on-imagenet-with-pretrained-models). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*


@@ -21,20 +21,14 @@ Transformers, which are popular for language modeling, have been explored for so
}
```
## Pretrain model
The pre-trained models are converted from [official repo](https://github.com/yitu-opensource/T2T-ViT/tree/main#2-t2t-vit-models).
## Results and models
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:--------------:|:---------:|:--------:|:---------:|:---------:|:------:|:--------:|
| T2T-ViT_t-14\* | 21.47 | 4.34 | 81.69 | 95.85 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_3rdparty_8xb64_in1k_20210928-b7c09b62.pth) &#124; [log]()|
| T2T-ViT_t-19\* | 39.08 | 7.80 | 82.43 | 96.08 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-19_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-19_3rdparty_8xb64_in1k_20210928-7f1478d5.pth) &#124; [log]()|
| T2T-ViT_t-24\* | 64.00 | 12.69 | 82.55 | 96.06 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-24_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-24_3rdparty_8xb64_in1k_20210928-fe95a61b.pth) &#124; [log]()|
| T2T-ViT_t-14 | 21.47 | 4.34 | 81.83 | 95.84 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_8xb64_in1k_20211220-f7378dd5.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_8xb64_in1k_20211220-f7378dd5.log.json)|
| T2T-ViT_t-19 | 39.08 | 7.80 | 82.63 | 96.18 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-19_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-19_8xb64_in1k_20211214-7f5e3aaf.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-19_8xb64_in1k_20211214-7f5e3aaf.log.json)|
| T2T-ViT_t-24 | 64.00 | 12.69 | 82.71 | 96.09 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-24_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-24_8xb64_in1k_20211214-b2a68ae3.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-24_8xb64_in1k_20211214-b2a68ae3.log.json)|
*Models with \* are converted from other repos.*
## Results and models
Waiting for adding.
*Consistent with the [official repo](https://github.com/yitu-opensource/T2T-ViT), we adopt the best checkpoints during training.*


@@ -17,7 +17,7 @@ Collections:
Version: v0.17.0
Models:
- Name: t2t-vit-t-14_3rdparty_8xb64_in1k
- Name: t2t-vit-t-14_8xb64_in1k
Metadata:
FLOPs: 4340000000
Parameters: 21470000
@@ -25,15 +25,12 @@ Models:
Results:
- Dataset: ImageNet-1k
Metrics:
Top 1 Accuracy: 81.69
Top 5 Accuracy: 95.85
Top 1 Accuracy: 81.83
Top 5 Accuracy: 95.84
Task: Image Classification
Weights: https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_3rdparty_8xb64_in1k_20210928-b7c09b62.pth
Converted From:
Weights: https://github.com/yitu-opensource/T2T-ViT/releases/download/main/81.7_T2T_ViTt_14.pth.tar
Code: https://github.com/yitu-opensource/T2T-ViT/blob/main/models/t2t_vit.py#L243
Weights: https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_8xb64_in1k_20211220-f7378dd5.pth
Config: configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py
- Name: t2t-vit-t-19_3rdparty_8xb64_in1k
- Name: t2t-vit-t-19_8xb64_in1k
Metadata:
FLOPs: 7800000000
Parameters: 39080000
@@ -41,15 +38,12 @@ Models:
Results:
- Dataset: ImageNet-1k
Metrics:
Top 1 Accuracy: 82.43
Top 5 Accuracy: 96.08
Top 1 Accuracy: 82.63
Top 5 Accuracy: 96.18
Task: Image Classification
Weights: https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-19_3rdparty_8xb64_in1k_20210928-7f1478d5.pth
Converted From:
Weights: https://github.com/yitu-opensource/T2T-ViT/releases/download/main/82.4_T2T_ViTt_19.pth.tar
Code: https://github.com/yitu-opensource/T2T-ViT/blob/main/models/t2t_vit.py#L254
Weights: https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-19_8xb64_in1k_20211214-7f5e3aaf.pth
Config: configs/t2t_vit/t2t-vit-t-19_8xb64_in1k.py
- Name: t2t-vit-t-24_3rdparty_8xb64_in1k
- Name: t2t-vit-t-24_8xb64_in1k
Metadata:
FLOPs: 12690000000
Parameters: 64000000
@@ -57,11 +51,8 @@ Models:
Results:
- Dataset: ImageNet-1k
Metrics:
Top 1 Accuracy: 82.55
Top 5 Accuracy: 96.06
Top 1 Accuracy: 82.71
Top 5 Accuracy: 96.09
Task: Image Classification
Weights: https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-24_3rdparty_8xb64_in1k_20210928-fe95a61b.pth
Converted From:
Weights: https://github.com/yitu-opensource/T2T-ViT/releases/download/main/82.6_T2T_ViTt_24.pth.tar
Code: https://github.com/yitu-opensource/T2T-ViT/blob/main/models/t2t_vit.py#L265
Weights: https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-24_8xb64_in1k_20211214-b2a68ae3.pth
Config: configs/t2t_vit/t2t-vit-t-24_8xb64_in1k.py


@@ -6,8 +6,9 @@ _base_ = [
# optimizer
paramwise_cfg = dict(
norm_decay_mult=0.0,
bias_decay_mult=0.0,
custom_keys={'.backbone.cls_token': dict(decay_mult=0.0)},
custom_keys={'cls_token': dict(decay_mult=0.0)},
)
optimizer = dict(
type='AdamW',
@@ -21,11 +22,14 @@ optimizer_config = dict(grad_clip=None)
# FIXME: lr in the first 300 epochs follows the CosineAnnealing schedule and
# the lr in the last 10 epochs equals min_lr
lr_config = dict(
policy='CosineAnnealing',
policy='CosineAnnealingCooldown',
min_lr=1e-5,
cool_down_time=10,
cool_down_ratio=0.1,
by_epoch=True,
warmup_by_epoch=True,
warmup='linear',
warmup_iters=10,
warmup_ratio=1e-6)
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
runner = dict(type='EpochBasedRunner', max_epochs=310)
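As the FIXME comment describes, the new `CosineAnnealingCooldown` policy anneals the lr from its base value down to `min_lr` over the first 300 epochs, then holds it there for the final 10 cool-down epochs. A minimal sketch of that schedule follows; warmup and the exact role of `cool_down_ratio` are omitted, since they aren't spelled out here, and this is the intent rather than the hook's exact implementation.

```python
import math

def cosine_cooldown_lr(epoch, base_lr, min_lr=1e-5,
                       max_epochs=310, cool_down_time=10):
    """Per-epoch lr: cosine from base_lr to min_lr over the first
    max_epochs - cool_down_time epochs, then flat at min_lr."""
    anneal_epochs = max_epochs - cool_down_time
    if epoch >= anneal_epochs:
        return min_lr  # cool-down phase: hold the minimum lr
    cos = (1 + math.cos(math.pi * epoch / anneal_epochs)) / 2
    return min_lr + (base_lr - min_lr) * cos
```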


@@ -6,8 +6,9 @@ _base_ = [
# optimizer
paramwise_cfg = dict(
norm_decay_mult=0.0,
bias_decay_mult=0.0,
custom_keys={'.backbone.cls_token': dict(decay_mult=0.0)},
custom_keys={'cls_token': dict(decay_mult=0.0)},
)
optimizer = dict(
type='AdamW',
@@ -21,11 +22,14 @@ optimizer_config = dict(grad_clip=None)
# FIXME: lr in the first 300 epochs follows the CosineAnnealing schedule and
# the lr in the last 10 epochs equals min_lr
lr_config = dict(
policy='CosineAnnealing',
policy='CosineAnnealingCooldown',
min_lr=1e-5,
cool_down_time=10,
cool_down_ratio=0.1,
by_epoch=True,
warmup_by_epoch=True,
warmup='linear',
warmup_iters=10,
warmup_ratio=1e-6)
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
runner = dict(type='EpochBasedRunner', max_epochs=310)
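The new `EMAHook` keeps an exponential moving average of the model weights with a very small momentum (4e-5), so the averaged model drifts slowly behind the live one. A sketch of the per-iteration update, assuming the convention that `momentum` weights the current parameters; check the hook's docstring for the exact rule.

```python
import torch

@torch.no_grad()
def ema_step(ema_params, params, momentum=4e-5):
    """One EMA update per training iteration:
    ema = (1 - momentum) * ema + momentum * param.
    The weighting convention here is an assumption."""
    for ema_p, p in zip(ema_params, params):
        ema_p.mul_(1 - momentum).add_(p, alpha=momentum)
```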


@@ -6,8 +6,9 @@ _base_ = [
# optimizer
paramwise_cfg = dict(
norm_decay_mult=0.0,
bias_decay_mult=0.0,
custom_keys={'.backbone.cls_token': dict(decay_mult=0.0)},
custom_keys={'cls_token': dict(decay_mult=0.0)},
)
optimizer = dict(
type='AdamW',
@@ -21,11 +22,14 @@ optimizer_config = dict(grad_clip=None)
# FIXME: lr in the first 300 epochs follows the CosineAnnealing schedule and
# the lr in the last 10 epochs equals min_lr
lr_config = dict(
policy='CosineAnnealing',
policy='CosineAnnealingCooldown',
min_lr=1e-5,
cool_down_time=10,
cool_down_ratio=0.1,
by_epoch=True,
warmup_by_epoch=True,
warmup='linear',
warmup_iters=10,
warmup_ratio=1e-6)
custom_hooks = [dict(type='EMAHook', momentum=4e-5, priority='ABOVE_NORMAL')]
runner = dict(type='EpochBasedRunner', max_epochs=310)
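The updated `paramwise_cfg` disables weight decay for normalization layers, biases and the `cls_token`. A rough sketch of the AdamW parameter groups such a config expands to; the matching-by-name heuristic and the `weight_decay` value are simplified assumptions, not mmcv's actual constructor logic.

```python
import torch

def build_param_groups(model, weight_decay=0.05):
    """Split parameters into decayed / non-decayed groups, mimicking
    norm_decay_mult=0, bias_decay_mult=0 and cls_token decay_mult=0."""
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        # name-based matching is an assumption for illustration
        if name.endswith('.bias') or 'norm' in name or 'cls_token' in name:
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        dict(params=decay, weight_decay=weight_decay),
        dict(params=no_decay, weight_decay=0.0),
    ]

# usage: torch.optim.AdamW(build_param_groups(model), lr=...)
```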


@@ -22,18 +22,12 @@ Transformer is a new kind of neural architecture which encodes the input data as
}
```
## Pretrain model
The pre-trained models are converted from [timm](https://github.com/rwightman/pytorch-image-models/).
## Results and models
### ImageNet
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:--------:|
| Transformer in Transformer small\* | 23.76 | 3.36 | 81.52 | 95.73 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/tnt/tnt-s-p16_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/tnt/tnt-small-p16_3rdparty_in1k_20210903-c56ee7df.pth) &#124; [log]()|
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:----------------------------------:|:---------:|:--------:|:---------:|:---------:|:------:|:--------:|
| Transformer in Transformer small\* | 23.76 | 3.36 | 81.52 | 95.73 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/tnt/tnt-s-p16_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/tnt/tnt-small-p16_3rdparty_in1k_20210903-c56ee7df.pth) |
*Models with \* are converted from other repos.*
## Results and models
Waiting for adding.
*Models with \* are converted from [timm](https://github.com/rwightman/pytorch-image-models/). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*


@@ -24,7 +24,7 @@ In this work we investigate the effect of the convolutional network depth on its
## Results and models
### ImageNet
### ImageNet-1k
| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:---------------------:|:---------:|:--------:|:---------:|:---------:|:---------:|:--------:|


@@ -23,36 +23,32 @@ While the Transformer architecture has become the de-facto standard for natural
}
```
## Results and models
The training step of Vision Transformers is divided into two steps. The first
step is training the model on a large dataset, like ImageNet-21k, and get the
pretrain model. And the second step is training the model on the target dataset,
like ImageNet-1k, and get the finetune model. Here, we provide both pretrain
models and finetune models.
pre-trained model. And the second step is training the model on the target
dataset, like ImageNet-1k, and get the fine-tuned model. Here, we provide both
pre-trained models and fine-tuned models.
## Pretrain model
### ImageNet-21k
The pre-trained models are converted from [model zoo of Google Research](https://github.com/google-research/vision_transformer#available-vit-models).
The pre-trained models on ImageNet-21k are used for fine-tuning, and therefore don't have evaluation results.
### ImageNet 21k
| Model | resolution | Params(M) | Flops(G) | Download |
|:----------:|:-----------:|:---------:|:---------:|:--------:|
| ViT-B16\* | 224x224 | 86.86 | 33.03 | [model](https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth)|
| ViT-B32\* | 224x224 | 88.30 | 8.56 | [model](https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p32_3rdparty_pt-64xb64_in1k-224_20210928-eee25dd4.pth)|
| ViT-L16\* | 224x224 | 304.72 | 116.68 | [model](https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-large-p16_3rdparty_pt-64xb64_in1k-224_20210928-0001f9a1.pth)|
| Model | Params(M) | Flops(G) | Download |
|:----------:|:---------:|:---------:|:--------:|
| ViT-B16\* | 86.86 | 33.03 | [model](https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth)|
| ViT-B32\* | 88.30 | 8.56 | [model](https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p32_3rdparty_pt-64xb64_in1k-224_20210928-eee25dd4.pth)|
| ViT-L16\* | 304.72 | 116.68 | [model](https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-large-p16_3rdparty_pt-64xb64_in1k-224_20210928-0001f9a1.pth)|
*Models with \* are converted from the [official repo](https://github.com/google-research/vision_transformer#available-vit-models).*
*Models with \* are converted from other repos.*
### ImageNet-1k
## Finetune model
The finetune models are converted from [model zoo of Google Research](https://github.com/google-research/vision_transformer#available-vit-models).
### ImageNet 1k
| Model | Pretrain | resolution | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download |
|:----------:|:------------:|:-----------:|:---------:|:---------:|:---------:|:---------:|:----------:|:--------:|
| ViT-B16\* | ImageNet-21k | 384x384 | 86.86 | 33.03 | 85.43 | 97.77 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-base-p16_ft-64xb64_in1k-384.py) | [model](https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-98e8652b.pth)|
| ViT-B32\* | ImageNet-21k | 384x384 | 88.30 | 8.56 | 84.01 | 97.08 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-base-p32_ft-64xb64_in1k-384.py) | [model](https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p32_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-9cea8599.pth)|
| ViT-L16\* | ImageNet-21k | 384x384 | 304.72 | 116.68 | 85.63 | 97.63 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-large-p16_ft-64xb64_in1k-384.py) | [model](https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-large-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-b20ba619.pth)|
*Models with \* are converted from other repos.*
*Models with \* are converted from the [official repo](https://github.com/google-research/vision_transformer#available-vit-models). The config files of these models are only for validation. We don't ensure these config files' training accuracy and welcome you to contribute your reproduction results.*


@@ -15,30 +15,30 @@ The ResNet family models below are trained by standard data augmentations, i.e.,
| VGG-13-BN | 133.05 | 11.36 | 72.15 | 90.71 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vgg/vgg13bn_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/vgg/vgg13_bn_batch256_imagenet_20210207-1a8b7864.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/vgg/vgg13_bn_batch256_imagenet_20210207-1a8b7864.log.json) |
| VGG-16-BN | 138.37 | 15.53 | 73.72 | 91.68 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vgg/vgg16_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/vgg/vgg16_bn_batch256_imagenet_20210208-7e55cd29.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/vgg/vgg16_bn_batch256_imagenet_20210208-7e55cd29.log.json) |
| VGG-19-BN | 143.68 | 19.7 | 74.70 | 92.24 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vgg/vgg19bn_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/vgg/vgg19_bn_batch256_imagenet_20210208-da620c4f.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/vgg/vgg19_bn_batch256_imagenet_20210208-da620c4f.log.json)|
| RepVGG-A0\* | 9.11 (train) &#124; 8.31 (deploy) | 1.52 (train) &#124; 1.36 (deploy) | 72.41 | 90.50 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-A0_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-A0_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-A0_3rdparty_4xb64-coslr-120e_in1k_20210909-883ab98c.pth) &#124; [log]() |
| RepVGG-A1\* | 14.09 (train) &#124; 12.79 (deploy) | 2.64 (train) &#124; 2.37 (deploy) | 74.47 | 91.85 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-A1_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-A1_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-A1_3rdparty_4xb64-coslr-120e_in1k_20210909-24003a24.pth) &#124; [log]() |
| RepVGG-A2\* | 28.21 (train) &#124; 25.5 (deploy) | 5.7 (train) &#124; 5.12 (deploy) | 76.48 | 93.01 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-A2_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-A2_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-A2_3rdparty_4xb64-coslr-120e_in1k_20210909-97d7695a.pth) &#124; [log]() |
| RepVGG-B0\* | 15.82 (train) &#124; 14.34 (deploy) | 3.42 (train) &#124; 3.06 (deploy) | 75.14 | 92.42 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B0_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B0_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B0_3rdparty_4xb64-coslr-120e_in1k_20210909-446375f4.pth) &#124; [log]() |
| RepVGG-B1\* | 57.42 (train) &#124; 51.83 (deploy) | 13.16 (train) &#124; 11.82 (deploy) | 78.37 | 94.11 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B1_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B1_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B1_3rdparty_4xb64-coslr-120e_in1k_20210909-750cdf67.pth) &#124; [log]() |
| RepVGG-B1g2\* | 45.78 (train) &#124; 41.36 (deploy) | 9.82 (train) &#124; 8.82 (deploy) | 77.79 | 93.88 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B1g2_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B1g2_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B1g2_3rdparty_4xb64-coslr-120e_in1k_20210909-344f6422.pth) &#124; [log]() |
| RepVGG-B1g4\* | 39.97 (train) &#124; 36.13 (deploy) | 8.15 (train) &#124; 7.32 (deploy) | 77.58 | 93.84 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B1g4_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B1g4_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B1g4_3rdparty_4xb64-coslr-120e_in1k_20210909-d4c1a642.pth) &#124; [log]() |
| RepVGG-B2\* | 89.02 (train) &#124; 80.32 (deploy) | 20.46 (train) &#124; 18.39 (deploy) | 78.78 | 94.42 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B2_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B2_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B2_3rdparty_4xb64-coslr-120e_in1k_20210909-bd6b937c.pth) &#124; [log]() |
| RepVGG-B2g4\* | 61.76 (train) &#124; 55.78 (deploy) | 12.63 (train) &#124; 11.34 (deploy) | 79.38 | 94.68 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B2g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B2g4_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B2g4_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-7b7955f0.pth) &#124; [log]() |
| RepVGG-B3\* | 123.09 (train) &#124; 110.96 (deploy) | 29.17 (train) &#124; 26.22 (deploy) | 80.52 | 95.26 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B3_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B3_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B3_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-dda968bf.pth) &#124; [log]() |
| RepVGG-B3g4\* | 83.83 (train) &#124; 75.63 (deploy) | 17.9 (train) &#124; 16.08 (deploy) | 80.22 | 95.10 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B3g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B3g4_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B3g4_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-4e54846a.pth) &#124; [log]() |
| RepVGG-D2se\* | 133.33 (train) &#124; 120.39 (deploy) | 36.56 (train) &#124; 32.85 (deploy) | 81.81 | 95.94 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-D2se_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-D2se_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-D2se_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-cf3139b7.pth) &#124; [log]() |
| RepVGG-A0\* | 9.11 (train) &#124; 8.31 (deploy) | 1.52 (train) &#124; 1.36 (deploy) | 72.41 | 90.50 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-A0_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-A0_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-A0_3rdparty_4xb64-coslr-120e_in1k_20210909-883ab98c.pth) |
| RepVGG-A1\* | 14.09 (train) &#124; 12.79 (deploy) | 2.64 (train) &#124; 2.37 (deploy) | 74.47 | 91.85 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-A1_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-A1_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-A1_3rdparty_4xb64-coslr-120e_in1k_20210909-24003a24.pth) |
| RepVGG-A2\* | 28.21 (train) &#124; 25.5 (deploy) | 5.7 (train) &#124; 5.12 (deploy) | 76.48 | 93.01 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-A2_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-A2_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-A2_3rdparty_4xb64-coslr-120e_in1k_20210909-97d7695a.pth) |
| RepVGG-B0\* | 15.82 (train) &#124; 14.34 (deploy) | 3.42 (train) &#124; 3.06 (deploy) | 75.14 | 92.42 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B0_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B0_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B0_3rdparty_4xb64-coslr-120e_in1k_20210909-446375f4.pth) |
| RepVGG-B1\* | 57.42 (train) &#124; 51.83 (deploy) | 13.16 (train) &#124; 11.82 (deploy) | 78.37 | 94.11 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B1_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B1_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B1_3rdparty_4xb64-coslr-120e_in1k_20210909-750cdf67.pth) |
| RepVGG-B1g2\* | 45.78 (train) &#124; 41.36 (deploy) | 9.82 (train) &#124; 8.82 (deploy) | 77.79 | 93.88 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B1g2_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B1g2_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B1g2_3rdparty_4xb64-coslr-120e_in1k_20210909-344f6422.pth) |
| RepVGG-B1g4\* | 39.97 (train) &#124; 36.13 (deploy) | 8.15 (train) &#124; 7.32 (deploy) | 77.58 | 93.84 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B1g4_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B1g4_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B1g4_3rdparty_4xb64-coslr-120e_in1k_20210909-d4c1a642.pth) |
| RepVGG-B2\* | 89.02 (train) &#124; 80.32 (deploy) | 20.46 (train) &#124; 18.39 (deploy) | 78.78 | 94.42 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B2_4xb64-coslr-120e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B2_deploy_4xb64-coslr-120e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B2_3rdparty_4xb64-coslr-120e_in1k_20210909-bd6b937c.pth) |
| RepVGG-B2g4\* | 61.76 (train) &#124; 55.78 (deploy) | 12.63 (train) &#124; 11.34 (deploy) | 79.38 | 94.68 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B2g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B2g4_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B2g4_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-7b7955f0.pth) |
| RepVGG-B3\* | 123.09 (train) &#124; 110.96 (deploy) | 29.17 (train) &#124; 26.22 (deploy) | 80.52 | 95.26 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B3_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B3_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B3_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-dda968bf.pth) |
| RepVGG-B3g4\* | 83.83 (train) &#124; 75.63 (deploy) | 17.9 (train) &#124; 16.08 (deploy) | 80.22 | 95.10 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-B3g4_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-B3g4_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-B3g4_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-4e54846a.pth) |
| RepVGG-D2se\* | 133.33 (train) &#124; 120.39 (deploy) | 36.56 (train) &#124; 32.85 (deploy) | 81.81 | 95.94 | [config (train)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/repvgg-D2se_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) &#124; [config (deploy)](https://github.com/open-mmlab/mmclassification/blob/master/configs/repvgg/deploy/repvgg-D2se_deploy_4xb64-autoaug-lbs-mixup-coslr-200e_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/repvgg/repvgg-D2se_3rdparty_4xb64-autoaug-lbs-mixup-coslr-200e_in1k_20210909-cf3139b7.pth) |
| ResNet-18 | 11.69 | 1.82 | 70.07 | 89.44 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet18_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_batch256_imagenet_20200708-34ab8f90.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_batch256_imagenet_20200708-34ab8f90.log.json) |
| ResNet-34 | 21.8 | 3.68 | 73.85 | 91.53 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet34_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnet34_batch256_imagenet_20200708-32ffb4f7.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/resnet/resnet34_batch256_imagenet_20200708-32ffb4f7.log.json) |
| ResNet-50 | 25.56 | 4.12 | 76.55 | 93.15 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet50_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.log.json) |
| ResNet-101 | 44.55 | 7.85 | 78.18 | 94.03 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet101_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnet101_batch256_imagenet_20200708-753f3608.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/resnet/resnet101_batch256_imagenet_20200708-753f3608.log.json) |
| ResNet-152 | 60.19 | 11.58 | 78.63 | 94.16 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet152_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnet152_batch256_imagenet_20200708-ec25b1f9.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/resnet/resnet152_batch256_imagenet_20200708-ec25b1f9.log.json) |
| Res2Net-50-14w-8s\* | 25.06 | 4.22 | 78.14 | 93.85 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/res2net/res2net50-w14-s8_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w14-s8_3rdparty_8xb32_in1k_20210927-bc967bf1.pth) &#124; [log]()|
| Res2Net-50-26w-8s\* | 48.40 | 8.39 | 79.20 | 94.36 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/res2net/res2net50-w26-s8_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w26-s8_3rdparty_8xb32_in1k_20210927-f547a94b.pth) &#124; [log]()|
| Res2Net-101-26w-4s\* | 45.21 | 8.12 | 79.19 | 94.44 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/res2net/res2net101-w26-s4_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net101-w26-s4_3rdparty_8xb32_in1k_20210927-870b6c36.pth) &#124; [log]()|
| ResNeSt-50\* | 27.48 | 5.41 | 81.13 | 95.59 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnest/resnest50_32xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnest/resnest50_imagenet_converted-1ebf0afe.pth) &#124; [log]() |
| ResNeSt-101\* | 48.28 | 10.27 | 82.32 | 96.24 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnest/resnest101_32xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnest/resnest101_imagenet_converted-032caa52.pth) &#124; [log]() |
| ResNeSt-200\* | 70.2 | 17.53 | 82.41 | 96.22 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnest/resnest200_64xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnest/resnest200_imagenet_converted-581a60f2.pth) &#124; [log]() |
| ResNeSt-269\* | 110.93 | 22.58 | 82.70 | 96.28 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnest/resnest269_64xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnest/resnest269_imagenet_converted-59930960.pth) &#124; [log]() |
| Res2Net-50-14w-8s\* | 25.06 | 4.22 | 78.14 | 93.85 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/res2net/res2net50-w14-s8_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w14-s8_3rdparty_8xb32_in1k_20210927-bc967bf1.pth) |
| Res2Net-50-26w-8s\* | 48.40 | 8.39 | 79.20 | 94.36 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/res2net/res2net50-w26-s8_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net50-w26-s8_3rdparty_8xb32_in1k_20210927-f547a94b.pth) |
| Res2Net-101-26w-4s\* | 45.21 | 8.12 | 79.19 | 94.44 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/res2net/res2net101-w26-s4_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/res2net/res2net101-w26-s4_3rdparty_8xb32_in1k_20210927-870b6c36.pth) |
| ResNeSt-50\* | 27.48 | 5.41 | 81.13 | 95.59 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnest/resnest50_32xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnest/resnest50_imagenet_converted-1ebf0afe.pth)|
| ResNeSt-101\* | 48.28 | 10.27 | 82.32 | 96.24 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnest/resnest101_32xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnest/resnest101_imagenet_converted-032caa52.pth)|
| ResNeSt-200\* | 70.2 | 17.53 | 82.41 | 96.22 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnest/resnest200_64xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnest/resnest200_imagenet_converted-581a60f2.pth)|
| ResNeSt-269\* | 110.93 | 22.58 | 82.70 | 96.28 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnest/resnest269_64xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnest/resnest269_imagenet_converted-59930960.pth)|
| ResNetV1D-50 | 25.58 | 4.36 | 77.54 | 93.57 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnetv1d50_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d50_b32x8_imagenet_20210531-db14775a.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d50_b32x8_imagenet_20210531-db14775a.log.json) |
| ResNetV1D-101 | 44.57 | 8.09 | 78.93 | 94.48 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnetv1d101_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d101_b32x8_imagenet_20210531-6e13bcd3.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d101_b32x8_imagenet_20210531-6e13bcd3.log.json) |
| ResNetV1D-152 | 60.21 | 11.82 | 79.41 | 94.7 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnetv1d152_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d152_b32x8_imagenet_20210531-278cf22a.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d152_b32x8_imagenet_20210531-278cf22a.log.json) |
@ -51,32 +51,32 @@

The ResNet family models below are trained by standard data augmentations, i.e.,
| ShuffleNetV1 1.0x (group=3) | 1.87 | 0.146 | 68.13 | 87.81 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/shufflenet_v1/shufflenet-v1-1x_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/shufflenet_v1/shufflenet_v1_batch1024_imagenet_20200804-5d6cec73.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/shufflenet_v1/shufflenet_v1_batch1024_imagenet_20200804-5d6cec73.log.json) |
| ShuffleNetV2 1.0x | 2.28 | 0.149 | 69.55 | 88.92 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/shufflenet_v2/shufflenet_v2_batch1024_imagenet_20200812-5bf4721e.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/shufflenet_v2/shufflenet_v2_batch1024_imagenet_20200804-8860eec9.log.json) |
| MobileNet V2 | 3.5 | 0.319 | 71.86 | 90.42 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mobilenet_v2/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/mobilenet_v2/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.log.json) |
| ViT-B/16\* | 86.86 | 33.03 | 85.43 | 97.77 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-base-p16_ft-64xb64_in1k-384.py) | [model](https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-98e8652b.pth) &#124; [log]() |
| ViT-B/32\* | 88.3 | 8.56 | 84.01 | 97.08 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-base-p32_ft-64xb64_in1k-384.py) | [model](https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p32_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-9cea8599.pth) &#124; [log]() |
| ViT-L/16\* | 304.72 | 116.68 | 85.63 | 97.63 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-large-p16_ft-64xb64_in1k-384.py) | [model](https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-large-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-b20ba619.pth) &#124; [log]() |
| ViT-B/16\* | 86.86 | 33.03 | 85.43 | 97.77 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-base-p16_ft-64xb64_in1k-384.py) | [model](https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-98e8652b.pth)|
| ViT-B/32\* | 88.3 | 8.56 | 84.01 | 97.08 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-base-p32_ft-64xb64_in1k-384.py) | [model](https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-base-p32_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-9cea8599.pth)|
| ViT-L/16\* | 304.72 | 116.68 | 85.63 | 97.63 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-large-p16_ft-64xb64_in1k-384.py) | [model](https://download.openmmlab.com/mmclassification/v0/vit/finetune/vit-large-p16_in21k-pre-3rdparty_ft-64xb64_in1k-384_20210928-b20ba619.pth)|
| Swin-Transformer tiny | 28.29 | 4.36 | 81.18 | 95.61 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-tiny_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925-66df6be6.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925.log.json)|
| Swin-Transformer small | 49.61 | 8.52 | 83.02 | 96.29 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-small_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_small_224_b16x64_300e_imagenet_20210615_110219-7f9d988b.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_small_224_b16x64_300e_imagenet_20210615_110219.log.json)|
| Swin-Transformer base | 87.77 | 15.14 | 83.36 | 96.44 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin_base_224_b16x64_300e_imagenet.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_base_224_b16x64_300e_imagenet_20210616_190742-93230b0d.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_base_224_b16x64_300e_imagenet_20210616_190742.log.json)|
| Transformer in Transformer small\* | 23.76 | 3.36 | 81.52 | 95.73 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/tnt/tnt-s-p16_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/tnt/tnt-small-p16_3rdparty_in1k_20210903-c56ee7df.pth) &#124; [log]()|
| T2T-ViT_t-14\* | 21.47 | 4.34 | 81.69 | 95.85 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_3rdparty_8xb64_in1k_20210928-b7c09b62.pth) &#124; [log]()|
| T2T-ViT_t-19\* | 39.08 | 7.80 | 82.43 | 96.08 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-19_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-19_3rdparty_8xb64_in1k_20210928-7f1478d5.pth) &#124; [log]()|
| T2T-ViT_t-24\* | 64.00 | 12.69 | 82.55 | 96.06 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-24_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-24_3rdparty_8xb64_in1k_20210928-fe95a61b.pth) &#124; [log]()|
| Mixer-B/16\* | 59.88 | 12.61 | 76.68 | 92.25 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mlp_mixer/mlp-mixer-base-p16_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mlp-mixer/mixer-base-p16_3rdparty_64xb64_in1k_20211124-1377e3e0.pth) &#124; [log]()|
| Mixer-L/16\* | 208.2 | 44.57 | 72.34 | 88.02 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mlp_mixer/mlp-mixer-large-p16_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mlp-mixer/mixer-large-p16_3rdparty_64xb64_in1k_20211124-5a2519d2.pth) &#124; [log]()|
| DeiT-tiny\* | 5.72 | 1.08 | 72.13 | 91.13 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-tiny_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny_3rdparty_pt-4xb256_in1k_20211124-e930093b.pth) &#124; [log]()|
| DeiT-tiny distilled\* | 5.72 | 1.08 | 74.51 | 91.90 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-tiny-distilled_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny-distilled_3rdparty_pt-4xb256_in1k_20211216-c429839a.pth) &#124; [log]()|
| DeiT-small\* | 22.05 | 4.24 | 79.83 | 94.95 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-small_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small_3rdparty_pt-4xb256_in1k_20211124-ffe94edd.pth) &#124; [log]()|
| DeiT-small distilled\* | 22.05 | 4.24 | 81.17 | 95.40 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-small-distilled_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small-distilled_3rdparty_pt-4xb256_in1k_20211216-4de1d725.pth) &#124; [log]()|
| DeiT-base\* | 86.57 | 16.86 | 81.79 | 95.59 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-base_pt-16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_3rdparty_pt-16xb64_in1k_20211124-6f40c188.pth) &#124; [log]()|
| DeiT-base distilled\* | 86.57 | 16.86 | 83.33 | 96.49 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-base-distilled_pt-16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base-distilled_3rdparty_pt-16xb64_in1k_20211216-42891296.pth) &#124; [log]()|
| DeiT-base 384px\* | 86.86 | 49.37 | 83.04 | 96.31 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-base_ft-16xb32_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_3rdparty_ft-16xb32_in1k-384px_20211124-822d02f2.pth) &#124; [log]()|
| DeiT-base distilled 384px\* | 86.86 | 49.37 | 85.55 | 97.35 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-base-distilled_ft-16xb32_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base-distilled_3rdparty_ft-16xb32_in1k-384px_20211216-e48d6000.pth) &#124; [log]()|
| Conformer-tiny-p16\* | 23.52 | 4.90 | 81.31 | 95.60 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-tiny-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-tiny-p16_3rdparty_8xb128_in1k_20211206-f6860372.pth) &#124; [log]()|
| Conformer-small-p32 | 38.85 | 7.09 | 81.96 | 96.02 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-small-p32_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p32_8xb128_in1k_20211206-947a0816.pth) &#124; [log]()|
| Conformer-small-p16\* | 37.67 | 10.31 | 83.32 | 96.46 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-small-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p16_3rdparty_8xb128_in1k_20211206-3065dcf5.pth) &#124; [log]()|
| Conformer-base-p16\* | 83.29 | 22.89 | 83.82 | 96.59 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-base-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-base-p16_3rdparty_8xb128_in1k_20211206-bfdf8637.pth) &#124; [log]()|
| Transformer in Transformer small\* | 23.76 | 3.36 | 81.52 | 95.73 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/tnt/tnt-s-p16_16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/tnt/tnt-small-p16_3rdparty_in1k_20210903-c56ee7df.pth) |
| T2T-ViT_t-14 | 21.47 | 4.34 | 81.83 | 95.84 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_8xb64_in1k_20211220-f7378dd5.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-14_8xb64_in1k_20211220-f7378dd5.log.json)|
| T2T-ViT_t-19 | 39.08 | 7.80 | 82.63 | 96.18 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-19_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-19_8xb64_in1k_20211214-7f5e3aaf.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-19_8xb64_in1k_20211214-7f5e3aaf.log.json)|
| T2T-ViT_t-24 | 64.00 | 12.69 | 82.71 | 96.09 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/t2t_vit/t2t-vit-t-24_8xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-24_8xb64_in1k_20211214-b2a68ae3.pth) &#124; [log](https://download.openmmlab.com/mmclassification/v0/t2t-vit/t2t-vit-t-24_8xb64_in1k_20211214-b2a68ae3.log.json)|
| Mixer-B/16\* | 59.88 | 12.61 | 76.68 | 92.25 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mlp_mixer/mlp-mixer-base-p16_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mlp-mixer/mixer-base-p16_3rdparty_64xb64_in1k_20211124-1377e3e0.pth) |
| Mixer-L/16\* | 208.2 | 44.57 | 72.34 | 88.02 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/mlp_mixer/mlp-mixer-large-p16_64xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/mlp-mixer/mixer-large-p16_3rdparty_64xb64_in1k_20211124-5a2519d2.pth) |
| DeiT-tiny\* | 5.72 | 1.08 | 72.13 | 91.13 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-tiny_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny_3rdparty_pt-4xb256_in1k_20211124-e930093b.pth) |
| DeiT-tiny distilled\* | 5.72 | 1.08 | 74.51 | 91.90 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-tiny-distilled_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny-distilled_3rdparty_pt-4xb256_in1k_20211216-c429839a.pth) |
| DeiT-small\* | 22.05 | 4.24 | 79.83 | 94.95 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-small_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small_3rdparty_pt-4xb256_in1k_20211124-ffe94edd.pth) |
| DeiT-small distilled\* | 22.05 | 4.24 | 81.17 | 95.40 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-small-distilled_pt-4xb256_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small-distilled_3rdparty_pt-4xb256_in1k_20211216-4de1d725.pth) |
| DeiT-base\* | 86.57 | 16.86 | 81.79 | 95.59 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-base_pt-16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_3rdparty_pt-16xb64_in1k_20211124-6f40c188.pth) |
| DeiT-base distilled\* | 86.57 | 16.86 | 83.33 | 96.49 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-base-distilled_pt-16xb64_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base-distilled_3rdparty_pt-16xb64_in1k_20211216-42891296.pth) |
| DeiT-base 384px\* | 86.86 | 49.37 | 83.04 | 96.31 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-base_ft-16xb32_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_3rdparty_ft-16xb32_in1k-384px_20211124-822d02f2.pth) |
| DeiT-base distilled 384px\* | 86.86 | 49.37 | 85.55 | 97.35 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/deit/deit-base-distilled_ft-16xb32_in1k-384px.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base-distilled_3rdparty_ft-16xb32_in1k-384px_20211216-e48d6000.pth) |
| Conformer-tiny-p16\* | 23.52 | 4.90 | 81.31 | 95.60 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-tiny-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-tiny-p16_3rdparty_8xb128_in1k_20211206-f6860372.pth) |
| Conformer-small-p32\* | 38.85 | 7.09 | 81.96 | 96.02 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-small-p32_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p32_8xb128_in1k_20211206-947a0816.pth) |
| Conformer-small-p16\* | 37.67 | 10.31 | 83.32 | 96.46 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-small-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-small-p16_3rdparty_8xb128_in1k_20211206-3065dcf5.pth) |
| Conformer-base-p16\* | 83.29 | 22.89 | 83.82 | 96.59 | [config](https://github.com/open-mmlab/mmclassification/blob/master/configs/conformer/conformer-base-p16_8xb128_in1k.py) | [model](https://download.openmmlab.com/mmclassification/v0/conformer/conformer-base-p16_3rdparty_8xb128_in1k_20211206-bfdf8637.pth) |
Models with * are converted from other repos, others are trained by ourselves.
*Models with \* are converted from other repos, others are trained by ourselves.*
## CIFAR10

View File

@ -1,5 +1,9 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .class_num_check_hook import ClassNumCheckHook
from .lr_updater import CosineAnnealingCooldownLrUpdaterHook
from .precise_bn_hook import PreciseBNHook
__all__ = ['ClassNumCheckHook', 'PreciseBNHook']
__all__ = [
'ClassNumCheckHook', 'PreciseBNHook',
'CosineAnnealingCooldownLrUpdaterHook'
]
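
Importing `mmcls.core` registers the new hook in mmcv's `HOOKS` registry as a side effect, which is why the tests below import it with a `# noqa` marker. A minimal sketch of verifying the registration (assuming only that the hook is exported as above):

from mmcv.runner.hooks import HOOKS

import mmcls.core  # noqa: F401  # importing registers the custom hooks

# The hook can now be built from a plain config dict by its type name.
assert HOOKS.get('CosineAnnealingCooldownLrUpdaterHook') is not None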

View File

@ -0,0 +1,83 @@
# Copyright (c) OpenMMLab. All rights reserved.
from math import cos, pi

from mmcv.runner.hooks import HOOKS, LrUpdaterHook


@HOOKS.register_module()
class CosineAnnealingCooldownLrUpdaterHook(LrUpdaterHook):
    """Cosine annealing learning rate scheduler with cooldown.

    Args:
        min_lr (float, optional): The minimum learning rate after annealing.
            Defaults to None.
        min_lr_ratio (float, optional): The minimum learning rate ratio after
            annealing. Defaults to None.
        cool_down_ratio (float): The cooldown ratio. Defaults to 0.1.
        cool_down_time (int): The cooldown time. Defaults to 10.
        by_epoch (bool): If True, the learning rate changes epoch by epoch. If
            False, the learning rate changes iter by iter. Defaults to True.
        warmup (string, optional): Type of warmup used. It can be None (use no
            warmup), 'constant', 'linear' or 'exp'. Defaults to None.
        warmup_iters (int): The number of iterations or epochs that warmup
            lasts. Defaults to 0.
        warmup_ratio (float): LR used at the beginning of warmup equals to
            ``warmup_ratio * initial_lr``. Defaults to 0.1.
        warmup_by_epoch (bool): If True, ``warmup_iters`` means the number of
            epochs that warmup lasts; otherwise it means the number of
            iterations that warmup lasts. Defaults to False.

    Note:
        You need to set one and only one of ``min_lr`` and ``min_lr_ratio``.
    """

    def __init__(self,
                 min_lr=None,
                 min_lr_ratio=None,
                 cool_down_ratio=0.1,
                 cool_down_time=10,
                 **kwargs):
        assert (min_lr is None) ^ (min_lr_ratio is None)
        self.min_lr = min_lr
        self.min_lr_ratio = min_lr_ratio
        self.cool_down_time = cool_down_time
        self.cool_down_ratio = cool_down_ratio
        super(CosineAnnealingCooldownLrUpdaterHook, self).__init__(**kwargs)

    def get_lr(self, runner, base_lr):
        if self.by_epoch:
            progress = runner.epoch
            max_progress = runner.max_epochs
        else:
            progress = runner.iter
            max_progress = runner.max_iters

        if self.min_lr_ratio is not None:
            target_lr = base_lr * self.min_lr_ratio
        else:
            target_lr = self.min_lr
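
        # During the final `cool_down_time` epochs (or iters), hold the
        # learning rate at `target_lr * cool_down_ratio`; before that,
        # cosine-anneal from `base_lr` towards `target_lr`.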
        if progress > max_progress - self.cool_down_time:
            return target_lr * self.cool_down_ratio
        else:
            max_progress = max_progress - self.cool_down_time

        return annealing_cos(base_lr, target_lr, progress / max_progress)


def annealing_cos(start, end, factor, weight=1):
    """Calculate annealing cos learning rate.

    Cosine anneal from `weight * start + (1 - weight) * end` to `end` as
    percentage goes from 0.0 to 1.0.

    Args:
        start (float): The starting learning rate of the cosine annealing.
        end (float): The ending learning rate of the cosine annealing.
        factor (float): The coefficient of `pi` when calculating the current
            percentage. Range from 0.0 to 1.0.
        weight (float, optional): The combination factor of `start` and `end`
            when calculating the actual starting learning rate. Defaults to 1.
    """
    cos_out = cos(pi * factor) + 1
    return end + 0.5 * weight * (start - end) * cos_out
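
A minimal sketch of selecting the new scheduler from a config file: mmcv's runner builds the LR hook from the `lr_config` dict by appending `LrUpdaterHook` to the `policy` name, so `policy='CosineAnnealingCooldown'` resolves to this class. The values below are illustrative, not necessarily those of the T2T-ViT configs:

lr_config = dict(
    policy='CosineAnnealingCooldown',
    min_lr=1e-5,
    cool_down_time=10,
    cool_down_ratio=0.1,
    by_epoch=True,
    warmup='linear',
    warmup_iters=10,
    warmup_ratio=1e-6,
    warmup_by_epoch=True)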

View File

@ -0,0 +1,157 @@
import logging
import shutil
import tempfile

import numpy as np
import pytest
import torch
import torch.nn as nn
from mmcv.runner import build_runner
from mmcv.runner.hooks import Hook, IterTimerHook
from torch.utils.data import DataLoader

import mmcls.core  # noqa: F401


def _build_demo_runner_without_hook(runner_type='EpochBasedRunner',
                                    max_epochs=1,
                                    max_iters=None,
                                    multi_optimziers=False):

    class Model(nn.Module):

        def __init__(self):
            super().__init__()
            self.linear = nn.Linear(2, 1)
            self.conv = nn.Conv2d(3, 3, 3)

        def forward(self, x):
            return self.linear(x)

        def train_step(self, x, optimizer, **kwargs):
            return dict(loss=self(x))

        def val_step(self, x, optimizer, **kwargs):
            return dict(loss=self(x))

    model = Model()

    if multi_optimziers:
        optimizer = {
            'model1':
            torch.optim.SGD(model.linear.parameters(), lr=0.02, momentum=0.95),
            'model2':
            torch.optim.SGD(model.conv.parameters(), lr=0.01, momentum=0.9),
        }
    else:
        optimizer = torch.optim.SGD(model.parameters(), lr=0.02, momentum=0.95)

    tmp_dir = tempfile.mkdtemp()
    runner = build_runner(
        dict(type=runner_type),
        default_args=dict(
            model=model,
            work_dir=tmp_dir,
            optimizer=optimizer,
            logger=logging.getLogger(),
            max_epochs=max_epochs,
            max_iters=max_iters))
    return runner


def _build_demo_runner(runner_type='EpochBasedRunner',
                       max_epochs=1,
                       max_iters=None,
                       multi_optimziers=False):
    log_config = dict(
        interval=1, hooks=[
            dict(type='TextLoggerHook'),
        ])

    runner = _build_demo_runner_without_hook(runner_type, max_epochs,
                                             max_iters, multi_optimziers)

    runner.register_checkpoint_hook(dict(interval=1))
    runner.register_logger_hooks(log_config)
    return runner


class ValueCheckHook(Hook):

    def __init__(self, check_dict, by_epoch=False):
        super().__init__()
        self.check_dict = check_dict
        self.by_epoch = by_epoch

    def after_iter(self, runner):
        if self.by_epoch:
            return
        if runner.iter in self.check_dict:
            for attr, target in self.check_dict[runner.iter].items():
                value = eval(f'runner.{attr}')
                assert np.isclose(value, target), \
                    (f'The value of `runner.{attr}` is {value}, '
                     f'not equal to {target}')

    def after_epoch(self, runner):
        if not self.by_epoch:
            return
        if runner.epoch in self.check_dict:
            for attr, target in self.check_dict[runner.epoch].items():
                value = eval(f'runner.{attr}')
                assert np.isclose(value, target), \
                    (f'The value of `runner.{attr}` is {value}, '
                     f'not equal to {target}')


@pytest.mark.parametrize('multi_optimziers', (True, False))
def test_cosine_cooldown_hook(multi_optimziers):
    """xdoctest -m tests/test_hooks.py test_cosine_cooldown_hook."""
    loader = DataLoader(torch.ones((10, 2)))
    runner = _build_demo_runner(multi_optimziers=multi_optimziers)

    # add cosine annealing LR scheduler with cool down
    hook_cfg = dict(
        type='CosineAnnealingCooldownLrUpdaterHook',
        by_epoch=False,
        cool_down_time=2,
        cool_down_ratio=0.1,
        min_lr_ratio=0.1,
        warmup_iters=2,
        warmup_ratio=0.9)
    runner.register_hook_from_cfg(hook_cfg)
    runner.register_hook(IterTimerHook())

    if multi_optimziers:
        check_hook = ValueCheckHook({
            0: {
                'current_lr()["model1"][0]': 0.02,
                'current_lr()["model2"][0]': 0.01,
            },
            5: {
                'current_lr()["model1"][0]': 0.0075558491,
                'current_lr()["model2"][0]': 0.0037779246,
            },
            9: {
                'current_lr()["model1"][0]': 0.0002,
                'current_lr()["model2"][0]': 0.0001,
            }
        })
    else:
        check_hook = ValueCheckHook({
            0: {
                'current_lr()[0]': 0.02,
            },
            5: {
                'current_lr()[0]': 0.0075558491,
            },
            9: {
                'current_lr()[0]': 0.0002,
            }
        })
    runner.register_hook(check_hook, priority='LOWEST')

    runner.run([loader], [('train', 1)])
    shutil.rmtree(runner.work_dir)
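
The constants asserted above follow directly from `annealing_cos` and the cooldown rule; a standalone sanity check under the test's settings (base LR 0.02, 10 iters, 2 cooldown iters):

from math import cos, pi

base_lr, min_lr_ratio = 0.02, 0.1
max_iters, cool_down_time, cool_down_ratio = 10, 2, 0.1
target_lr = base_lr * min_lr_ratio  # 0.002

# Iter 5 is in the annealing phase, which spans the first
# max_iters - cool_down_time = 8 iterations.
factor = 5 / (max_iters - cool_down_time)
lr_5 = target_lr + 0.5 * (base_lr - target_lr) * (cos(pi * factor) + 1)
print(round(lr_5, 10))  # 0.0075558491, matching the check at iter 5

# Iter 9 (> 8) falls in the cooldown window, where the LR is held
# at target_lr * cool_down_ratio.
print(target_lr * cool_down_ratio)  # 0.0002, matching the check at iter 9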