# InternImage Classification

## Description

This is the implementation of [InternImage](https://arxiv.org/abs/2211.05778) for image classification.

## Usage

### Setup Environment

Please refer to the [Get Started](https://mmpretrain.readthedocs.io/en/latest/get_started.html) documentation of MMPretrain to complete the installation.

Then install DCNv3 by running the commands below, following the [InternImage official installation instructions](https://github.com/OpenGVLab/InternImage/blob/master/classification/README.md).

```shell
cd ops_dcnv3
sh ./make.sh
```

### Training and Test Commands

First, add the current folder to `PYTHONPATH` so that Python can find your model files. From the `projects/internimage_classification/` root directory, run the command below.

```shell
export PYTHONPATH=`pwd`:$PYTHONPATH
```
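
Running the export above several times in the same shell adds the folder repeatedly, and an empty `PYTHONPATH` leaves a trailing colon. As an optional alternative, the sketch below prepends the folder only when it is not already listed. The function name `prepend_pythonpath` is a hypothetical helper introduced here for illustration; it is not part of MMPretrain or mim.

```shell
# Hypothetical helper (not part of MMPretrain): prepend a directory to
# PYTHONPATH unless it is already listed there.
prepend_pythonpath() {
    dir="$1"
    case ":${PYTHONPATH:-}:" in
        *":${dir}:"*) ;;  # already present, nothing to do
        *) PYTHONPATH="${dir}${PYTHONPATH:+:${PYTHONPATH}}" ;;
    esac
    export PYTHONPATH
}

# Run from the projects/internimage_classification/ root directory.
prepend_pythonpath "$(pwd)"
```

Calling the helper a second time with the same directory leaves `PYTHONPATH` unchanged.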

#### Training

##### On Local Single GPU

```bash
# train with mim
mim train mmpretrain ${CONFIG} --work-dir ${WORK_DIR}

# a specific command example
mim train mmpretrain configs/internimage-tiny_8xb128_in1k-224.py \
    --work-dir work_dirs/internimage-tiny_8xb128_in1k-224/
```
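
The config file name in the example appears to follow the MMPretrain naming convention, in which a field such as `8xb128` stands for 8 GPUs with 128 samples per GPU. Training on a different number of GPUs therefore changes the effective batch size. The sketch below reads that field from a config name and prints the total batch size it implies. `total_batch_size` is a hypothetical helper, and the `{gpus}xb{batch_per_gpu}` pattern is an assumption about the naming scheme rather than something the tools enforce.

```bash
# Hypothetical helper: derive the total batch size from a config file name,
# assuming it contains a "<gpus>xb<batch_per_gpu>" field such as "8xb128".
total_batch_size() {
    name="$(basename "$1" .py)"
    # Extract the first "<gpus>xb<batch>" field from the name.
    field="$(printf '%s\n' "$name" | grep -oE '[0-9]+xb[0-9]+' | head -n 1)"
    [ -n "$field" ] || return 1
    gpus="${field%%xb*}"      # text before "xb"
    per_gpu="${field##*xb}"   # text after "xb"
    echo $(( gpus * per_gpu ))
}

total_batch_size configs/internimage-tiny_8xb128_in1k-224.py   # prints 1024
```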

##### On Multiple GPUs

```bash
# train with mim
mim train mmpretrain ${CONFIG} \
    --work-dir ${WORK_DIR} \
    --launcher pytorch --gpus 8
```

##### On Multiple GPUs with Slurm

```bash
# train with mim
mim train mmpretrain ${CONFIG} \
    --work-dir ${WORK_DIR} \
    --launcher slurm --gpus 16 --gpus-per-node 8 \
    --partition ${PARTITION}
```
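
In the Slurm training example above, `--gpus 16 --gpus-per-node 8` asks for 16 GPUs in total with 8 on each node, which corresponds to 2 nodes. The sketch below makes that arithmetic explicit so you can check a request before submitting it. `nodes_needed` is a hypothetical helper, and it assumes `--gpus` is the total GPU count across all nodes.

```bash
# Hypothetical helper: number of nodes implied by the total GPU count
# (--gpus) and the GPUs per node (--gpus-per-node).
nodes_needed() {
    gpus="$1"
    gpus_per_node="$2"
    # Ceiling division, so a partially used node still counts as one node.
    echo $(( (gpus + gpus_per_node - 1) / gpus_per_node ))
}

nodes_needed 16 8   # prints 2
```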

#### Test

Please download the pretrained weights provided by [OpenGVLab](https://github.com/OpenGVLab/) from [here](https://huggingface.co/OpenGVLab/InternImage/tree/main).
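
The model links in the results table of this README all share the prefix `https://huggingface.co/OpenGVLab/InternImage/resolve/main/`, followed by the checkpoint file name. As a small convenience, the sketch below builds such a URL from a file name. `weight_url` is a hypothetical helper, and the URL pattern is taken from those table links rather than from any documented API.

```bash
# Hypothetical helper: build the download URL for a checkpoint file name,
# using the URL prefix shared by the model links in the results table.
weight_url() {
    echo "https://huggingface.co/OpenGVLab/InternImage/resolve/main/$1"
}

# Print the URL of the InternImage-T checkpoint.
weight_url internimage_t_1k_224.pth
```

Assuming `wget` is installed, the file can then be fetched with `wget "$(weight_url internimage_t_1k_224.pth)"`.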

##### On Local Single GPU

```bash
# test with mim
mim test mmpretrain ${CONFIG} -C ${CHECKPOINT}

# a specific command example
mim test mmpretrain configs/internimage-tiny_8xb128_in1k-224.py -C /PATH/TO/internimage_t_1k_224.pth
```

##### On Multiple GPUs

```bash
# test with mim
# a specific command example, using 8 GPUs
mim test mmpretrain configs/internimage-tiny_8xb128_in1k-224.py \
    -C /PATH/TO/internimage_t_1k_224.pth \
    --launcher pytorch --gpus 8
```

##### On Multiple GPUs with Slurm

```bash
# test with mim
mim test mmpretrain ${CONFIG} \
    -C ${CHECKPOINT} \
    --work-dir ${WORK_DIR} \
    --launcher slurm --gpus 8 --gpus-per-node 8 \
    --partition ${PARTITION} \
    $PY_ARGS
```

Note: `PY_ARGS` stands for any other optional arguments.

## Results on ImageNet1K

The accuracy of different models on ImageNet1K:

|      name      | resolution |  acc@1  |  acc@5  |                          config                           |                                              weight                                               |
| :------------: | :--------: | :-----: | :-----: | :-------------------------------------------------------: | :-----------------------------------------------------------------------------------------------: |
| InternImage-T  |    224     | 83.4700 | 96.5340 |  [config](./configs/internimage-tiny_8xb128_in1k-224.py)  |    [model](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_t_1k_224.pth)    |
| InternImage-S  |    224     | 84.1640 | 96.9320 | [config](./configs/internimage-small_8xb128_in1k-224.py)  |    [model](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_s_1k_224.pth)    |
| InternImage-B  |    224     | 84.8660 | 97.1820 |  [config](./configs/internimage-base_8xb128_in1k-224.py)  |    [model](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_b_1k_224.pth)    |
| InternImage-L  |    384     | 87.7060 | 98.3820 | [config](./configs/internimage-large_8xb128_in1k-384.py)  | [model](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_l_22kto1k_384.pth)  |
| InternImage-XL |    384     | 88.0460 | 98.5620 | [config](./configs/internimage-xlagre_8xb128_in1k-384.py) | [model](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_xl_22kto1k_384.pth) |
| InternImage-H  |    640     | 89.5500 | 98.8500 |  [config](./configs/internimage-huge_8xb128_in1k-640.py)  | [model](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_h_22kto1k_640.pth)  |
| InternImage-G  |    512     | 90.0580 | 98.9700 | [config](./configs/internimage-giant_8xb128_in1k-512.py)  | [model](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_g_22kto1k_512.pth)  |

## Citation

```bibtex
@article{wang2022internimage,
  title={InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions},
  author={Wang, Wenhai and Dai, Jifeng and Chen, Zhe and Huang, Zhenhang and Li, Zhiqi and Zhu, Xizhou and Hu, Xiaowei and Lu, Tong and Lu, Lewei and Li, Hongsheng and others},
  journal={arXiv preprint arXiv:2211.05778},
  year={2022}
}
```