
sennnnn d35fbbdb47 [Enhancement] Add Dev tools to boost develop (#798)	2021-09-02 09:44:51 -07:00
* Modify default work dir when training.
* Refactor gather_models.py.
* Add train and test matching list.
* Regression benchmark list.
* lower readme name to upper readme name.
* Add url check tool and model inference test tool.
* Modify tool name.
* Support duplicate mode of log json url check.
* Add regression benchmark evaluation automatic tool.
* Add train script generator.
* Only Support script running.
* Add evaluation results gather.
* Add exec Authority.
* Automatically make checkpoint root folder.
* Modify gather results save path.
* Coarse-grained train results gather tool.
* Complete benchmark train script.
* Make some little modifications.
* Fix checkpoint urls.
* Fix unet checkpoint urls.
* Fix fast scnn & fcn checkpoint url.
* Fix fast scnn checkpoint urls.
* Fix fast scnn url.
* Add differential results calculation.
* Add differential results of regression benchmark train results.
* Add an extra argument to select model.
* Update nonlocal_net & hrnet checkpoint url.
* Fix checkpoint url of hrnet and Fix some tta evaluation results and modify gather models tool.
* Modify fast scnn checkpoint url.
* Resolve new comments.
* Fix url check status code bug.
* Resolve some comments.
* Modify train scripts generator.
* Modify work_dir of regression benchmark results.
* model gather tool modification.
README.md	[Enhancement] Add Dev tools to boost develop (#798)	2021-09-02 09:44:51 -07:00
upernet_deit-b16_512x512_80k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_deit-b16_512x512_160k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_deit-b16_ln_mln_512x512_160k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_deit-b16_mln_512x512_160k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_deit-s16_512x512_80k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_deit-s16_512x512_160k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_deit-s16_ln_mln_512x512_160k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_deit-s16_mln_512x512_160k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_vit-b16_ln_mln_512x512_160k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_vit-b16_mln_512x512_80k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
upernet_vit-b16_mln_512x512_160k_ade20k.py	[Enhancement] Delete convert function and add instruction to ViT/Swin README.md (#791)	2021-08-25 15:00:41 -07:00
vit.yml	[Enhancement] Add Dev tools to boost develop (#798)	2021-09-02 09:44:51 -07:00

README.md

Vision Transformer

Introduction

@article{dosovitskiy2020,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  journal={arXiv preprint arXiv:2010.11929},
  year={2020}
}

Usage

To use pre-trained models from other repositories, the checkpoint keys must first be converted.

We provide a script, vit2mmseg.py, in the tools/model_converters directory to convert the keys of timm models to MMSegmentation style.

python tools/model_converters/vit2mmseg.py ${PRETRAIN_PATH} ${STORE_PATH}

E.g.

python tools/model_converters/vit2mmseg.py https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_224-80ecf9dd.pth pretrain/jx_vit_base_p16_224-80ecf9dd.pth

This script converts the model at PRETRAIN_PATH and stores the converted model at STORE_PATH.
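To make "converting keys" concrete, here is a minimal sketch of the general idea: renaming the prefixes of checkpoint keys so they match the names another codebase expects. It is not the code of vit2mmseg.py. The names RENAME_RULES and convert_keys and the two rename rules are assumptions chosen for illustration only, and plain strings stand in for tensors so the snippet runs without PyTorch.

```python
from collections import OrderedDict

# Hypothetical rename rules, for illustration only. The actual rules are
# defined in tools/model_converters/vit2mmseg.py and may differ.
RENAME_RULES = [
    ("patch_embed.proj.", "patch_embed.projection."),
    ("blocks.", "layers."),
]


def convert_keys(state_dict, rules=RENAME_RULES):
    """Return a copy of state_dict with matching key prefixes renamed."""
    converted = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in rules:
            if key.startswith(old_prefix):
                # Swap only the prefix and keep the rest of the key.
                new_key = new_prefix + key[len(old_prefix):]
                break
        converted[new_key] = value
    return converted


# Toy checkpoint: strings stand in for weight tensors.
example = OrderedDict([
    ("patch_embed.proj.weight", "w0"),
    ("blocks.0.norm1.weight", "w1"),
    ("cls_token", "w2"),
])
result = convert_keys(example)
```

Keys that match no rule, such as cls_token here, are copied unchanged, and the values themselves are never modified. Only the names change.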

Results and models

ADE20K

Method	Backbone	Crop Size	Lr schd	Mem (GB)	Inf time (fps)	mIoU	mIoU(ms+flip)	config	download
UPerNet	ViT-B + MLN	512x512	80000	9.20	6.94	47.71	49.51	config	model | log
UPerNet	ViT-B + MLN	512x512	160000	9.20	7.58	46.75	48.46	config	model | log
UPerNet	ViT-B + LN + MLN	512x512	160000	9.21	6.82	47.73	49.95	config	model | log
UPerNet	DeiT-S	512x512	80000	4.68	29.85	42.96	43.79	config	model | log
UPerNet	DeiT-S	512x512	160000	4.68	29.19	42.87	43.79	config	model | log
UPerNet	DeiT-S + MLN	512x512	160000	5.69	11.18	43.82	45.07	config	model | log
UPerNet	DeiT-S + LN + MLN	512x512	160000	5.69	12.39	43.52	45.01	config	model | log
UPerNet	DeiT-B	512x512	80000	7.75	9.69	45.24	46.73	config	model | log
UPerNet	DeiT-B	512x512	160000	7.75	10.39	45.36	47.16	config	model | log
UPerNet	DeiT-B + MLN	512x512	160000	9.21	7.78	45.46	47.16	config	model | log
UPerNet	DeiT-B + LN + MLN	512x512	160000	9.21	7.75	45.37	47.23	config	model | log