Solution of FGIA ACCV 2022 (1st Place)
This is the fine-tuning part of the 1st place solution for Webly-supervised Fine-grained Recognition; see the ACCV workshop competition at https://www.cvmart.net/race/10412/base.
Result
Reproduce
For the detailed self-supervised pre-training code, please refer to Self-supervised Pre-training. For the detailed fine-tuning and inference code, please refer to this repo.
Description
Overview of Our Solution
Our Model
- ViT(MAE-pre-train) # Pretrained with MAE
- Swin-v2(SimMIM-pre-train) # From MMPretrain-swin_transformer_v2.
**The architectures we use** (a rough sketch of the sub-center ArcFace head follows this list)
- ViT + CE-loss + post-LongTail-Adjustment
- ViT + SubCenterArcFaceWithAdvMargin(CE)
- Swin-B + SubCenterArcFaceWithAdvMargin(SoftMax-EQL)
- Swin-L + SubCenterArcFaceWithAdvMargin(SoftMax-EQL)
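To illustrate the sub-center ArcFace idea behind SubCenterArcFaceWithAdvMargin, here is a minimal PyTorch sketch. The class and hyper-parameter names are ours, the adversarial-margin part is omitted, and this is not the head implemented in this repo:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SubCenterArcFace(nn.Module):
    """Illustrative sub-center ArcFace head (not the repo's implementation).

    Each class owns ``k`` sub-centers; a class logit is the best cosine
    similarity over its sub-centers, with an additive angular margin ``m``
    applied to the ground-truth class and a scale ``s``.
    """

    def __init__(self, in_features, num_classes, k=3, s=30.0, m=0.5):
        super().__init__()
        self.k, self.s, self.m = k, s, m
        self.weight = nn.Parameter(torch.empty(num_classes * k, in_features))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, feats, labels):
        # Cosine similarity between L2-normalized features and all sub-centers.
        cos = F.linear(F.normalize(feats), F.normalize(self.weight))
        # Keep the best sub-center per class: [B, num_classes * k] -> [B, num_classes].
        cos = cos.view(cos.size(0), -1, self.k).max(dim=2).values
        # Apply the angular margin to the target class only, then rescale.
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        target = F.one_hot(labels, num_classes=cos.size(1)).bool()
        logits = torch.where(target, torch.cos(theta + self.m), cos)
        return self.s * logits  # feed into CE, SoftMax-EQL, etc.
```

In the actual solution this kind of head sits on top of the ViT/Swin features and is combined with CE or SoftMax-EQL as listed above.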
Self-supervised Pre-training
Requirements
PyTorch 1.11.0
torchvision 0.12.0
CUDA 11.3
MMEngine >= 0.1.0
MMCV >= 2.0.0rc0
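If you want to double-check your environment against these versions, a quick sanity check like the following can help (optional, not part of the repo):

```python
# Optional sanity check: print installed versions and compare with the list above.
import torch
import torchvision
import mmengine
import mmcv

print("PyTorch:", torch.__version__, "| CUDA:", torch.version.cuda)  # expect 1.11.0 / 11.3
print("torchvision:", torchvision.__version__)                       # expect 0.12.0
print("MMEngine:", mmengine.__version__)                             # expect >= 0.1.0
print("MMCV:", mmcv.__version__)                                     # expect >= 2.0.0rc0
```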
Preparing the dataset
First, you should reorganize your dataset folder into the following format:
mmpretrain
|
|── data
| |── WebiNat5000
| | |── meta
| | | |── train.txt
| | |── train
| | |── testa
| | |── testb
The `train`, `testa`, and `testb` folders contain the same content as those provided on the official website of the competition.
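Before training, you can optionally verify the layout above with a small script like this (purely illustrative, not part of the repo):

```python
from pathlib import Path

# Expected layout relative to the mmpretrain root (see the tree above).
root = Path("data/WebiNat5000")
expected = ["meta/train.txt", "train", "testa", "testb"]

for rel in expected:
    path = root / rel
    print(f"{path}: {'ok' if path.exists() else 'MISSING'}")
```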
Start pre-training
First, install all the requirements above, following this page. Then change your current directory to the root of MMPretrain:
cd $MMPretrain
Then you have the following two choices to start pre-training:
Slurm
If you have a cluster managed by Slurm, you can use the following command:
## we use 16 NVIDIA 80G A100 GPUs for pre-training
GPUS_PER_NODE=8 GPUS=16 SRUN_ARGS=${SRUN_ARGS} bash tools/slurm_train.sh ${PARTITION} ${JOB_NAME} projects/fgia_accv2022_1st/config/mae_vit-large-p16_8xb512-amp-coslr-1600e_in1k.py [optional arguments]
PyTorch
Or you can use the following two commands to start distributed training on two separate nodes:
# node 1
NNODES=2 NODE_RANK=0 PORT=${MASTER_PORT} MASTER_ADDR=${MASTER_ADDR} bash tools/dist_train.sh projects/fgia_accv2022_1st/config/mae_vit-large-p16_8xb512-amp-coslr-1600e_in1k.py 8
# node 2
NNODES=2 NODE_RANK=1 PORT=${MASTER_PORT} MASTER_ADDR=${MASTER_ADDR} bash tools/dist_train.sh projects/fgia_accv2022_1st/config/mae_vit-large-p16_8xb512-amp-coslr-1600e_in1k.py 8
All the logs and checkpoints will be saved under the `work_dirs` folder in the root.
Fine-tuning with a bag of tricks
- MAE | Config
- Swinv2 | Config
- ArcFace | Code
- SubCenterArcFaceWithAdvMargin | Code
- Post-LT-adjustment | Code
- SoftMaxEQL | Code
- FlipTTA | Code
- dataset cleaning
- self-ensemble: Uniform-model-soup | Code (see the sketch after this list)
- pseudo-labeling | Code
- bagging-ensemble | Code
- post-processing: re-distribute-label
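For reference, a uniform model soup simply averages the weights of several fine-tuned checkpoints of the same architecture. A minimal sketch, assuming all checkpoints share identical keys and MMEngine-style `state_dict` wrapping; the function and file names are illustrative, not from this repo:

```python
import torch


def uniform_model_soup(ckpt_paths, out_path="soup.pth"):
    """Average the parameters of several checkpoints of the same model."""
    soup = None
    for path in ckpt_paths:
        state = torch.load(path, map_location="cpu")
        state = state.get("state_dict", state)  # unwrap MMEngine-style checkpoints
        if soup is None:
            soup = {k: v.clone().float() for k, v in state.items()}
        else:
            for k in soup:
                soup[k] += state[k].float()
    # In practice you may want to skip non-floating-point buffers here.
    soup = {k: v / len(ckpt_paths) for k, v in soup.items()}
    torch.save({"state_dict": soup}, out_path)
    return out_path


# Example usage with hypothetical checkpoint names:
# uniform_model_soup(["epoch_95.pth", "epoch_100.pth", "epoch_105.pth"])
```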
Used, but with no improvement
- Using a retrieval paradigm to solve this classification task;
- Using an EfficientNetV2 backbone.
Not used but worth trying
- Try the DiVE algorithm to improve performance on the long-tailed dataset;
- Use SimMIM to pre-train Swin-v2 on the competition dataset;
- Refine the re-distribute-label tool.