Solution of FGIA ACCV 2022 (1st Place)
This is the fine-tuning part of the 1st place solution for Webly-supervised Fine-grained Recognition; see the ACCV workshop competition at https://www.cvmart.net/race/10412/base.
Result
Reproduce
For the detailed self-supervised pre-training code, please refer to Self-supervised Pre-training. For the detailed fine-tuning and inference code, please refer to this repo.
Description
Overview of Our Solution
Our Model
- ViT(MAE-pre-train) # Pretrained with MAE
- Swin-v2(SimMIM-pre-train) # From MMPretrain-swin_transformer_v2.
**The architectures we use** (a rough sketch of the sub-center ArcFace head follows this list)
- ViT + CE-loss + post-LongTail-Adjustment
- ViT + SubCenterArcFaceWithAdvMargin(CE)
- Swin-B + SubCenterArcFaceWithAdvMargin(SoftMax-EQL)
- Swin-L + SubCenterArcFaceWithAdvMargin(SoftMax-EQL)
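To illustrate the sub-center ArcFace idea behind SubCenterArcFaceWithAdvMargin, here is a minimal PyTorch sketch. The class and hyper-parameter names are ours, the adversarial-margin part is omitted, and this is not the head implemented in this repo:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SubCenterArcFace(nn.Module):
    """Illustrative sub-center ArcFace head (not the repo's implementation).

    Each class owns ``k`` sub-centers; a class logit is the best cosine
    similarity over its sub-centers, with an additive angular margin ``m``
    applied to the ground-truth class and a scale ``s``.
    """

    def __init__(self, in_features, num_classes, k=3, s=30.0, m=0.5):
        super().__init__()
        self.k, self.s, self.m = k, s, m
        self.weight = nn.Parameter(torch.empty(num_classes * k, in_features))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, feats, labels):
        # Cosine similarity between L2-normalized features and all sub-centers.
        cos = F.linear(F.normalize(feats), F.normalize(self.weight))
        # Keep the best sub-center per class: [B, num_classes * k] -> [B, num_classes].
        cos = cos.view(cos.size(0), -1, self.k).max(dim=2).values
        # Apply the angular margin to the target class only, then rescale.
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        target = F.one_hot(labels, num_classes=cos.size(1)).bool()
        logits = torch.where(target, torch.cos(theta + self.m), cos)
        return self.s * logits  # feed into CE, SoftMax-EQL, etc.
```

In the actual solution this kind of head sits on top of the ViT/Swin features and is combined with CE or SoftMax-EQL as listed above.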
Self-supervised Pre-training
Requirements
PyTorch 1.11.0
torchvision 0.12.0
CUDA 11.3
MMEngine >= 0.1.0
MMCV >= 2.0.0rc0
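If you want to double-check your environment against these versions, a quick sanity check like the following can help (optional, not part of the repo):

```python
# Optional sanity check: print installed versions and compare with the list above.
import torch
import torchvision
import mmengine
import mmcv

print("PyTorch:", torch.__version__, "| CUDA:", torch.version.cuda)  # expect 1.11.0 / 11.3
print("torchvision:", torchvision.__version__)                       # expect 0.12.0
print("MMEngine:", mmengine.__version__)                             # expect >= 0.1.0
print("MMCV:", mmcv.__version__)                                     # expect >= 2.0.0rc0
```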
Preparing the dataset
First, you should reorganize your dataset folder into the following format:
mmpretrain
|
|── data
| |── WebiNat5000
| | |── meta
| | | |── train.txt
| | |── train
| | |── testa
| | |── testb
The `train`, `testa`, and `testb` folders contain the same content as those provided on the official website of the competition.
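Before training, you can optionally verify the layout above with a small script like this (purely illustrative, not part of the repo):

```python
from pathlib import Path

# Expected layout relative to the mmpretrain root (see the tree above).
root = Path("data/WebiNat5000")
expected = ["meta/train.txt", "train", "testa", "testb"]

for rel in expected:
    path = root / rel
    print(f"{path}: {'ok' if path.exists() else 'MISSING'}")
```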
Start pre-training
First, install all the requirements above, following this page. Then change your current directory to the root of MMPretrain:
cd $MMPretrain
Then you have the following two choices to start pre-training:
Slurm
If you have a cluster managed by Slurm, you can use the following command:
## we use 16 NVIDIA 80G A100 GPUs for pre-training
GPUS_PER_NODE=8 GPUS=16 SRUN_ARGS=${SRUN_ARGS} bash tools/slurm_train.sh ${PARTITION} ${JOB_NAME} projects/fgia_accv2022_1st/config/mae_vit-large-p16_8xb512-amp-coslr-1600e_in1k.py [optional arguments]
PyTorch
Or you can use the following two commands to start distributed training on two separate nodes:
# node 1
NNODES=2 NODE_RANK=0 PORT=${MASTER_PORT} MASTER_ADDR=${MASTER_ADDR} bash tools/dist_train.sh projects/fgia_accv2022_1st/config/mae_vit-large-p16_8xb512-amp-coslr-1600e_in1k.py 8
# node 2
NNODES=2 NODE_RANK=1 PORT=${MASTER_PORT} MASTER_ADDR=${MASTER_ADDR} bash tools/dist_train.sh projects/fgia_accv2022_1st/config/mae_vit-large-p16_8xb512-amp-coslr-1600e_in1k.py 8
All the logs and checkpoints will be saved under the `work_dirs` folder in the root.
Fine-tuning with a bag of tricks
- MAE | Config
- Swinv2 | Config
- ArcFace | Code
- SubCenterArcFaceWithAdvMargin | Code
- Post-LT-adjustment | Code
- SoftMaxEQL | Code
- FlipTTA | Code
- dataset cleaning
- self-ensemble: Uniform-model-soup | Code (see the sketch after this list)
- pseudo-labeling | Code
- bagging-ensemble | Code
- post-processing: re-distribute-label
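For reference, a uniform model soup simply averages the weights of several fine-tuned checkpoints of the same architecture. A minimal sketch, assuming all checkpoints share identical keys and MMEngine-style `state_dict` wrapping; the function and file names are illustrative, not from this repo:

```python
import torch


def uniform_model_soup(ckpt_paths, out_path="soup.pth"):
    """Average the parameters of several checkpoints of the same model."""
    soup = None
    for path in ckpt_paths:
        state = torch.load(path, map_location="cpu")
        state = state.get("state_dict", state)  # unwrap MMEngine-style checkpoints
        if soup is None:
            soup = {k: v.clone().float() for k, v in state.items()}
        else:
            for k in soup:
                soup[k] += state[k].float()
    # In practice you may want to skip non-floating-point buffers here.
    soup = {k: v / len(ckpt_paths) for k, v in soup.items()}
    torch.save({"state_dict": soup}, out_path)
    return out_path


# Example usage with hypothetical checkpoint names:
# uniform_model_soup(["epoch_95.pth", "epoch_100.pth", "epoch_105.pth"])
```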
Used, but with no improvement
- Using a retrieval paradigm to solve this classification task;
- Using an EfficientNetV2 backbone.
Not used but worth trying
- Try the DiVE algorithm to improve performance on the long-tailed dataset;
- Use SimMIM to pre-train Swin-v2 on the competition dataset;
- Refine the re-distribute-label tool.