## MoCo v3 Transfer Learning with ViT

This folder includes the transfer learning experiments on the CIFAR-10, CIFAR-100, Flowers, and Pets datasets. We provide fine-tuning recipes for the ViT-Base model.

### Transfer Results

The following results are based on ImageNet-1k self-supervised pre-training, followed by end-to-end fine-tuning on the downstream datasets. All results are based on a batch size of 128 and 100 fine-tuning epochs.

#### ViT-Base, transfer learning

<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<tr>
<th valign="center">dataset</th>
<th valign="center">pretrain<br/>epochs</th>
<th valign="center">pretrain<br/>crops</th>
<th valign="center">finetune<br/>epochs</th>
<th valign="center">transfer<br/>acc (%)</th>
</tr>
<!-- TABLE BODY -->
<tr>
<td align="left">CIFAR-10</td>
<td align="right">300</td>
<td align="center">2x224</td>
<td align="right">100</td>
<td align="center">98.9</td>
</tr>
<tr>
<td align="left">CIFAR-100</td>
<td align="right">300</td>
<td align="center">2x224</td>
<td align="right">100</td>
<td align="center">90.5</td>
</tr>
<tr>
<td align="left">Flowers</td>
<td align="right">300</td>
<td align="center">2x224</td>
<td align="right">100</td>
<td align="center">97.7</td>
</tr>
<tr>
<td align="left">Pets</td>
<td align="right">300</td>
<td align="center">2x224</td>
<td align="right">100</td>
<td align="center">93.2</td>
</tr>
</tbody></table>

Similar to the end-to-end fine-tuning experiment on ImageNet, the transfer learning results are also obtained using the [DeiT](https://github.com/facebookresearch/deit) repo, with the default model `deit_base_patch16_224`.

### Preparation: Transfer learning with ViT

To perform transfer learning with ViT, use our script to convert the pre-trained ViT checkpoint to the [DeiT](https://github.com/facebookresearch/deit) format:

```
python convert_to_deit.py \
    --input [your checkpoint path]/[your checkpoint file].pth.tar \
    --output [target checkpoint file].pth
```
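
If you want a quick sanity check of the converted file before moving on, a minimal sketch is below; it assumes the converter stores the backbone weights under a `model` key, which is the format that DeiT's `--resume` option reads:

```
import torch

# Inspect the converted checkpoint on CPU.
ckpt = torch.load('[target checkpoint file].pth', map_location='cpu')
print(list(ckpt.keys()))               # expect a 'model' entry (assumption, see above)
print(len(ckpt['model']), 'tensors')   # ViT-Base backbone parameters
print(sorted(ckpt['model'])[:3])       # a few parameter names, e.g. patch embedding / blocks
```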

Then copy (or replace) the following files to the DeiT folder (one way to do this is sketched after the list):

```
datasets.py
oxford_flowers_dataset.py
oxford_pets_dataset.py
```
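
For example, a copy sketch (the two paths are placeholders for wherever this folder and your DeiT checkout actually live):

```
import shutil

src = '[this folder path]'      # the folder containing this README and the dataset files
dst = '[your deit repo path]'   # your clone of https://github.com/facebookresearch/deit

for name in ('datasets.py', 'oxford_flowers_dataset.py', 'oxford_pets_dataset.py'):
    # Overwrites DeiT's own datasets.py; the two Oxford dataset files are new.
    shutil.copy(f'{src}/{name}', f'{dst}/{name}')
```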

#### Download and prepare the datasets

Pets [\[Homepage\]](https://www.robots.ox.ac.uk/~vgg/data/pets/)

```
./data/
└── ./data/pets/
    ├── ./data/pets/annotations/   # split and label files
    └── ./data/pets/images/        # data images
```
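
One way to fetch Pets into that layout (the archive URLs are the ones listed on the homepage above at the time of writing; please verify them there):

```
import tarfile
import urllib.request
from pathlib import Path

root = Path('./data/pets')
root.mkdir(parents=True, exist_ok=True)
for url in ('https://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz',
            'https://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz'):
    archive = root / url.split('/')[-1]
    urllib.request.urlretrieve(url, archive)   # download the archive
    with tarfile.open(archive) as tar:
        tar.extractall(root)                   # creates images/ and annotations/
```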

Flowers [\[Homepage\]](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/)

```
./data/
└── ./data/flowers/
    ├── ./data/flowers/jpg/             # jpg images
    ├── ./data/flowers/setid.mat        # dataset split
    └── ./data/flowers/imagelabels.mat  # labels
```
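
A matching download sketch for Flowers (again, verify the URLs against the homepage above):

```
import tarfile
import urllib.request
from pathlib import Path

root = Path('./data/flowers')
root.mkdir(parents=True, exist_ok=True)
base = 'https://www.robots.ox.ac.uk/~vgg/data/flowers/102/'
for name in ('102flowers.tgz', 'setid.mat', 'imagelabels.mat'):
    urllib.request.urlretrieve(base + name, root / name)
with tarfile.open(root / '102flowers.tgz') as tar:
    tar.extractall(root)   # creates the jpg/ folder
```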

CIFAR-10/CIFAR-100 datasets will be downloaded automatically.
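
This relies on torchvision's built-in CIFAR wrappers; for reference, a minimal sketch of the equivalent manual trigger (assuming the provided datasets.py builds them with `download=True`, which is the usual pattern):

```
from torchvision import datasets

# The first call downloads into the given root; later runs reuse the local copy.
datasets.CIFAR10(root='[cifar-10 data path]', train=True, download=True)
datasets.CIFAR100(root='[cifar-100 data path]', train=True, download=True)
```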

### Transfer learning scripts (with an 8-GPU machine)

#### CIFAR-10

```
python -u -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
    --batch-size 128 --output_dir [your output dir path] --epochs 100 --lr 3e-4 --weight-decay 0.1 --eval-freq 10 \
    --no-pin-mem --warmup-epochs 3 --data-set cifar10 --data-path [cifar-10 data path] --no-repeated-aug \
    --resume [your pretrain checkpoint file] \
    --reprob 0.0 --drop-path 0.1 --mixup 0.8 --cutmix 1
```

#### CIFAR-100

```
python -u -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
    --batch-size 128 --output_dir [your output dir path] --epochs 100 --lr 3e-4 --weight-decay 0.1 --eval-freq 10 \
    --no-pin-mem --warmup-epochs 3 --data-set cifar100 --data-path [cifar-100 data path] --no-repeated-aug \
    --resume [your pretrain checkpoint file] \
    --reprob 0.0 --drop-path 0.1 --mixup 0.5 --cutmix 1
```

#### Flowers

```
python -u -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
    --batch-size 128 --output_dir [your output dir path] --epochs 100 --lr 3e-4 --weight-decay 0.3 --eval-freq 10 \
    --no-pin-mem --warmup-epochs 3 --data-set flowers --data-path [oxford-flowers data path] --no-repeated-aug \
    --resume [your pretrain checkpoint file] \
    --reprob 0.25 --drop-path 0.1 --mixup 0 --cutmix 0
```

#### Pets

```
python -u -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
    --batch-size 128 --output_dir [your output dir path] --epochs 100 --lr 3e-4 --weight-decay 0.1 --eval-freq 10 \
    --no-pin-mem --warmup-epochs 3 --data-set pets --data-path [oxford-pets data path] --no-repeated-aug \
    --resume [your pretrain checkpoint file] \
    --reprob 0 --drop-path 0 --mixup 0.8 --cutmix 0
```

**Note**:
Similar to the ImageNet end-to-end fine-tuning experiment, we use `--resume` rather than `--finetune` in the DeiT repo, as its `--finetune` option trains under eval mode. When loading the pre-trained model, revise `model_without_ddp.load_state_dict(checkpoint['model'])` to pass `strict=False`.
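
As a standalone sketch of what that non-strict load amounts to (the `timm` model name and the `model` checkpoint key are assumptions based on the DeiT defaults used above):

```
import torch
import timm  # DeiT's deit_base_patch16_224 is built on timm's ViT implementation

# Build ViT-Base/16 with a freshly initialized head (e.g. 10 classes for CIFAR-10) and
# load the converted MoCo v3 weights non-strictly, so the missing head weights are skipped.
model = timm.create_model('deit_base_patch16_224', pretrained=False, num_classes=10)
checkpoint = torch.load('[target checkpoint file].pth', map_location='cpu')
msg = model.load_state_dict(checkpoint['model'], strict=False)
print('missing keys:', msg.missing_keys)        # typically only the classification head
print('unexpected keys:', msg.unexpected_keys)  # should be empty
```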