diff --git a/README.md b/README.md
index 79de56e..415a546 100644
--- a/README.md
+++ b/README.md
@@ -32,6 +32,7 @@ This is the *default* setting for most hyper-parameters. With a batch size of 40
 On the first node, run:
 ```
 python main_moco.py \
+  --moco-m-cos \
   --dist-url 'tcp://[your node 1 address]:[specified port]' \
   --multiprocessing-distributed --world-size 2 --rank 0 \
   [your imagenet-folder with train and val folders]
@@ -100,7 +101,19 @@ python main_lincls.py \
 ```
 
-### Reference Setups
+### End-to-End Classification
+
+To perform end-to-end fine-tuning for ImageNet classification, first convert the pre-trained checkpoints to [DeiT](https://github.com/facebookresearch/deit) format:
+```
+python convert_to_deit.py \
+  --input [your checkpoint path]/[your checkpoint file].pth.tar \
+  --output [target checkpoint file].pth
+```
+Then use `[target checkpoint file].pth` to initialize the weights in DeiT.
+
+With 100-epoch fine-tuning, the reference top-1 classification accuracy is 82.8%; with 300-epoch fine-tuning, it is 83.2%.
+
+### Reference Setups and Models
 
 For longer pre-trainings with ResNet-50, we find the following hyper-parameters work well (expected performance in the last column, will update logs/pre-trained models soon):
@@ -111,27 +124,31 @@ For longer pre-trainings with ResNet-50, we find the following hyper-parameters
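The conversion step introduced in this patch essentially re-keys the checkpoint's state dict so a self-supervised pre-trained backbone can initialize a plain classifier. As a rough sketch (this is not the repository's actual `convert_to_deit.py`; the `module.base_encoder.` wrapper prefix and the toy key names are assumptions), the core remapping might look like:

```python
# Hypothetical sketch of the key remapping a conversion script like
# convert_to_deit.py could perform: keep only the backbone weights from the
# pre-training checkpoint and strip the wrapper prefix so the keys match a
# plain DeiT/ViT state dict. The "module.base_encoder." prefix is an
# assumption, not the confirmed checkpoint layout.

def strip_moco_prefix(state_dict, prefix="module.base_encoder."):
    """Return a new dict with only the keys under `prefix`, re-keyed without it."""
    return {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}

# Toy example with string placeholders standing in for real tensors:
ckpt = {
    "module.base_encoder.patch_embed.proj.weight": "w0",
    "module.base_encoder.blocks.0.attn.qkv.weight": "w1",
    "module.predictor.0.weight": "w2",  # projector/predictor heads are dropped
}
converted = strip_moco_prefix(ckpt)
print(sorted(converted))  # ['blocks.0.attn.qkv.weight', 'patch_embed.proj.weight']
```

In a real script the input would be loaded with `torch.load(..., map_location="cpu")` and the remapped dict saved with `torch.save`, so that the resulting `.pth` file can be passed to DeiT's fine-tuning entry point as an initial checkpoint.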