PyTorch SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

Blog post with full documentation: Exploring SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

[Figure: SimCLR architecture]
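As a quick orientation to what the figure shows, here is a minimal, self-contained sketch of the SimCLR objective: two augmented views of each image are encoded, passed through a projection head, and pulled together by the NT-Xent (normalized temperature-scaled cross-entropy) loss. The function and variable names below are illustrative and are not the ones used in simclr.py, and details such as the default temperature may differ from the actual implementation.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.07):
    """Normalized temperature-scaled cross-entropy (NT-Xent) over a batch of view pairs."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, d), unit-norm embeddings
    sim = z @ z.t() / temperature                         # cosine-similarity logits
    # A sample must never be matched with itself.
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool, device=z.device), float('-inf'))
    # The positive for row i is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def simclr_step(encoder, projection_head, optimizer, view1, view2):
    """One optimization step on a batch of two augmented views of the same images."""
    z1 = projection_head(encoder(view1))
    z2 = projection_head(encoder(view2))
    loss = nt_xent_loss(z1, z2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```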

See also the PyTorch implementation of BYOL - Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning.

Installation

$ conda env create --name simclr --file env.yml
$ conda activate simclr
$ python run.py
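
If you prefer pip over conda, the repository also includes a requirements.txt; something along these lines should work (the virtual environment step is optional, and the pinned versions may differ from env.yml):

$ python -m venv simclr-env && source simclr-env/bin/activate
$ pip install -r requirements.txt
$ python run.py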

Config file

Before running SimCLR, make sure you have the right run configuration. You can change the configuration by passing command-line arguments to run.py, for example:


$ python run.py -data ./datasets --dataset-name stl10 --log-every-n-steps 100 --epochs 100 

If you want to run it on CPU (for debugging purposes), use the --disable-cuda option.

For 16-bit precision GPU training, make sure to install NVIDIA apex and use the --fp16_precision flag.
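
Under the hood, flags like the ones above are ordinary command-line options. A simplified sketch of how they might be declared with argparse is shown below; the flag names come from the examples in this README, while the defaults, choices, and any omitted options are illustrative rather than the exact values in run.py.

```python
import argparse
import torch

parser = argparse.ArgumentParser(description='PyTorch SimCLR')
parser.add_argument('-data', metavar='DIR', default='./datasets',
                    help='path to the dataset root')
parser.add_argument('--dataset-name', default='stl10', choices=['stl10', 'cifar10'],
                    help='dataset to train on')
parser.add_argument('--epochs', default=100, type=int,
                    help='number of training epochs')
parser.add_argument('--log-every-n-steps', default=100, type=int,
                    help='logging frequency in optimization steps')
parser.add_argument('--disable-cuda', action='store_true',
                    help='train on CPU only (useful for debugging)')
parser.add_argument('--fp16_precision', action='store_true',
                    help='enable 16-bit precision training')

args = parser.parse_args()
device = torch.device('cuda' if torch.cuda.is_available() and not args.disable_cuda
                      else 'cpu')
```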

Feature Evaluation

Feature evaluation is done using a linear evaluation protocol.

First, we learn features using SimCLR on the STL10 unsupervised set. Then, we train a linear classifier on top of the frozen SimCLR features. The linear model is trained on features extracted from the STL10 train set and evaluated on the STL10 test set.
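
As a rough sketch of this protocol (the helper names below are hypothetical; the actual evaluation code lives in the feature_eval Colab notebook), the pretrained encoder is frozen, features are extracted for the labeled splits, and a single linear layer is trained with Adam on top:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def extract_features(encoder, loader, device):
    """Run the frozen encoder over a labeled split and stack (features, labels)."""
    encoder.eval()
    feats, labels = [], []
    for images, targets in loader:
        feats.append(encoder(images.to(device)).cpu())
        labels.append(targets)
    return torch.cat(feats), torch.cat(labels)

def linear_eval(encoder, train_loader, test_loader, feature_dim=512,
                num_classes=10, epochs=100, device='cuda'):
    """Train a linear classifier on frozen SimCLR features; return top-1 accuracy."""
    X_tr, y_tr = extract_features(encoder, train_loader, device)
    X_te, y_te = extract_features(encoder, test_loader, device)

    clf = nn.Linear(feature_dim, num_classes).to(device)
    optimizer = torch.optim.Adam(clf.parameters(), lr=3e-4)
    criterion = nn.CrossEntropyLoss()

    for _ in range(epochs):                      # full-batch updates, for brevity
        loss = criterion(clf(X_tr.to(device)), y_tr.to(device))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    preds = clf(X_te.to(device)).argmax(dim=1).cpu()
    return (preds == y_te).float().mean().item()
```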

Check the Colab notebook in feature_eval to reproduce these results.

Note that SimCLR benefits from longer training.

| Linear Classification | Dataset | Feature Extractor | Architecture | Feature dimensionality | Projection Head dimensionality | Epochs | Top-1 % |
|---|---|---|---|---|---|---|---|
| Logistic Regression (Adam) | STL10 | SimCLR | ResNet-18 | 512 | 128 | 100 | 70.45 |
| Logistic Regression (Adam) | CIFAR10 | SimCLR | ResNet-18 | 512 | 128 | 100 | 64.82 |
| Logistic Regression (Adam) | STL10 | SimCLR | ResNet-50 | 2048 | 128 | 50 | 67.075 |