# PyTorch SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
### Blog post with full documentation: [Exploring SimCLR: A Simple Framework for Contrastive Learning of Visual Representations](https://sthalles.github.io/simple-self-supervised-learning/)
![Image of SimCLR Arch](https://sthalles.github.io/assets/contrastive-self-supervised/cover.png)
### See also [PyTorch Implementation for BYOL - Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning](https://github.com/sthalles/PyTorch-BYOL).
## Installation
```
$ conda env create --name simclr --file env.yml
$ conda activate simclr
$ python run.py
```
## Config file
Before running SimCLR, make sure you choose the desired configuration. You can change it by passing command-line arguments to ```run.py```, for example:
```
$ python run.py -data ./datasets --dataset-name stl10 --log-every-n-steps 100 --epochs 100
```
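As a loose illustration, the sketch below shows how arguments like these might be declared with ```argparse```; the names mirror the example command, but the actual ```run.py``` defines its own defaults and additional options.

```python
# Sketch only: an argparse setup matching the example command above.
# The real run.py may use different defaults and extra options.
import argparse

parser = argparse.ArgumentParser(description='PyTorch SimCLR')
parser.add_argument('-data', metavar='DIR', default='./datasets',
                    help='path to the dataset root')
parser.add_argument('--dataset-name', default='stl10',
                    choices=['stl10', 'cifar10'],
                    help='dataset to train on')
parser.add_argument('--log-every-n-steps', default=100, type=int,
                    help='how often to log training metrics')
parser.add_argument('--epochs', default=100, type=int,
                    help='number of training epochs')

args = parser.parse_args()
print(args)
```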
If you want to run it on CPU (for debugging purposes), use the ```--disable-cuda``` option.

For 16-bit precision GPU training, make sure to install [NVIDIA apex](https://github.com/NVIDIA/apex) and use the ```--fp16_precision``` flag.
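As a rough sketch (not the repository's actual training loop), a flag like ```--fp16_precision``` would typically gate Apex's AMP wrapper around the model and optimizer; the model, optimizer, and loss below are placeholders.

```python
# Sketch only: the usual Apex AMP pattern that a --fp16_precision flag might
# enable. The model, optimizer, and loss here are placeholders.
import torch
import torchvision
from apex import amp  # requires NVIDIA apex and a CUDA-capable GPU

model = torchvision.models.resnet18(num_classes=128).cuda()  # 128-d output, like the projection head
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# opt_level="O1" runs whitelisted ops in fp16 and keeps the rest in fp32.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

images = torch.randn(8, 3, 96, 96).cuda()   # dummy STL10-sized batch
loss = model(images).pow(2).mean()          # dummy loss for illustration

# Scale the loss so fp16 gradients do not underflow, then step as usual.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```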
## Feature Evaluation
Feature evaluation is done using a linear evaluation protocol.

First, we learn features with SimCLR on the ```STL10 unsupervised``` set. Then, we train a linear classifier on top of the frozen SimCLR features. The linear model is trained on features extracted from the ```STL10 train``` set and evaluated on the ```STL10 test``` set.
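As a rough sketch of this protocol (the notebook linked below is the reference implementation), one might load a pre-trained ResNet-18 backbone, extract frozen features, and fit a linear classifier on them. The checkpoint path and state-dict layout here are placeholders, and scikit-learn's logistic regression stands in for the Adam-trained classifier reported in the results table below.

```python
# Sketch only: linear evaluation on frozen SimCLR features.
# 'checkpoint.pth' and its state_dict layout are placeholders.
import torch
import torchvision
from torchvision import transforms
from sklearn.linear_model import LogisticRegression

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load a ResNet-18 backbone and drop its final layer, keeping the
# 512-dimensional features reported in the table below.
backbone = torchvision.models.resnet18()
state = torch.load('checkpoint.pth', map_location=device)  # placeholder path
backbone.load_state_dict(state, strict=False)              # key layout may differ per checkpoint
backbone.fc = torch.nn.Identity()
backbone.to(device).eval()

@torch.no_grad()
def extract_features(loader):
    feats, labels = [], []
    for images, targets in loader:
        feats.append(backbone(images.to(device)).cpu())
        labels.append(targets)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

# Frozen features from the STL10 train split fit the linear model;
# the STL10 test split measures top-1 accuracy.
to_tensor = transforms.ToTensor()
train_set = torchvision.datasets.STL10('./datasets', split='train', download=True, transform=to_tensor)
test_set = torchvision.datasets.STL10('./datasets', split='test', download=True, transform=to_tensor)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=256)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=256)

x_train, y_train = extract_features(train_loader)
x_test, y_test = extract_features(test_loader)

clf = LogisticRegression(max_iter=1000).fit(x_train, y_train)
print('Top-1 accuracy:', clf.score(x_test, y_test))
```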
Check the [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/sthalles/SimCLR/blob/simclr-refactor/feature_eval/mini_batch_logistic_regression_evaluator.ipynb) notebook for reproducibility.

Note that SimCLR benefits from **longer training**.

| Linear Classification      | Dataset | Feature Extractor | Architecture                                                                     | Feature dimensionality | Projection Head dimensionality | Epochs | Top-1 accuracy (%) |
|----------------------------|---------|-------------------|----------------------------------------------------------------------------------|------------------------|--------------------------------|--------|--------------------|
| Logistic Regression (Adam) | STL10   | SimCLR            | [ResNet-18](https://drive.google.com/open?id=14_nH2FkyKbt61cieQDiSbBVNP8-gtwgF)  | 512                    | 128                            | 100    | 70.45              |
| Logistic Regression (Adam) | CIFAR10 | SimCLR            | [ResNet-18](https://drive.google.com/open?id=1lc2aoVtrAetGn0PnTkOyFzPCIucOJq7C)  | 512                    | 128                            | 100    | 64.82              |
| Logistic Regression (Adam) | STL10   | SimCLR            | [ResNet-50](https://drive.google.com/open?id=1ByTKAUsdm_X7tLcii6oAEl5qFRqRMZSu)  | 2048                   | 128                            | 50     | 67.075             |