
# deep-person-reid
This repo contains [PyTorch](http://pytorch.org/) implementations of deep person re-identification models.

Pretrained models are available.

We will actively maintain this repo to incorporate new models.

## Install
1. `cd` to the folder where you want to download this repo.
2. Run `git clone https://github.com/KaiyangZhou/deep-person-reid`.

## Prepare data
Create a directory under this repo to store the reid datasets:
```bash
cd deep-person-reid/
mkdir data/
```

Market1501 [7]:
1. Download the dataset to `data/` from http://www.liangzheng.org/Project/project_reid.html.
2. Extract the dataset and rename the extracted folder to `market1501`.

MARS [8]:
1. Create a directory named `mars/` under `data/`.
2. Download the dataset to `data/mars/` from http://www.liangzheng.com.cn/Project/project_mars.html.
3. Extract `bbox_train.zip` and `bbox_test.zip`.
4. Download the split information from https://github.com/liangzheng06/MARS-evaluation/tree/master/info and put `info/` in `data/mars/`, so that the standard split in [8] is followed.
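
As a quick sanity check of the steps above, the following sketch lists which dataset folders are still missing. The folder names `bbox_train/` and `bbox_test/` are an assumption about what the MARS zip files extract to, and `missing_dataset_dirs` is a hypothetical helper, not part of this repo.

```python
from pathlib import Path

# Expected layout, assumed from the preparation steps above.
# The names of the extracted MARS folders are an assumption; adjust if yours differ.
EXPECTED_DIRS = [
    "market1501",
    "mars/bbox_train",
    "mars/bbox_test",
    "mars/info",
]


def missing_dataset_dirs(data_root="data"):
    """Return the expected sub-directories that are absent under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dataset directories found.")
```

Run it from the repo root (the folder that contains `data/`).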

## Dataset loaders
Dataset loaders are implemented in `dataset_loader.py`, which defines two main classes that subclass [torch.utils.data.Dataset](http://pytorch.org/docs/master/_modules/torch/utils/data/dataset.html#Dataset):
* `ImageDataset`: processes image-based person reid datasets.
* `VideoDataset`: processes video-based person reid datasets.

Instances of these two classes are passed to [torch.utils.data.DataLoader](http://pytorch.org/docs/master/_modules/torch/utils/data/dataloader.html#DataLoader), which provides batched data.
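
To illustrate the map-style dataset protocol (`__len__` and `__getitem__`) that these classes rely on, here is a minimal sketch in plain Python. `ImageDatasetSketch` and `iter_batches` are hypothetical names, the `(img_path, pid, camid)` tuple format is an assumption, and the real `ImageDataset` in `dataset_loader.py` may differ (it subclasses `torch.utils.data.Dataset` and loads the image from disk).

```python
class ImageDatasetSketch:
    """Map-style dataset over (img_path, pid, camid) tuples.

    Hypothetical stand-in for ImageDataset: it passes the image path
    through instead of reading the image, so it runs without PyTorch.
    """

    def __init__(self, samples, transform=None):
        self.samples = list(samples)
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        img_path, pid, camid = self.samples[index]
        # A real loader would open the image here and apply the transform to it.
        item = img_path if self.transform is None else self.transform(img_path)
        return item, pid, camid


def iter_batches(dataset, batch_size):
    """Yield lists of consecutive samples, roughly what a non-shuffling
    DataLoader does before collating samples into tensors."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]
```

Any object with these two methods can be indexed and batched the same way, which is why the image and video datasets can share the same training loop structure.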

## Models
* `models/ResNet.py`: ResNet50 [1], ResNet50M [2].
* `models/DenseNet.py`: DenseNet121 [3].

## Loss functions
* `xent`: cross entropy + label smoothing regularizer [5].
* `htri`: triplet loss with hard positive/negative mining [4].

We use `Adam` [6] everywhere, which turned out to be the most effective optimizer in our experiments.
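
The two losses listed above can be sketched in plain Python as follows. This is an illustration of the ideas in [5] and [4], not the code used in this repo: the function names are hypothetical, and the defaults `epsilon=0.1` and `margin=0.3` are assumptions.

```python
import math


def label_smooth_xent(logits, target, epsilon=0.1):
    """Cross entropy against a smoothed target distribution for one sample.

    The target distribution is q_k = (1 - epsilon) * [k == target] + epsilon / K.
    """
    num_classes = len(logits)
    # Numerically stable log-softmax.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    log_probs = [z - log_sum for z in logits]
    loss = 0.0
    for k, log_p in enumerate(log_probs):
        q = epsilon / num_classes + (1.0 - epsilon) * (1.0 if k == target else 0.0)
        loss -= q * log_p
    return loss


def batch_hard_triplet(features, pids, margin=0.3):
    """Mean hinge loss over anchors, using the hardest positive and the
    hardest negative found inside the batch (Euclidean distance)."""
    losses = []
    for i, anchor in enumerate(features):
        dists = [math.dist(anchor, other) for other in features]
        pos = [d for j, d in enumerate(dists) if pids[j] == pids[i]]
        neg = [d for j, d in enumerate(dists) if pids[j] != pids[i]]
        if not neg:
            continue  # no other identity in the batch for this anchor
        # Hardest positive: farthest sample of the same identity.
        # Hardest negative: closest sample of a different identity.
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0
```

With `epsilon=0` the first function reduces to ordinary cross entropy. The triplet loss is zero once every anchor is closer to all of its positives than to any negative by at least `margin`.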

## Train
The training code is implemented mainly in:
* `train_img_model_xent.py`: trains an image model with cross entropy loss.
* `train_img_model_xent_htri.py`: trains an image model with a combination of cross entropy loss and hard triplet loss.
* `train_vid_model_xent.py`: trains a video model with cross entropy loss.
* `train_vid_model_xent_htri.py`: trains a video model with a combination of cross entropy loss and hard triplet loss.

For example, to train an image reid model using ResNet50 and cross entropy loss, run
```bash
python train_img_model_xent.py -d market1501 -a resnet50 --max-epoch 60 --train-batch 32 --test-batch 32 --stepsize 20 --eval-step 20 --save-dir log/resnet50-xent-market1501 --gpu-devices 0
```

You should then see output similar to the following:
```bash
==========
Args:Namespace(arch='resnet50', dataset='market1501', eval_step=20, evaluate=False, gamma=0.1, gpu_devices='0', height=256, lr=0.0003, max_epoch=60, print_freq=10, resume='', save_dir='log/resnet50/', seed=1, start_epoch=0, stepsize=20, test_batch=32, train_batch=32, use_cpu=False, weight_decay=0.0005, width=128, workers=4)
==========
Currently using GPU
Initializing dataset market1501
=> Market1501 loaded
Dataset statistics:
  ------------------------------
  subset   | # ids | # images
  ------------------------------
  train    |   751 |    12936
  query    |   750 |     3368
  gallery  |   751 |    15913
  ------------------------------
  total    |  1501 |    32217
  ------------------------------
Initializing model: resnet50
Model size: 25.04683M
==> Epoch 1/60
Batch 10/404     Loss 6.665115 (6.781841)
Batch 20/404     Loss 6.792669 (6.837275)
Batch 30/404     Loss 6.592124 (6.806587)
... ...
==> Epoch 60/60
Batch 10/404     Loss 1.101616 (1.075387)
Batch 20/404     Loss 1.055073 (1.075455)
Batch 30/404     Loss 1.081339 (1.073036)
... ...
==> Test
Extracted features for query set, obtained 3368-by-2048 matrix
Extracted features for gallery set, obtained 15913-by-2048 matrix
Computing distance matrix
Computing CMC and mAP
Results ----------
mAP: 68.8%
CMC curve
Rank-1  : 85.4%
Rank-5  : 94.1%
Rank-10 : 95.9%
Rank-20 : 97.2%
------------------
Finished. Total elapsed time (h:m:s): 1:57:44
```
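
The logged arguments `lr=0.0003`, `stepsize=20` and `gamma=0.1` suggest a step-decay learning-rate schedule. The sketch below assumes the usual semantics (multiply the learning rate by `gamma` every `stepsize` epochs, as `torch.optim.lr_scheduler.StepLR` does); check the training script to confirm that this is what it actually does. `step_decay_lr` is a hypothetical helper.

```python
def step_decay_lr(epoch, base_lr=0.0003, stepsize=20, gamma=0.1):
    """Learning rate at a zero-based epoch under step decay."""
    return base_lr * gamma ** (epoch // stepsize)


if __name__ == "__main__":
    # Print the learning rate at the start of each decay stage for 60 epochs.
    for epoch in (0, 20, 40):
        print(f"epoch {epoch:2d}: lr = {step_decay_lr(epoch):.1e}")
```

Under this assumption, a 60-epoch run with `--stepsize 20` trains in three stages of 20 epochs each, with the learning rate dropping by a factor of 10 between stages.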

Run any training script with `-h` (e.g. `python train_img_model_xent.py -h`) for more details on the arguments.
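
The test phase in the log above reports CMC ranks and mAP. As a rough sketch of how these metrics are defined (not the evaluation code of this repo), assume each query comes with its gallery already sorted by distance and flagged `True` where the identity matches. The standard Market1501 protocol also discards gallery images of the same identity taken by the same camera as the query; that filtering is assumed to have happened before these hypothetical functions are called.

```python
def average_precision(matches):
    """AP for one query; `matches` is the ranked gallery, True where the identity matches."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def evaluate(ranked_matches, ranks=(1, 5, 10, 20)):
    """Return (cmc, mAP) over the queries that have at least one true match."""
    valid = [m for m in ranked_matches if any(m)]
    if not valid:
        raise ValueError("no query has a true match in the gallery")
    cmc = {}
    for k in ranks:
        # Rank-k accuracy: fraction of queries with a correct match in the top k.
        cmc[k] = sum(any(m[:k]) for m in valid) / len(valid)
    mean_ap = sum(average_precision(m) for m in valid) / len(valid)
    return cmc, mean_ap
```

Rank-k only asks whether at least one correct match appears in the top k, while mAP rewards ranking all correct matches early. This is why the two numbers can move differently between models in the tables below.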

## Results
### Image person reid
#### Market1501

| Model | Size (M) | Loss | Rank-1/5/10 (%) | mAP (%) | Model weights | Published Rank | Published mAP |
| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| DenseNet121 | 7.72 | xent | 86.5/93.6/95.7 | 67.8 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/image-models/densenet121_xent_market1501.pth.tar) | | |
| DenseNet121 | 7.72 | xent+htri | 89.5/96.3/97.5 | 72.6 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/image-models/densenet121_xent_htri_market1501.pth.tar) | | |
| ResNet50 | 25.05 | xent | 85.4/94.1/95.9 | 68.8 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/image-models/resnet50_xent_market1501.pth.tar) | 87.3/-/- | 67.6 |
| ResNet50 | 25.05 | xent+htri | 87.5/95.3/97.3 | 72.3 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/image-models/resnet50_xent_htri_market1501.pth.tar) | | |
| ResNet50M | 30.01 | xent | 89.0/95.5/97.3 | 75.0 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/image-models/resnet50m_xent_market1501.pth.tar) | 89.9/-/- | 75.6 |
| ResNet50M | 30.01 | xent+htri | 90.4/96.7/98.0 | 76.6 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/image-models/resnet50m_xent_htri_market1501.pth.tar) | | |

### Video person reid
#### MARS

| Model | Size (M) | Loss | Rank-1/5/10 (%) | mAP (%) | Model weights | Published Rank | Published mAP |
| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ResNet50 | 24.79 | xent | 74.5/88.8/91.8 | 64.0 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/video-models/resnet50_xent_mars.pth.tar) | | |
| ResNet50 | 24.79 | xent+htri | 80.8/92.1/94.3 | 74.0 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/video-models/resnet50_xent_htri_mars.pth.tar) | | |
| ResNet50M | 29.63 | xent | 77.8/89.8/92.8 | 67.5 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/video-models/resnet50m_xent_mars.pth.tar) | | |
| ResNet50M | 29.63 | xent+htri | 82.3/93.8/95.3 | 75.4 | [download](http://www.eecs.qmul.ac.uk/~kz303/deep-person-reid/model-zoo/video-models/resnet50m_xent_htri_mars.pth.tar) | | |

## Test
Say you have downloaded ResNet50 trained with `xent` on `market1501` and saved it to `saved-models/resnet50_xent_market1501.pth.tar` (create the directory for model weights with `mkdir saved-models/`). Then run the following command to test:
```bash
python train_img_model_xent.py -d market1501 -a resnet50 --evaluate --resume saved-models/resnet50_xent_market1501.pth.tar --save-dir log/resnet50-xent-market1501 --test-batch 32
```

Likewise, to test a video reid model, you should have a pretrained model saved under `saved-models/`, e.g. `saved-models/resnet50_xent_mars.pth.tar`. Then run
```bash
python train_vid_model_xent.py -d mars -a resnet50 --evaluate --resume saved-models/resnet50_xent_mars.pth.tar --save-dir log/resnet50-xent-mars --test-batch 2
```
Note that `--test-batch` in video reid represents the number of tracklets. If we set this argument to 2 and sample 15 images per tracklet, the resulting number of images per batch is 2*15=30. Adjust this argument according to your GPU memory.
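
The arithmetic above can be turned into a small helper for picking `--test-batch`. The names `seq_len` (images sampled per tracklet) and `image_budget` (how many images you expect to fit on the GPU at once) are hypothetical and only serve this sketch.

```python
def images_per_batch(test_batch, seq_len=15):
    """Number of images pushed through the model for one batch of tracklets."""
    return test_batch * seq_len


def max_test_batch(image_budget, seq_len=15):
    """Largest --test-batch whose image count stays within image_budget (at least 1)."""
    return max(1, image_budget // seq_len)
```

For example, if a batch of about 100 images fits in memory, `max_test_batch(100)` suggests 6 tracklets, i.e. 90 images per batch.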

## References
[1] [He et al. Deep Residual Learning for Image Recognition. CVPR 2016.](https://arxiv.org/abs/1512.03385) <br />
[2] [Yu et al. The Devil is in the Middle: Exploiting Mid-level Representations for Cross-Domain Instance Matching. arXiv:1711.08106.](https://arxiv.org/abs/1711.08106) <br />
[3] [Huang et al. Densely Connected Convolutional Networks. CVPR 2017.](https://arxiv.org/abs/1608.06993) <br />
[4] [Hermans et al. In Defense of the Triplet Loss for Person Re-Identification. arXiv:1703.07737.](https://arxiv.org/abs/1703.07737) <br />
[5] [Szegedy et al. Rethinking the Inception Architecture for Computer Vision. CVPR 2016.](https://arxiv.org/abs/1512.00567) <br />
[6] [Kingma and Ba. Adam: A Method for Stochastic Optimization. ICLR 2015.](https://arxiv.org/abs/1412.6980) <br />
[7] [Zheng et al. Scalable Person Re-identification: A Benchmark. ICCV 2015.](https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Zheng_Scalable_Person_Re-Identification_ICCV_2015_paper.pdf) <br />
[8] [Zheng et al. MARS: A Video Benchmark for Large-Scale Person Re-identification. ECCV 2016.](http://www.liangzheng.com.cn/Project/project_mars.html) <br />