Torchreid: Deep learning person re-identification in PyTorch.

deep-person-reid

This repo contains PyTorch implementations of deep person re-identification (reid) approaches.

This repo will be actively maintained.

Install

cd to the folder where you want to download this repo.
run git clone https://github.com/KaiyangZhou/deep-person-reid.

Prepare data

Create a directory inside this repo to store the reid datasets:

cd deep-person-reid/
mkdir data/

Market1501 [7]:

download the dataset to data/ from http://www.liangzheng.org/Project/project_reid.html.
extract the dataset and rename the extracted folder to market1501.

MARS [8]:

create a directory named mars/ under data/.
download dataset to data/mars/ from http://www.liangzheng.com.cn/Project/project_mars.html.
extract bbox_train.zip and bbox_test.zip.
download the split information from https://github.com/liangzheng06/MARS-evaluation/tree/master/info and put info/ in data/mars/ (this follows the standard split in [8]).
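Before training, it can help to confirm that the folders described above are in place. The following is a minimal sketch, not part of this repo: the helper name missing_dataset_dirs is hypothetical, and it assumes that bbox_train.zip and bbox_test.zip extract to bbox_train/ and bbox_test/.

```python
from pathlib import Path

# Directory layout described in "Prepare data", relative to data/.
# Assumption: the MARS zip files extract to folders of the same name.
EXPECTED_DIRS = (
    "market1501",
    "mars/bbox_train",
    "mars/bbox_test",
    "mars/info",
)


def missing_dataset_dirs(data_root="data"):
    """Return the expected dataset directories that do not exist yet."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs("data")
    if missing:
        print("missing directories under data/:", ", ".join(missing))
    else:
        print("dataset layout looks complete")
```

An empty list means every directory named in the steps above was found.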

Dataset loaders

These are implemented in dataset_loader.py where we have two main classes that subclass torch.utils.data.Dataset:

ImageDataset: processes image-based person reid datasets.
VideoDataset: processes video-based person reid datasets.

Instances of these two classes are passed to torch.utils.data.DataLoader, which provides batched data.
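To illustrate the idea, here is a minimal sketch of a map-style dataset in the spirit of ImageDataset. It is not the code in dataset_loader.py: the class name ImageDatasetSketch, the (img_path, pid, camid) tuple layout, and the loader argument are assumptions made for illustration. The import falls back to a plain object so that the sketch also runs where PyTorch is not installed.

```python
try:
    from torch.utils.data import Dataset
except ImportError:  # fallback so the sketch runs without PyTorch
    Dataset = object


class ImageDatasetSketch(Dataset):
    """Map-style dataset over (img_path, pid, camid) tuples (assumed layout)."""

    def __init__(self, samples, loader, transform=None):
        self.samples = samples      # list of (img_path, pid, camid)
        self.loader = loader        # callable that reads an image from a path
        self.transform = transform  # optional callable applied to the image

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        img_path, pid, camid = self.samples[index]
        img = self.loader(img_path)
        if self.transform is not None:
            img = self.transform(img)
        return img, pid, camid
```

With PyTorch available, such an object can be wrapped as torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True) to obtain batches, provided the loader and transform return tensors.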

Models

models/ResNet.py: ResNet50 [1], ResNet50M [2].
models/DenseNet.py: DenseNet121 [3].

Loss functions

xent: cross entropy + label smoothing regularizer [5].
htri: triplet loss with hard positive/negative mining [4].
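As a rough illustration of the xent entry, the label smoothing regularizer of [5] replaces the one-hot target with a mixture of the one-hot target and a uniform distribution over the K classes. The sketch below computes this loss for a single sample in plain Python so that the arithmetic is explicit. It is not the code in losses.py; the function name and the value epsilon=0.1 are assumptions.

```python
import math


def label_smoothing_xent(logits, target, epsilon=0.1):
    """Cross entropy against a smoothed target, for one sample.

    logits: list of raw class scores; target: index of the true class.
    epsilon is the smoothing weight (0.1 here is an assumed value).
    """
    num_classes = len(logits)
    # Numerically stable log-softmax.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    loss = 0.0
    for k, log_p in enumerate(log_probs):
        # Smoothed target: eps / K on every class, plus (1 - eps) on the true class.
        q = epsilon / num_classes
        if k == target:
            q += 1.0 - epsilon
        loss -= q * log_p
    return loss
```

With epsilon=0 this reduces to the usual cross entropy. With epsilon>0, a very confident prediction is penalized slightly, which is the regularizing effect.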

We use Adam [6] everywhere, which turned out to be the most effective optimizer in our experiments.
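For the htri loss listed above, the batch-hard formulation in [4] picks, for each anchor in a batch, the farthest sample with the same identity and the closest sample with a different identity. The following plain-Python sketch shows that mining step. It is not the code in losses.py: the function name and margin=0.3 are assumptions, and it presumes each identity appears more than once in the batch.

```python
import math


def batch_hard_triplet_loss(features, pids, margin=0.3):
    """Mean batch-hard triplet loss over all anchors in a batch.

    features: list of feature vectors; pids: identity label per vector.
    margin=0.3 is an assumed value, not taken from this repo.
    """

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    losses = []
    n = len(features)
    for i in range(n):
        # Positives include the anchor itself (distance 0), which never
        # wins the max unless the identity has no other sample.
        pos = [dist(features[i], features[j]) for j in range(n) if pids[j] == pids[i]]
        neg = [dist(features[i], features[j]) for j in range(n) if pids[j] != pids[i]]
        if not neg:
            continue  # no negative in the batch for this anchor
        hardest_pos = max(pos)  # farthest sample of the same identity
        hardest_neg = min(neg)  # closest sample of another identity
        losses.append(max(0.0, margin + hardest_pos - hardest_neg))
    return sum(losses) / len(losses) if losses else 0.0
```

The loss is zero once every anchor is closer to its farthest positive than to its closest negative by at least the margin.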

Train

The training code is implemented mainly in:

train_img_model_xent.py: train image model with cross entropy loss.
train_img_model_xent_htri.py: train image model with combination of cross entropy loss and hard triplet loss.
train_vid_model_xent.py: train video model with cross entropy loss.
train_vid_model_xent_htri.py: train video model with combination of cross entropy loss and hard triplet loss.

For example, to train an image reid model using ResNet50 and cross entropy loss, run

python train_img_model_xent.py -d market1501 -a resnet50 --max-epoch 60 --train-batch 32 --test-batch 32 --stepsize 20 --eval-step 20 --save-dir log/resnet50-xent-market1501 --gpu-devices 0

Run any of the training scripts with -h (e.g. python train_img_model_xent.py -h) for details on the arguments.

Results

Image person reid

Market1501

Model	Size (M)	Loss	Rank-1/5/10 (%)	mAP (%)	Model weights	Published Rank	Published mAP
DenseNet121	7.72	xent	86.5/93.6/95.7	67.8	download	-	-
DenseNet121	7.72	xent+htri	89.5/96.3/97.5	72.6	download	-	-
ResNet50	25.05	xent	85.4/94.1/95.9	68.8	download	87.3/-/-	67.6
ResNet50	25.05	xent+htri	87.5/95.3/97.3	72.3	download	-	-
ResNet50M	30.01	xent	89.0/95.5/97.3	75.0	download	89.9/-/-	75.6
ResNet50M	30.01	xent+htri	90.4/96.7/98.0	76.6	download	-	-

Video person reid

MARS

Model	Size (M)	Loss	Rank-1/5/10 (%)	mAP (%)	Model weights
ResNet50	24.79	xent	74.5/88.8/91.8	64.0	download
ResNet50	24.79	xent+htri	80.8/92.1/94.3	74.0	download
ResNet50M	29.63	xent	77.8/89.8/92.8	67.5	download
ResNet50M	29.63	xent+htri	82.3/93.8/95.3	75.4	download
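The Rank-k and mAP columns in the tables above are standard retrieval metrics. As a hedged illustration (not the code in eval_metrics.py), the sketch below computes both from per-query match lists. It assumes that each query's gallery has already been filtered and sorted by ascending distance; the function name is hypothetical.

```python
def rank_k_and_map(ranked_matches, ks=(1, 5, 10)):
    """Compute Rank-k accuracy and mean average precision.

    ranked_matches: one list of booleans per query, ordered from the
    closest gallery item to the farthest; True marks a correct identity.
    """
    hits_at_k = {k: 0 for k in ks}
    average_precisions = []
    for matches in ranked_matches:
        if not any(matches):
            continue  # query has no correct gallery item; skip it
        first_hit = matches.index(True)
        for k in ks:
            if first_hit < k:
                hits_at_k[k] += 1
        # Average precision: mean of precision values at each correct match.
        hits, precisions = 0, []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    num_valid = len(average_precisions)
    if num_valid == 0:
        raise ValueError("no query has a matching gallery item")
    rank_k = {k: hits_at_k[k] / num_valid for k in ks}
    return rank_k, sum(average_precisions) / num_valid
```

Rank-k is the fraction of queries whose first correct match appears within the top k results, while mAP also rewards ranking all correct matches early.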

Test

Say you have downloaded the ResNet50 model trained with xent on market1501 and saved it to saved-models/resnet50_xent_market1501.pth.tar (create the directory for model weights with mkdir saved-models/). Then run the following command to test:

python train_img_model_xent.py -d market1501 -a resnet50 --evaluate --resume saved-models/resnet50_xent_market1501.pth.tar --save-dir log/resnet50-xent-market1501 --test-batch 32

Likewise, to test a video reid model, save a pretrained model under saved-models/, e.g. saved-models/resnet50_xent_mars.pth.tar, then run

python train_vid_model_xent.py -d mars -a resnet50 --evaluate --resume saved-models/resnet50_xent_mars.pth.tar --save-dir log/resnet50-xent-mars --test-batch 2

Note that --test-batch in video reid represents the number of tracklets. If this argument is set to 2 and 15 images are sampled per tracklet, each batch contains 2*15=30 images. Adjust this argument according to your GPU memory.
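The arithmetic above can be turned around to pick a --test-batch value for a given image budget. This is a small sketch with hypothetical helper names; image_budget stands for however many images you estimate fit on your GPU at once and is not an argument of the training scripts.

```python
def images_per_batch(test_batch, seq_len):
    """Number of images in one video batch: tracklets times images per tracklet."""
    return test_batch * seq_len


def max_test_batch(image_budget, seq_len):
    """Largest tracklet count whose images fit in image_budget (at least 1)."""
    return max(1, image_budget // seq_len)


if __name__ == "__main__":
    print(images_per_batch(2, 15))  # the example from the text
    print(max_test_batch(64, 15))
```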

References

[1] He et al. Deep Residual Learning for Image Recognition. CVPR 2016.
[2] Yu et al. The Devil is in the Middle: Exploiting Mid-level Representations for Cross-Domain Instance Matching. arXiv:1711.08106.
[3] Huang et al. Densely Connected Convolutional Networks. CVPR 2017.
[4] Hermans et al. In Defense of the Triplet Loss for Person Re-Identification. arXiv:1703.07737.
[5] Szegedy et al. Rethinking the Inception Architecture for Computer Vision. CVPR 2016.
[6] Kingma and Ba. Adam: A Method for Stochastic Optimization. ICLR 2015.
[7] Zheng et al. Scalable Person Re-identification: A Benchmark. ICCV 2015.
[8] Zheng et al. MARS: A Video Benchmark for Large-Scale Person Re-identification. ECCV 2016.