deep-person-reid

This repo contains PyTorch implementations of deep person re-identification models. It is developed for academic research.

We support

  • multi-GPU training.
  • both image-based and video-based reid.
  • unified interface for different reid models.
  • easy dataset preparation.
  • end-to-end training and evaluation.
  • standard dataset splits used by most papers.
  • download of trained models.
  • fast cython-based evaluation.

Get Started

  1. cd to the folder where you want to download this repo.
  2. Run git clone https://github.com/KaiyangZhou/deep-person-reid.
  3. Install dependencies by pip install -r requirements.txt.
  4. (Optional) To accelerate evaluation (10x faster), you can use the Cython-based evaluation code (developed by luzai). First cd to eval_lib, then run make or python setup.py build_ext -i. After that, run python test_cython_eval.py to check that the extension was built successfully. The full command sequence is shown below.
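
The same steps as a command sequence (all commands taken from step 4; run from the repo root):

cd eval_lib
make                        # or: python setup.py build_ext -i
python test_cython_eval.py  # verify the build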

Instructions on how to prepare datasets can be found in DATASETS.md.

Models

Currently, we have the following models:

  • models/resnet.py: ResNet50 [1], ResNet101 [1], ResNet50M [2].
  • models/resnext.py: ResNeXt101 [26].
  • models/seresnet.py: SEResNet50 [25], SEResNet101 [25], SEResNeXt50 [25], SEResNeXt101 [25].
  • models/densenet.py: DenseNet121 [3].
  • models/mudeep.py: MuDeep [10].
  • models/hacnn.py: HACNN [15].
  • models/squeezenet.py: SqueezeNet [18].
  • models/mobilenetv2.py: MobileNetV2 [19].
  • models/shufflenet.py: ShuffleNet [20].
  • models/xception.py: Xception [21].
  • models/inceptionv4.py: InceptionV4 [24].
  • models/inceptionresnetv2.py: InceptionResNetV2 [24].
  • models/dpn.py: DPN92 [27].

See models/__init__.py for details regarding what keys to use to call these models.
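For illustration, a minimal sketch of how such a model factory is typically called. The function name init_model and its arguments (name, num_classes, loss) are assumptions here, not confirmed by the source, so check models/__init__.py for the actual interface:

import models

# Hypothetical usage -- the factory name/signature is an assumption;
# see models/__init__.py for the real function and model keys.
# 751 is the number of training identities in Market-1501.
model = models.init_model(name='resnet50', num_classes=751, loss={'xent'})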

Benchmarks can be found in BENCHMARK.md.

Train

Training code is implemented mainly in

  • train_imgreid_xent.py: train an image reid model with cross entropy loss.
  • train_imgreid_xent_htri.py: train an image reid model with a combination of cross entropy loss and hard triplet loss.
  • train_vidreid_xent.py: train a video reid model with cross entropy loss.
  • train_vidreid_xent_htri.py: train a video reid model with a combination of cross entropy loss and hard triplet loss.

For example, to train an image reid model using ResNet50 and cross entropy loss, run

python train_imgreid_xent.py -d market1501 -a resnet50 --optim adam --lr 0.0003 --max-epoch 60 --stepsize 20 40 --train-batch 32 --test-batch 32 --eval-step 20 --save-dir log/resnet50-xent-market1501 --gpu-devices 0

To use multiple GPUs, you can set --gpu-devices 0,1,2,3.
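
For instance, the same training run as above on four GPUs (only --gpu-devices changes):

python train_imgreid_xent.py -d market1501 -a resnet50 --optim adam --lr 0.0003 --max-epoch 60 --stepsize 20 40 --train-batch 32 --test-batch 32 --eval-step 20 --save-dir log/resnet50-xent-market1501 --gpu-devices 0,1,2,3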

Run any of the training scripts with -h (e.g. python train_imgreid_xent.py -h) for details on the available arguments.

Test

Say you have downloaded a ResNet50 model trained with xent on market1501 and saved it as saved-models/resnet50_xent_market1501.pth.tar (create the directory first with mkdir saved-models/). Then run the following command to test:

python train_imgreid_xent.py -d market1501 -a resnet50 --evaluate --resume saved-models/resnet50_xent_market1501.pth.tar --save-dir log/resnet50-xent-market1501 --test-batch 100 --gpu-devices 0

Likewise, to test a video reid model, you should have a pretrained model saved under saved-models/, e.g. saved-models/resnet50_xent_mars.pth.tar, then run

python train_vidreid_xent.py -d mars -a resnet50 --evaluate --resume saved-models/resnet50_xent_mars.pth.tar --save-dir log/resnet50-xent-mars --test-batch 2 --gpu-devices 0

Note that --test-batch in video reid represents the number of tracklets. If you set this argument to 2 and sample 15 images per tracklet, the resulting number of images per batch is 2*15=30. Adjust this argument according to your GPU memory.

Q&A

  1. How do I set different learning rates for different components of my model?

A: Instead of passing model.parameters() to the optimizer, you can pass an iterable of dicts, as described in PyTorch's documentation on per-parameter options. See the example below:

# Instead of passing all parameters with a single learning rate:
# optimizer = torch.optim.Adam(model.parameters(), lr=args.lr, weight_decay=args.weight_decay)
# define one parameter group per component:
param_groups = [
  {'params': model.base.parameters(), 'lr': 0},  # model.base is frozen (lr=0)
  {'params': model.classifier.parameters()},     # uses the default learning rate, i.e. args.lr
]
optimizer = torch.optim.Adam(param_groups, lr=args.lr, weight_decay=args.weight_decay)
# This example assumes a model with two components (base and classifier);
# adapt the parameter groups to your own model.

Of course, you can pass only model.classifier.parameters() to the optimizer if you just want to train the classifier. In this case, setting requires_grad = False on the base model's parameters is more efficient, since no gradients are then computed for the frozen part.
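A minimal sketch of that alternative, continuing the snippet above (plain PyTorch; still assumes a model with base and classifier components):

# Freeze the backbone: no gradients are computed or stored for these parameters.
for p in model.base.parameters():
    p.requires_grad = False

# Optimize only the classifier parameters.
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=args.lr, weight_decay=args.weight_decay)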

References

[1] He et al. Deep Residual Learning for Image Recognition. CVPR 2016.
[2] Yu et al. The Devil is in the Middle: Exploiting Mid-level Representations for Cross-Domain Instance Matching. arXiv:1711.08106.
[3] Huang et al. Densely Connected Convolutional Networks. CVPR 2017.
[4] Hermans et al. In Defense of the Triplet Loss for Person Re-Identification. arXiv:1703.07737.
[5] Szegedy et al. Rethinking the Inception Architecture for Computer Vision. CVPR 2016.
[6] Kingma and Ba. Adam: A Method for Stochastic Optimization. ICLR 2015.
[7] Zheng et al. Scalable Person Re-identification: A Benchmark. ICCV 2015.
[8] Zheng et al. MARS: A Video Benchmark for Large-Scale Person Re-identification. ECCV 2016.
[9] Wen et al. A Discriminative Feature Learning Approach for Deep Face Recognition. ECCV 2016.
[10] Qian et al. Multi-scale Deep Learning Architectures for Person Re-identification. ICCV 2017.
[11] Wang et al. Person Re-Identification by Video Ranking. ECCV 2014.
[12] Hirzer et al. Person Re-Identification by Descriptive and Discriminative Classification. SCIA 2011.
[13] Li et al. DeepReID: Deep Filter Pairing Neural Network for Person Re-identification. CVPR 2014.
[14] Zhong et al. Re-ranking Person Re-identification with k-reciprocal Encoding. CVPR 2017.
[15] Li et al. Harmonious Attention Network for Person Re-identification. CVPR 2018.
[16] Ristani et al. Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking. ECCVW 2016.
[17] Zheng et al. Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro. ICCV 2017.
[18] Iandola et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv:1602.07360.
[19] Sandler et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks. CVPR 2018.
[20] Zhang et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. CVPR 2018.
[21] Chollet. Xception: Deep Learning with Depthwise Separable Convolutions. CVPR 2017.
[22] Wei et al. Person Transfer GAN to Bridge Domain Gap for Person Re-Identification. CVPR 2018.
[23] Wu et al. Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning. CVPR 2018.
[24] Szegedy et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. ICLRW 2016.
[25] Hu et al. Squeeze-and-Excitation Networks. CVPR 2018.
[26] Xie et al. Aggregated Residual Transformations for Deep Neural Networks. CVPR 2017.
[27] Chen et al. Dual Path Networks. NIPS 2017.
[28] Gray et al. Evaluating appearance models for recognition, reacquisition, and tracking. PETS 2007.
[29] Loy et al. Multi-camera activity correlation analysis. CVPR 2009.
[30] Li et al. Human Reidentification with Transferred Metric Learning. ACCV 2012.
[31] Roth et al. Mahalanobis Distance Learning for Person Re-Identification. PR 2014.