How-to
============
.. contents::
   :local:

Find model keys
-----------------
Keys are listed under the *Public keys* section within each model class in :ref:`torchreid_models`.
Show available models
----------------------

.. code-block:: python

    import torchreid
    torchreid.models.show_avai_models()

Change the training sampler
-----------------------------
The default ``train_sampler`` is ``RandomSampler``. To use a different sampler, pass its name to ``train_sampler``, e.g. ``train_sampler='RandomIdentitySampler'`` for triplet loss.
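For triplet loss, each batch must contain several images per identity, which is exactly what ``RandomIdentitySampler`` guarantees. Below is a minimal, hypothetical re-implementation of that batching property for illustration only (in practice you would use torchreid's built-in sampler; the function name and signature here are invented):

.. code-block:: python

    import random
    from collections import defaultdict

    def random_identity_batches(labels, num_instances=4, batch_size=32):
        """Build batches holding ``num_instances`` images per identity.

        Illustrative sketch of the identity-balanced batching that
        triplet loss relies on; not torchreid's actual implementation.
        """
        assert batch_size % num_instances == 0
        # group sample indices by person id
        index_dic = defaultdict(list)
        for idx, pid in enumerate(labels):
            index_dic[pid].append(idx)
        # keep only identities with enough images
        pids = [pid for pid in index_dic if len(index_dic[pid]) >= num_instances]
        random.shuffle(pids)
        batches, batch = [], []
        for pid in pids:
            batch.extend(random.sample(index_dic[pid], num_instances))
            if len(batch) == batch_size:
                batches.append(batch)
                batch = []
        return batches  # incomplete trailing batch is dropped

Every returned batch then contains ``batch_size / num_instances`` distinct identities with exactly ``num_instances`` images each.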
Choose an optimizer/lr_scheduler
----------------------------------
Please refer to the source code of ``build_optimizer``/``build_lr_scheduler`` in :ref:`torchreid_optim` for details.
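For example, ``build_optimizer(model, optim='adam', lr=0.0003)`` followed by ``build_lr_scheduler(optimizer, lr_scheduler='single_step', stepsize=20)`` is a common setup. A single-step schedule simply multiplies the learning rate by a decay factor (``gamma``) every ``stepsize`` epochs; the sketch below is an illustrative helper showing the resulting schedule, not a torchreid function:

.. code-block:: python

    def single_step_lr(base_lr, gamma, stepsize, epoch):
        """Learning rate at a given (zero-based) epoch under a
        single-step schedule: decay by ``gamma`` every ``stepsize``
        epochs. Illustrative only."""
        return base_lr * gamma ** (epoch // stepsize)

So with ``base_lr=0.0003``, ``gamma=0.1`` and ``stepsize=20``, the learning rate drops to 3e-5 at epoch 20 and to 3e-6 at epoch 40.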
Resume training
----------------
Suppose the checkpoint is saved at ``log/resnet50/model.pth.tar-30``; you can resume training with

.. code-block:: python

    start_epoch = torchreid.utils.resume_from_checkpoint(
        'log/resnet50/model.pth.tar-30',
        model,
        optimizer
    )

    engine.run(
        save_dir='log/resnet50',
        max_epoch=60,
        start_epoch=start_epoch
    )

Combine multiple datasets
---------------------------
Easy. Just give whatever datasets (keys) you want to the ``sources`` argument when instantiating a data manager. For example,

.. code-block:: python

    datamanager = torchreid.data.ImageDataManager(
        root='reid-data',
        sources=['market1501', 'dukemtmcreid', 'cuhk03', 'msmt17'],
        height=256,
        width=128,
        batch_size=32
    )

In this example, the target datasets are also Market1501, DukeMTMC-reID, CUHK03 and MSMT17, because ``targets`` defaults to ``sources`` when left unspecified. Please refer to ``Engine.test()`` in :ref:`torchreid_engine` for details regarding how evaluation is performed.
Do cross-dataset evaluation
-----------------------------
Easy. Just give whatever datasets (keys) you want to the argument ``targets``, like

.. code-block:: python

    datamanager = torchreid.data.ImageDataManager(
        root='reid-data',
        sources='market1501',
        targets='dukemtmcreid', # or targets='cuhk03' or targets=['dukemtmcreid', 'cuhk03']
        height=256,
        width=128,
        batch_size=32
    )

Combine train, query and gallery
---------------------------------
This can be easily done by setting ``combineall=True`` when instantiating a data manager. Below is an example of using Market1501,

.. code-block:: python

    datamanager = torchreid.data.ImageDataManager(
        root='reid-data',
        sources='market1501',
        height=256,
        width=128,
        batch_size=32,
        market1501_500k=False,
        combineall=True # <- this combines train, query and gallery
    )

More specifically, with ``combineall=False``, you would get

.. code-block:: none

    => Loaded Market1501
    ----------------------------------------
    subset   | # ids | # images | # cameras
    ----------------------------------------
    train    |   751 |    12936 |         6
    query    |   750 |     3368 |         6
    gallery  |   751 |    15913 |         6
    ----------------------------------------

With ``combineall=True``, you would get

.. code-block:: none

    => Loaded Market1501
    ----------------------------------------
    subset   | # ids | # images | # cameras
    ----------------------------------------
    train    |  1501 |    29419 |         6
    query    |   750 |     3368 |         6
    gallery  |   751 |    15913 |         6
    ----------------------------------------
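The jump from 751 to 1501 training identities is the 750 query/gallery identities being folded into the training set under fresh labels. The sketch below illustrates that relabeling; it is a simplification (the real torchreid implementation also filters out junk identities such as distractors), and the function name is invented for illustration:

.. code-block:: python

    def combine_all(train, query, gallery):
        """Fold query/gallery identities into the training set,
        relabeling them to continue after the training identities.
        Each sample is a (img_path, pid, camid) tuple. Illustrative
        sketch only."""
        combined = list(train)
        num_train_pids = len({pid for _, pid, _ in train})
        extra_pids = sorted({pid for _, pid, _ in query + gallery})
        pid2label = {pid: i + num_train_pids for i, pid in enumerate(extra_pids)}
        for img_path, pid, camid in query + gallery:
            combined.append((img_path, pid2label[pid], camid))
        return combined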
Optimize layers with different learning rates
-----------------------------------------------
A common practice for fine-tuning pretrained models is to use a smaller learning rate for base layers and a larger learning rate for randomly initialized layers (referred to as ``new_layers``). ``torchreid.optim.optimizer`` implements this feature. All you need to do is set ``staged_lr=True`` and give the names of the ``new_layers``, such as "classifier".
Below is an example of setting different learning rates for base layers and new layers in ResNet50,

.. code-block:: python

    # New layer "classifier" has a learning rate of 0.01
    # The base layers have a learning rate of 0.001
    optimizer = torchreid.optim.build_optimizer(
        model,
        optim='sgd',
        lr=0.01,
        staged_lr=True,
        new_layers='classifier',
        base_lr_mult=0.1
    )

Please refer to :ref:`torchreid_optim` for more details.
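Conceptually, ``staged_lr`` amounts to splitting the model parameters into two optimizer parameter groups. The following is a hypothetical sketch of that grouping for illustration; the helper name and matching logic are invented and not torchreid's actual code:

.. code-block:: python

    def staged_lr_param_groups(param_names, new_layers, lr, base_lr_mult):
        """Split parameter names into two groups: base layers get
        ``lr * base_lr_mult``, new layers get the full ``lr``.
        Illustrative sketch only."""
        if isinstance(new_layers, str):
            new_layers = [new_layers]
        base, new = [], []
        for name in param_names:
            # a parameter counts as "new" if its top-level module matches
            (new if name.split('.')[0] in new_layers else base).append(name)
        return [
            {'params': base, 'lr': lr * base_lr_mult},
            {'params': new, 'lr': lr},
        ]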
Do two-stepped transfer learning
-------------------------------------
To prevent the pretrained layers from being damaged by harmful gradients back-propagated from the randomly initialized layers, one can adopt the *two-stepped transfer learning strategy* presented in `Deep Transfer Learning for Person Re-identification <https://arxiv.org/abs/1611.05244>`_. The basic idea is to pretrain the randomly initialized layers for a few epochs, keeping the base layers frozen, before training all layers end-to-end.
This has been implemented in ``Engine.run()`` (see :ref:`torchreid_engine`). The arguments that enable this feature are ``fixbase_epoch`` and ``open_layers``: ``fixbase_epoch`` denotes the number of epochs during which the base layers are kept frozen, while ``open_layers`` specifies which layers remain open for training during that period. Note that ``fixbase_epoch`` is not counted towards ``max_epoch``.
For example, to pretrain the classification layer named "classifier" in ResNet50 for 5 epochs before training all layers, you can do

.. code-block:: python

    engine.run(
        save_dir='log/resnet50',
        max_epoch=60,
        eval_freq=10,
        print_freq=10,
        test_only=False,
        fixbase_epoch=5,
        open_layers='classifier'
    )
    # or open_layers=['fc', 'classifier'] if there is another fc layer that
    # is randomly initialized

Test a trained model
----------------------
You can load a trained model using :code:`torchreid.utils.load_pretrained_weights(model, weight_path)` and set ``test_only=True`` in ``engine.run()``.
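Under the hood, loading pretrained weights amounts to copying over the checkpoint entries whose key and shape match the model and discarding the rest. The sketch below is a simplified, list-based stand-in for illustration (the real ``load_pretrained_weights`` operates on torch tensors and, to our understanding, also strips the ``module.`` prefix added by ``DataParallel``):

.. code-block:: python

    def load_matching_weights(model_state, checkpoint_state):
        """Copy a checkpoint entry into the model state when both the
        key and the parameter size match; return the discarded keys.
        Illustrative stand-in, not torchreid's actual function."""
        discarded = []
        for key, value in checkpoint_state.items():
            # strip the DataParallel prefix, if any
            name = key[7:] if key.startswith('module.') else key
            if name in model_state and len(value) == len(model_state[name]):
                model_state[name] = value
            else:
                discarded.append(key)
        return discarded

This is also why a model trained on one dataset can be tested on another even when the classifier dimensions differ: the mismatched classifier weights are simply discarded.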
Visualize ranked results
-------------------------
Ranked images can be visualized by setting ``visrank=True`` in ``engine.run()``. ``visrank_topk`` determines the number of top-ranked images to visualize (default: ``visrank_topk=20``). Typically, ``visrank`` is used in test mode, i.e. with ``test_only=True`` in ``engine.run()``. The images are saved under ``osp.join(save_dir, 'visrank-'+str(epoch+1), dataset_name)``.
.. image:: figures/ranked_results.jpg
   :width: 600px
   :align: center
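For reference, the save directory follows the path pattern quoted above; the tiny helper below is illustrative only, not a torchreid function:

.. code-block:: python

    import os.path as osp

    def visrank_save_dir(save_dir, epoch, dataset_name):
        # epoch is zero-based, hence the +1 in the folder name
        return osp.join(save_dir, 'visrank-' + str(epoch + 1), dataset_name)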
Use your own dataset
-----------------------
1. Write your own dataset class. Below is a template for an image dataset; it applies equally to a video dataset class, for which you simply change ``ImageDataset`` to ``VideoDataset``.

.. code-block:: python

    from __future__ import absolute_import
    from __future__ import print_function
    from __future__ import division

    import os.path as osp

    from torchreid.data import ImageDataset


    class NewDataset(ImageDataset):
        dataset_dir = 'new_dataset'

        def __init__(self, root='', **kwargs):
            self.root = osp.abspath(osp.expanduser(root))
            self.dataset_dir = osp.join(self.root, self.dataset_dir)

            # All you need to do here is to generate three lists,
            # which are train, query and gallery.
            # Each list contains tuples of (img_path, pid, camid).
            train = ...
            query = ...
            gallery = ...

            super(NewDataset, self).__init__(train, query, gallery, **kwargs)

2019-03-24 07:09:39 +08:00
2. Register your dataset.

.. code-block:: python

    import torchreid

    torchreid.data.register_image_dataset('new_dataset', NewDataset)

3. Initialize a data manager with your dataset.

.. code-block:: python

    # use your own dataset only
    datamanager = torchreid.data.ImageDataManager(
        root='reid-data',
        sources='new_dataset'
    )

    # combine with other datasets
    datamanager = torchreid.data.ImageDataManager(
        root='reid-data',
        sources=['new_dataset', 'dukemtmcreid']
    )

    # cross-dataset evaluation
    datamanager = torchreid.data.ImageDataManager(
        root='reid-data',
        sources=['new_dataset', 'dukemtmcreid'],
        targets='market1501' # or targets=['market1501', 'cuhk03']
    )

Design your own Engine
------------------------
A new Engine should be designed if you have your own loss function. The base class ``torchreid.engine.Engine`` implements generic methods that you can inherit to avoid re-writing them. Please refer to the source code for more details. We suggest studying how ``ImageSoftmaxEngine`` and ``ImageTripletEngine`` are constructed (likewise ``VideoSoftmaxEngine`` and ``VideoTripletEngine``). Often all you need to implement is a ``train()`` method.
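The division of labor can be sketched in a few lines: the base class owns the epoch loop, and your subclass supplies ``train()``. Below is a pure-Python illustration of this pattern only; the real ``Engine`` also handles data loading, evaluation, checkpointing and much more:

.. code-block:: python

    class Engine:
        """Minimal stand-in for the base engine: run() drives the
        epoch loop and delegates each epoch to train(), which
        subclasses override. Illustrative sketch only."""

        def run(self, max_epoch=1):
            epoch_losses = []
            for epoch in range(max_epoch):
                epoch_losses.append(self.train(epoch))
            return epoch_losses

        def train(self, epoch):
            raise NotImplementedError


    class MyEngine(Engine):
        """Custom engine computing its own loss. A real train() would
        iterate over the train loader and back-propagate."""

        def __init__(self, loss_fn):
            self.loss_fn = loss_fn

        def train(self, epoch):
            return self.loss_fn(epoch)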