mirror of https://github.com/PyRetri/PyRetri.git
322 lines
9.0 KiB
Markdown
322 lines
9.0 KiB
Markdown
# Getting started
|
||
|
||
This page provides basic tutorials about the usage of PyRetri. For installation instructions and dataset preparation, please see [INSTALL.md] (INSTALL.md).
|
||
|
||
## Make Data Json
|
||
|
||
After the gallery set and query set are separated, we package the information of each subset in pickle format for further process. We use different types to package different structured folders: `general`, `oxford` and `reid`.
|
||
|
||
The general object recognition dataset collects images with the same label in one directory and the folder structure should be like this:
|
||
|
||
```shell
|
||
# type: general
|
||
general_recognition
|
||
├── class A
|
||
│ ├── XXX.jpg
|
||
│ └── ···
|
||
├── class B
|
||
│ ├── XXX.jpg
|
||
│ └── ···
|
||
└── ···
|
||
```
|
||
|
||
Oxford5k is a typical dataset in image retrieval field and the folder structure is as follows:
|
||
|
||
```shell
|
||
# type: oxford
|
||
oxford
|
||
├── gt
|
||
│ ├── XXX.txt
|
||
│ └── ···
|
||
└── images
|
||
├── XXX.jpg
|
||
└── ···
|
||
|
||
```
|
||
|
||
The person re-identification dataset have already split the query set and gallery set, its folder structure should be like this:
|
||
|
||
```shell
|
||
# type: reid
|
||
person_re_identification
|
||
├── bounding_box_test
|
||
│ ├── XXX.jpg
|
||
│ └── ···
|
||
├── query
|
||
│ ├── XXX.jpg
|
||
│ └── ···
|
||
└── ···
|
||
```
|
||
|
||
Choosing the mode carefully, you can generate data jsons by:
|
||
|
||
```shell
|
||
python3 main/make_data_json.py [-d ${dataset}] [-sp ${save_path}] [-t ${type}] [-gt ${ground_truth}]
|
||
```
|
||
|
||
Auguments:
|
||
|
||
- `data`: Path of the dataset for generating data json file.
|
||
- `save_path`: Path for saving the output file.
|
||
- `type`: Type of the dataset collecting images. For dataset collecting images with the same label in one directory, we use `general`. For oxford/paris dataset, we use `oxford`. For re-id dataset, we use `reid`.
|
||
- `ground_truth`: Path of the gt information, which is necessary for generating data json file of oxford/paris dataset.
|
||
|
||
Examples:
|
||
|
||
```shell
|
||
# for dataset collecting images with the same label in one directory
|
||
python3 main/make_data_json.py -d /data/caltech101/gallery/ -sp data_jsons/caltech_gallery.json -t general
|
||
|
||
python3 main/make_data_json.py -d /data/caltech101/query/ -sp data_jsons/caltech_query.json -t general
|
||
|
||
# for oxford/paris dataset
|
||
python3 main/make_data_json.py -d /data/cbir/oxford/gallery/ -sp data_jsons/oxford_gallery.json -t oxford -gt /data/cbir/oxford/gt/
|
||
|
||
python3 main/make_data_json.py -d /data/cbir/oxford/query/ -sp data_jsons/oxford_query.json -t oxford -gt /data/cbir/oxford/gt/
|
||
|
||
# for re-id dataset
|
||
python3 main/make_data_json.py -d /data/market1501/bounding_box_test/ -sp data_jsons/market_gallery.json -t reid
|
||
|
||
python3 main/make_data_json.py -d /data/market1501/query/ -sp data_jsons/market_query.json -t reid
|
||
```
|
||
|
||
Note: Oxford/Paris dataset contains the ground truth of each query image in a txt file, so remember to give the path of gt file when generating data json file of Oxford/Paris.
|
||
|
||
## Extract
|
||
|
||
All outputs (features and labels) will be saved to the save directory in pickle format.
|
||
|
||
Extract feature for each data json file by:
|
||
|
||
```shell
|
||
python3 main/extract_feature.py [-dj ${data_json}] [-sp ${save_path}] [-cfg ${config_file}] [-si ${save_interval}]
|
||
```
|
||
|
||
Arguments:
|
||
|
||
- `data_json`: Path of the data json file to be extrated.
|
||
- `save_path`: Path for saving the output features in pickle format.
|
||
|
||
- `config_file`: Path of the configuration file in yaml format.
|
||
- `save_interval`: Optional. It is the number of features saved in one part file, which is set to 5000 by default.
|
||
|
||
```shell
|
||
# extract features of gallert set and query set
|
||
python3 main/extract_feature.py -dj data_jsons/caltech_gallery.json -sp /data/features/caltech/gallery/ -cfg configs/caltech.yaml
|
||
|
||
python3 main/extract_feature.py -dj data_jsons/caltech_query.json -sp /data/features/caltech/query/ -cfg configs/caltech.yaml
|
||
```
|
||
|
||
## Index
|
||
|
||
The path of query set features and gallery set features is specified in the config file.
|
||
|
||
Index the query set features by:
|
||
|
||
```shell
|
||
python3 main/index.py [-cfg ${config_file}]
|
||
```
|
||
|
||
Arguments:
|
||
|
||
- `config_file`: Path of the configuration file in yaml format.
|
||
|
||
Examples:
|
||
|
||
```shell
|
||
python3 main/index.py -cfg configs/caltech.yaml
|
||
```
|
||
|
||
## Single Index
|
||
|
||
For visulization results and wrong case analysis, we provide the script for single query image and you can visualize or save the retrieval results easily.
|
||
|
||
Use this command to single index:
|
||
|
||
```shell
|
||
python3 main/single_index.py [-cfg ${config_file}]
|
||
```
|
||
|
||
Arguments:
|
||
|
||
- `config_file`: Path of the configuration file in yaml format.
|
||
|
||
Examples:
|
||
|
||
```shell
|
||
python3 main/single_index.py -cfg configs/caltech.yaml
|
||
```
|
||
|
||
Please see single_index.py for more details.
|
||
|
||
## Add Your Own Module
|
||
|
||
We basically categorize retrieval process into 4 components.
|
||
|
||
- model: the pre-trained model for feature extraction.
|
||
- extract: assign which layer to output, including splitter functions and aggregation methods.
|
||
- index: index features, including dimension process, feature enhance, distance metric and re-rank.
|
||
- evaluate: evaluate retrieval results, outputting recall and mAP results.
|
||
|
||
Here we show how to add your own model to extract features.
|
||
|
||
1. Create your model file `pyretri/models/backbone/backbone_impl/reid_baseline.py`.
|
||
|
||
```shell
|
||
import torch.nn as nn
|
||
|
||
from ..backbone_base import BackboneBase
|
||
from ...registry import BACKBONES
|
||
|
||
@BACKBONES.register
|
||
class ft_net(BackboneBase):
|
||
def __init__(self):
|
||
pass
|
||
|
||
def forward(self, x):
|
||
pass
|
||
```
|
||
|
||
or
|
||
|
||
```shell
|
||
import torch.nn as nn
|
||
|
||
from ..backbone_base import BackboneBase
|
||
from ...registry import BACKBONES
|
||
|
||
class FT_NET(BackboneBase):
|
||
def __init__(self):
|
||
pass
|
||
|
||
def forward(self, x):
|
||
pass
|
||
|
||
@BACKBONES.register
|
||
def ft_net():
|
||
model = FT_NET()
|
||
return model
|
||
```
|
||
|
||
2. Import the module in `pyretri/models/backbone/__init__.py`.
|
||
|
||
```shell
|
||
from .backbone_impl.reid_baseline import ft_net
|
||
|
||
__all__ = [
|
||
'ft_net',
|
||
]
|
||
```
|
||
|
||
3. Use it in your config file.
|
||
|
||
```shell
|
||
model:
|
||
name: "ft_net"
|
||
ft_net:
|
||
load_checkpoint: "/data/my_model_zoo/res50_market1501.pth"
|
||
```
|
||
|
||
## Pipeline Combinations Search
|
||
|
||
Since tricks used in each stage have a signicant impact on retrieval performance, we present the pipeline combinations search scripts to help users to find possible combinations of approaches with various hyper-parameters.
|
||
|
||
### Get into the combinations search scripts
|
||
|
||
```shell
|
||
cd search/
|
||
```
|
||
|
||
### Define Search Space
|
||
|
||
We decompose the search space into three sub search spaces: pre_process, extract and index, each of which corresponds to a specified file. Search space is defined by adding methods with hyper-parameters to a specified dict. You can add a search operator as follows:
|
||
|
||
```shell
|
||
pre_processes.add(
|
||
"PadResize224",
|
||
{
|
||
"batch_size": 32,
|
||
"folder": {
|
||
"name": "Folder"
|
||
},
|
||
"collate_fn": {
|
||
"name": "CollateFn"
|
||
},
|
||
"transformers": {
|
||
"names": ["PadResize", "ToTensor", "Normalize"],
|
||
"PadResize": {
|
||
"size": 224,
|
||
"padding_v": [124, 116, 104]
|
||
},
|
||
"Normalize": {
|
||
"mean": [0.485, 0.456, 0.406],
|
||
"std": [0.229, 0.224, 0.225]
|
||
}
|
||
}
|
||
}
|
||
)
|
||
```
|
||
|
||
By doing this, a pre_process operator named "PadResize224" is added to the data_process sub search space and will be searched in the following process.
|
||
|
||
### Search
|
||
|
||
Similar to the image retrieval pipeline, combinations search includes two stages: search for feature extraction and search for indexing.
|
||
|
||
#### search for feature extraction
|
||
|
||
Search for the feature extraction combinations by:
|
||
|
||
```shell
|
||
python3 search_extract.py [-sp ${save_path}] [-sp ${search_modules}]
|
||
```
|
||
|
||
Arguments:
|
||
|
||
- `save_path`: path for saving the output features in pickle format.
|
||
- `search_modules`: name of the folder containing search space files.
|
||
|
||
Examples:
|
||
|
||
```shell
|
||
python3 search_extract.py -sp /data/features/gap_gmp_gem_crow_spoc/ -sm search_modules
|
||
```
|
||
|
||
#### search for indexing
|
||
|
||
Search for the indexing combinations by:
|
||
|
||
```shell
|
||
python3 search_index.py [-fd ${fea_dir}] [-sm ${search_modules}] [-sp ${save_path}]
|
||
```
|
||
|
||
Arguments:
|
||
|
||
- `fea_dir`: path of the output features extracted by the feature extraction combinations search.
|
||
- `search_modules`: name of the folder containing search space files.
|
||
- `save_path`: path for saving the retrieval results of each combination.
|
||
|
||
Examples:
|
||
|
||
```shell
|
||
python3 search_index.py -fd /data/features/gap_gmp_gem_crow_spoc/ -sm search_modules -sp /data/features/gap_gmp_gem_crow_spoc_result.json
|
||
```
|
||
|
||
#### show search results
|
||
|
||
We provide two ways to show the search results. One is save all the search results in a csv format file, which can be used for further analyses. The other is showing the search results according to the given keywords. You can define the keywords as follows:
|
||
|
||
```sh
|
||
keywords = {
|
||
'data_name': ['market'],
|
||
'pre_process_name': list(),
|
||
'model_name': list(),
|
||
'feature_map_name': list(),
|
||
'aggregator_name': list(),
|
||
'post_process_name': ['no_fea_process', 'l2_normalize', 'pca_whiten', 'pca_wo_whiten'],
|
||
}
|
||
```
|
||
|
||
See show_search_results.py for more details.
|
||
|