# Getting started

This page provides basic tutorials about the usage of PyRetri. For installation instructions and dataset preparation, please see [INSTALL.md](../docs/INSTALL.md).

## Make Data Json

After the gallery set and query set are separated, we package the information of each sub-dataset in pickle format for further processing. We use different types to package differently structured folders: `general`, `oxford` and `reid`.

A general object recognition dataset collects images with the same label in one directory, and its folder structure should look like this:

```shell
# type: general
general_recognition
├── class A
│   ├── XXX.jpg
│   └── ···
├── class B
│   ├── XXX.jpg
│   └── ···
└── ···
```

Oxford5k is a typical dataset in the image retrieval field, and its folder structure is as follows:

```shell
# type: oxford
oxford
├── gt
│   ├── XXX.txt
│   └── ···
└── images
    ├── XXX.jpg
    └── ···
```

The person re-identification dataset has the query set and gallery set already split; its folder structure should look like this:

```shell
# type: reid
person_re_identification
├── bounding_box_test
│   ├── XXX.jpg
│   └── ···
├── query
│   ├── XXX.jpg
│   └── ···
└── ···
```

After choosing the appropriate type, you can generate the data json files by:

```shell
python3 main/make_data_json.py [-d ${dataset}] [-sp ${save_path}] [-t ${type}] [-gt ${ground_truth}]
```

Arguments:

- `data`: Path of the dataset for which to generate the data json file.
- `save_path`: Path for saving the output file.
- `type`: Type of the dataset. For a dataset collecting images with the same label in one directory, use `general`. For the Oxford dataset, use `oxford`. For a re-id dataset, use `reid`.
- `ground_truth`: Optional. Path of the ground-truth information, which is necessary for generating the data json file of the Oxford dataset.

Examples:

```shell
# for a dataset collecting images with the same label in one directory
python3 main/make_data_json.py -d /data/caltech101/gallery/ -sp data_jsons/caltech_gallery.json -t general

python3 main/make_data_json.py -d /data/caltech101/query/ -sp data_jsons/caltech_query.json -t general

# for the oxford dataset
python3 main/make_data_json.py -d /data/cbir/oxford/gallery/ -sp data_jsons/oxford_gallery.json -t oxford -gt /data/cbir/oxford/gt/

python3 main/make_data_json.py -d /data/cbir/oxford/query/ -sp data_jsons/oxford_query.json -t oxford -gt /data/cbir/oxford/gt/

# for a re-id dataset
python3 main/make_data_json.py -d /data/market1501/bounding_box_test/ -sp data_jsons/market_gallery.json -t reid

python3 main/make_data_json.py -d /data/market1501/query/ -sp data_jsons/market_query.json -t reid
```

Note: the Oxford dataset stores the ground truth of each query image in txt files, so remember to pass the path of the gt directory when generating its data json files.
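
For intuition, the `general` type conceptually just walks a class-per-folder tree and pickles a list of (path, label) records. Below is a minimal sketch of that idea; the function name `make_general_json` and the key names (`info_dicts`, `path`, `label`) are illustrative assumptions, not the script's actual output schema.

```python
import os
import pickle
import tempfile

def make_general_json(dataset_dir, save_path):
    # Walk one directory per class and record each image with its label.
    # Key names here are illustrative only, not PyRetri's actual schema.
    info_dicts = []
    for label in sorted(os.listdir(dataset_dir)):
        class_dir = os.path.join(dataset_dir, label)
        if not os.path.isdir(class_dir):
            continue
        for name in sorted(os.listdir(class_dir)):
            info_dicts.append({"path": os.path.join(class_dir, name), "label": label})
    with open(save_path, "wb") as f:
        pickle.dump({"info_dicts": info_dicts}, f)

# tiny demo on a throwaway class-per-folder structure
root = tempfile.mkdtemp()
for cls, imgs in [("class_A", ["a1.jpg", "a2.jpg"]), ("class_B", ["b1.jpg"])]:
    os.makedirs(os.path.join(root, cls))
    for img in imgs:
        open(os.path.join(root, cls, img), "w").close()

out = os.path.join(root, "gallery.json")
make_general_json(root, out)
with open(out, "rb") as f:
    data = pickle.load(f)
print(len(data["info_dicts"]))  # 3 records, labels taken from folder names
```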

## Extract

All outputs (features and labels) will be saved to the target directory in pickle format.

Extract features for each data json file by:

```shell
python3 main/extract_feature.py [-dj ${data_json}] [-sp ${save_path}] [-cfg ${config_file}] [-si ${save_interval}]
```

Arguments:

- `data_json`: Path of the data json file to be extracted.
- `save_path`: Path for saving the output features in pickle format.
- `config_file`: Path of the configuration file in yaml format.
- `save_interval`: Optional. The number of features saved in each part file, set to 5000 by default.

Examples:

```shell
# extract features of the gallery set and query set
python3 main/extract_feature.py -dj data_jsons/caltech_gallery.json -sp /data/features/caltech/gallery/ -cfg configs/caltech.yaml

python3 main/extract_feature.py -dj data_jsons/caltech_query.json -sp /data/features/caltech/query/ -cfg configs/caltech.yaml
```
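
The `save_interval` behaviour can be pictured as simple chunking: features are dumped to a new part file once every `save_interval` entries. A rough sketch, where the `part_N.pkl` naming is an assumption rather than PyRetri's actual convention:

```python
import os
import pickle
import tempfile

def save_in_parts(features, save_dir, save_interval=5000):
    # Dump the feature list into part files of at most `save_interval` entries.
    os.makedirs(save_dir, exist_ok=True)
    for part_id, start in enumerate(range(0, len(features), save_interval)):
        with open(os.path.join(save_dir, "part_{}.pkl".format(part_id)), "wb") as f:
            pickle.dump(features[start:start + save_interval], f)

save_dir = tempfile.mkdtemp()
save_in_parts([{"fea": [0.0], "label": i} for i in range(12)], save_dir, save_interval=5)
print(sorted(os.listdir(save_dir)))  # ['part_0.pkl', 'part_1.pkl', 'part_2.pkl']
```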

## Index

The paths of the query set features and gallery set features are specified in the config file.

After extracting the gallery and query set features, you can index the query set features by:

```shell
python3 main/index.py [-cfg ${config_file}]
```

Arguments:

- `config_file`: Path of the configuration file in yaml format.

Examples:

```shell
python3 main/index.py -cfg configs/caltech.yaml
```
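
At its core, indexing amounts to ranking gallery features by their similarity to each query feature. A toy illustration with cosine similarity; the actual distance metric, dimension processing and re-ranking steps are all chosen in the config file:

```python
import math

def cosine_rank(query, gallery):
    # Return gallery indices sorted by cosine similarity to the query, best first.
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))
    return sorted(range(len(gallery)), key=lambda i: cos(query, gallery[i]), reverse=True)

gallery = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
print(cosine_rank([0.9, 0.1], gallery))  # [0, 1, 2]
```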

## Single Image Index

For visualizing results and analyzing bad cases, we provide a script for a single query image, so you can easily visualize or save the retrieval results.

Run single-image indexing by:

```shell
python3 main/single_index.py [-cfg ${config_file}]
```

Arguments:

- `config_file`: Path of the configuration file in yaml format.

Examples:

```shell
python3 main/single_index.py -cfg configs/caltech.yaml
```

Please see [single_index.py](../main/single_index.py) for more details.

## Add Your Own Module

We basically categorize the retrieval process into four components:

- model: the pre-trained model for feature extraction.
- extract: assigns which layer to output, including splitter functions and aggregation methods.
- index: indexes features, including dimension processing, feature enhancement, distance metrics and re-ranking.
- evaluate: evaluates retrieval results, outputting recall and mAP results.

Here we show how to add your own model for feature extraction.

1. Create your model file `pyretri/models/backbone/backbone_impl/reid_baseline.py`:

```python
import torch.nn as nn

from ..backbone_base import BackboneBase
from ...registry import BACKBONES


@BACKBONES.register
class ft_net(BackboneBase):
    def __init__(self):
        pass

    def forward(self, x):
        pass
```

or

```python
import torch.nn as nn

from ..backbone_base import BackboneBase
from ...registry import BACKBONES


class FT_NET(BackboneBase):
    def __init__(self):
        pass

    def forward(self, x):
        pass


@BACKBONES.register
def ft_net():
    model = FT_NET()
    return model
```

2. Import the module in `pyretri/models/backbone/__init__.py`:

```python
from .backbone_impl.reid_baseline import ft_net

__all__ = [
    'ft_net',
]
```

3. Use it in your config file:

```yaml
model:
  name: "ft_net"
  ft_net:
    load_checkpoint: "/data/my_model_zoo/res50_market1501.pth"
```
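
The `@BACKBONES.register` decorator presumably just records the class (or factory function) under its name, so the config's `name: "ft_net"` can be resolved back to it at build time. A minimal, self-contained sketch of that pattern, not PyRetri's actual `Registry` implementation:

```python
class Registry(dict):
    # Map names to callables; usable as a decorator, mirroring @BACKBONES.register.
    def register(self, target):
        self[target.__name__] = target
        return target

BACKBONES = Registry()

@BACKBONES.register
class ft_net:
    pass

# a config entry `name: "ft_net"` can now be resolved to the registered class
model_cls = BACKBONES["ft_net"]
print(model_cls is ft_net)  # True
```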

## Pipeline Combinations Search

Since the tricks used in each stage have a significant impact on retrieval performance, we provide pipeline combinations search scripts to help users find promising combinations of approaches with various hyper-parameters.

### Get into the combinations search scripts

```shell
cd search/
```

### Define Search Space

We decompose the search space into three sub search spaces: pre_process, extract and index, each of which corresponds to a specific file. A search space is defined by adding methods with their hyper-parameters to a specified dict. You can add a search operator as follows:

```python
pre_processes.add(
    "PadResize224",
    {
        "batch_size": 32,
        "folder": {
            "name": "Folder"
        },
        "collate_fn": {
            "name": "CollateFn"
        },
        "transformers": {
            "names": ["PadResize", "ToTensor", "Normalize"],
            "PadResize": {
                "size": 224,
                "padding_v": [124, 116, 104]
            },
            "Normalize": {
                "mean": [0.485, 0.456, 0.406],
                "std": [0.229, 0.224, 0.225]
            }
        }
    }
)
```

By doing this, a pre_process operator named "PadResize224" is added to the pre_process sub search space and will be included in the subsequent search.
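
Conceptually, the search scripts then enumerate the Cartesian product of the operators registered in each sub search space. A hypothetical sketch of that enumeration, with operator names invented for illustration:

```python
from itertools import product

# hypothetical sub search spaces, each mapping an operator name to its config
pre_processes = {"PadResize224": {}, "DirectResize224": {}}
extracts = {"GAP": {}, "GeM": {}, "CroW": {}}
indexes = {"l2_normalize": {}, "pca_whiten": {}}

# every (pre_process, extract, index) triple is a candidate pipeline
combos = list(product(pre_processes, extracts, indexes))
print(len(combos))  # 2 * 3 * 2 = 12 pipelines to evaluate
```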

### Search

Similar to the image retrieval pipeline, combinations search includes two stages: search for feature extraction and search for indexing.

#### search for feature extraction

Search for the feature extraction combinations by:

```shell
python3 search_extract.py [-sp ${save_path}] [-sm ${search_modules}]
```

Arguments:

- `save_path`: path for saving the output features in pickle format.
- `search_modules`: name of the folder containing search space files.

Examples:

```shell
python3 search_extract.py -sp /data/features/gap_gmp_gem_crow_spoc/ -sm search_modules
```

#### search for indexing

Search for the indexing combinations by:

```shell
python3 search_index.py [-fd ${fea_dir}] [-sm ${search_modules}] [-sp ${save_path}]
```

Arguments:

- `fea_dir`: path of the output features extracted by the feature extraction combinations search.
- `search_modules`: name of the folder containing search space files.
- `save_path`: path for saving the retrieval results of each combination.

Examples:

```shell
python3 search_index.py -fd /data/features/gap_gmp_gem_crow_spoc/ -sm search_modules -sp /data/features/gap_gmp_gem_crow_spoc_result.json
```

#### show search results

We provide two ways to show the search results. One is to save all the search results in a csv file, which can be used for further analysis. The other is to show the search results matching given keywords. You can define the keywords as follows:

```python
keywords = {
    'data_name': ['market'],
    'pre_process_name': list(),
    'model_name': list(),
    'feature_map_name': list(),
    'aggregator_name': list(),
    'post_process_name': ['no_fea_process', 'l2_normalize', 'pca_whiten', 'pca_wo_whiten'],
}
```
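
A plausible reading of the matching rule: a result is shown when, for every non-empty keyword list, its name contains at least one of the listed values, while empty lists act as wildcards. The sketch below encodes that assumed rule; it is not the script's exact logic, and the sample result names are invented:

```python
def match_keywords(result_name, keywords):
    # Keep a result if every non-empty keyword list has at least one value
    # appearing in the result's name; empty lists match anything.
    return all(
        any(value in result_name for value in values)
        for values in keywords.values()
        if values
    )

keywords = {'data_name': ['market'], 'model_name': [], 'post_process_name': ['l2_normalize', 'pca_whiten']}
print(match_keywords("market_resnet50_gap_l2_normalize", keywords))  # True
print(match_keywords("duke_resnet50_gap_l2_normalize", keywords))   # False
```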

Show the search results by:

```shell
python3 show_search_results.py [-r ${result_json_path}]
```

Arguments:

- `result_json_path`: path of the result json file.

Examples:

```shell
python3 show_search_results.py -r /data/features/gap_gmp_gem_crow_spoc_result.json
```

See [show_search_results.py](../search/show_search_results.py) for more details.