This page provides basic tutorials about the usage of PyRetri. For installation instructions and dataset preparation, please see [INSTALL.md](INSTALL.md).
## Make Data Json
After the gallery set and query set are separated, we package the information of each subset in pickle format for further processing. Three dataset types cover the different folder structures: `general`, `oxford` and `reid`.
For the `general` type, the dataset collects all images with the same label in one directory.
Arguments:

- `data`: Path of the dataset for generating the data json file.
- `save_path`: Path for saving the output file.
- `type`: Type of the dataset. For datasets that collect images with the same label in one directory, use `general`; for the Oxford/Paris dataset, use `oxford`; for re-id datasets, use `reid`.
- `ground_truth`: Path of the ground-truth information, which is required when generating the data json file for the Oxford/Paris dataset.
Examples:
```shell
# for dataset collecting images with the same label in one directory
python3 main/make_data_json.py -d /data/caltech101/gallery/ -sp data_jsons/caltech_gallery.json -t general
```

Note: The Oxford/Paris dataset stores the ground truth of each query image in a txt file, so remember to pass the path of the gt file when generating the data json file for Oxford/Paris.
## Extract
All outputs (features and labels) will be saved to the save directory in pickle format.
## Index

The paths of the query set features and gallery set features are specified in the config file.
Index the query set features by:
```shell
python3 main/index.py [-cfg ${config_file}]
```
Arguments:
- `config_file`: Path of the configuration file in yaml format.
Examples:
```shell
python3 main/index.py -cfg configs/caltech.yaml
```
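Conceptually, the indexing step ranks the gallery features by similarity to each query feature. The minimal sketch below uses plain cosine similarity as an illustrative assumption; PyRetri's actual metric and enhancement steps are set in the config file.

```python
import math

def cosine(a, b):
    # cosine similarity between two feature vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_gallery(query_feat, gallery_feats):
    # return gallery indices ordered from most to least similar
    sims = [cosine(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: -sims[i])
```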
## Single Index
For visualization of results and failure case analysis, we provide a script for a single query image, so you can easily visualize or save its retrieval results.
## Pipeline Combinations

Since the tricks used in each stage have a significant impact on retrieval performance, we provide pipeline combinations search scripts to help users find promising combinations of approaches with various hyper-parameters.
### Get into the combinations search scripts
```shell
cd search/
```
### Define Search Space
We decompose the search space into three sub search spaces: data_process, extract and index, each of which corresponds to a specific file. A search space is defined by adding methods with their hyper-parameters to the corresponding dict. You can add a search operator as follows:
```python
data_processes.add(
    "PadResize224",
    {
        "batch_size": 32,
        "folder": {
            "name": "Folder"
        },
        "collate_fn": {
            "name": "CollateFn"
        },
        "transformers": {
            "names": ["PadResize", "ToTensor", "Normalize"],
            "PadResize": {
                "size": 224,
                "padding_v": [124, 116, 104]
            },
            "Normalize": {
                "mean": [0.485, 0.456, 0.406],
                "std": [0.229, 0.224, 0.225]
            }
        }
    }
)
```
By doing this, a data_process operator named `PadResize224` is added to the data_process sub search space and will be included in the subsequent search.
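The search then evaluates combinations drawn from the three sub search spaces. A minimal sketch of how such combinations can be enumerated; the operator names and contents below are illustrative assumptions, not PyRetri's actual search spaces:

```python
import itertools

# illustrative sub search spaces: name -> hyper-parameters
data_processes = {"PadResize224": {"size": 224}, "DirectResize224": {"size": 224}}
extracts = {"GAP": {}, "GMP": {}}
indexes = {"KNN": {}}

def enumerate_combinations():
    # Cartesian product: every (data_process, extract, index) triple
    return list(itertools.product(data_processes, extracts, indexes))
```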
### Search
Similar to the image retrieval pipeline, combinations search includes two stages: search for feature extraction and search for indexing.
#### Search for feature extraction
Search for the feature extraction combinations by: