# Useful tools
We provide lots of useful tools under the `tools/` directory. In addition, you can quickly run scripts from other OpenMMLab open-source libraries through MIM.
Take MMDetection as an example. If you want to use [print_config.py](https://github.com/open-mmlab/mmdetection/blob/3.x/tools/misc/print_config.py), you can run the following command directly without copying the source code into the MMYOLO repository.
```shell
mim run mmdet print_config [CONFIG]
```
**Note**: The MMDetection library must be installed through MIM before the above command can succeed.
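If MIM and MMDetection are not installed yet, a minimal setup might look like the sketch below (the exact version constraints depend on your MMYOLO version, so check the official installation guide):
```shell
# Install MIM first, then use it to install MMDetection
pip install -U openmim
mim install mmdet
```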
## Visualization
### Visualize COCO labels
`tools/analysis_tools/browse_coco_json.py` is a script that visualizes COCO annotations on the images.
```shell
python tools/analysis_tools/browse_coco_json.py ${DATA_ROOT} \
[--ann_file ${ANN_FILE}] \
[--img_dir ${IMG_DIR}] \
[--wait-time ${WAIT_TIME}] \
[--disp-all] [--category-names CATEGORY_NAMES [CATEGORY_NAMES ...]] \
[--shuffle]
```
Examples:
1. Visualize all categories of `COCO` and display all types of annotations such as `bbox` and `mask`:
```shell
python tools/analysis_tools/browse_coco_json.py './data/coco/' \
--ann_file 'annotations/instances_train2017.json' \
--img_dir 'train2017' \
--disp-all
```
2. Visualize all categories of `COCO`, display only `bbox`-type annotations, and shuffle the images before displaying:
```shell
python tools/analysis_tools/browse_coco_json.py './data/coco/' \
--ann_file 'annotations/instances_train2017.json' \
--img_dir 'train2017' \
--shuffle
```
3. Visualize only the `bicycle` and `person` categories of `COCO`, displaying only `bbox`-type annotations:
```shell
python tools/analysis_tools/browse_coco_json.py './data/coco/' \
--ann_file 'annotations/instances_train2017.json' \
--img_dir 'train2017' \
--category-names 'bicycle' 'person'
```
4. Visualize all categories of `COCO`, display all annotation types such as `bbox` and `mask`, and shuffle the images before displaying:
```shell
python tools/analysis_tools/browse_coco_json.py './data/coco/' \
--ann_file 'annotations/instances_train2017.json' \
--img_dir 'train2017' \
--disp-all \
--shuffle
```
### Visualize Datasets
`tools/analysis_tools/browse_dataset.py` helps users browse a detection dataset (both images and bounding box annotations) visually, or save the visualizations to a designated directory.
```shell
python tools/analysis_tools/browse_dataset.py ${CONFIG} \
[-h] \
[--output-dir ${OUTPUT_DIR}] \
[--not-show] \
[--show-interval ${SHOW_INTERVAL}]
```
Examples:
1. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to visualize the images. The images will pop up directly and also be saved to the directory `work-dir/browse_dataset`:
```shell
python tools/analysis_tools/browse_dataset.py 'configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py' \
--output-dir 'work-dir/browse_dataset'
```
2. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to visualize the images. The images will pop up and be displayed directly, each for `10` seconds, and will also be saved to the directory `work-dir/browse_dataset`:
```shell
python tools/analysis_tools/browse_dataset.py 'configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py' \
--output-dir 'work-dir/browse_dataset' \
--show-interval 10
```
3. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to visualize the images. The images will pop up and be displayed directly, each for `10` seconds, and will not be saved:
```shell
python tools/analysis_tools/browse_dataset.py 'configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py' \
--show-interval 10
```
4. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to visualize the images. The images will not pop up; they will only be saved to the directory `work-dir/browse_dataset`:
```shell
python tools/analysis_tools/browse_dataset.py 'configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py' \
--output-dir 'work-dir/browse_dataset' \
--not-show
```
### Visualize dataset analysis
`tools/analysis_tools/dataset_analysis.py` helps users obtain the plots of four analysis functions and saves them to the `dataset_analysis` folder under the current working directory.

Description of the script's functions:

The data required by each sub-function is prepared in `main()`.

- Function 1: generated by the sub-function `show_bbox_num`, displays the distribution of bbox instances per category.
- Function 2: generated by the sub-function `show_bbox_wh`, displays the width and height distribution of bbox instances per category.
- Function 3: generated by the sub-function `show_bbox_wh_ratio`, displays the width-to-height ratio distribution of bbox instances per category.
- Function 4: generated by the sub-function `show_bbox_area`, displays the distribution of bbox instance areas per category according to the area rules.
- Print list: generated by the sub-functions `show_class_list` and `show_data_list`.
```shell
python tools/analysis_tools/dataset_analysis.py ${CONFIG} \
[-h] \
[--type ${TYPE}] \
[--class-name ${CLASS_NAME}] \
[--area-rule ${AREA_RULE}] \
[--func ${FUNC}] \
[--output-dir ${OUTPUT_DIR}]
```
Examples:
1. Use the config file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset. By default, the data loading type is `train_dataset` and the area rule is `[0, 32, 96, 1e5]`; a result plot containing all functions is generated and saved to the `./dataset_analysis` folder under the current working directory:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py
```
2. Use the config file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, changing the data loading type from the default `train_dataset` to `val_dataset` through the `--val-dataset` option:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--val-dataset
```
3. Use the config file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, restricting the display from all classes to a specific class. Take the `person` class as an example:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--class-name person
```
4. Use the config file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, redefining the area rule through `--area-rule`. Taking `30 70 125` as an example, the area rule becomes `[0, 30, 70, 125, 1e5]`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--area-rule 30 70 125
```
5. Use the config file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, generating only one of the four function plots instead of all of them. Take `Function 1` (`show_bbox_num`) as an example:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--func show_bbox_num
```
6. Use the config file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, changing the image save directory to `work_dir/dataset_analysis`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--output-dir work_dir/dataset_analysis
```
## Dataset Conversion
The folder `tools/dataset_converters` currently contains two dataset conversion tools, `balloon2coco.py` and `yolo2coco.py`.
- `balloon2coco.py` converts the `balloon` dataset (this small dataset is for beginners only) to COCO format.
For a detailed description of this script, please see the `Dataset Preparation` section in [From getting started to deployment with YOLOv5](./yolov5_tutorial.md).
```shell
python tools/dataset_converters/balloon2coco.py
```
- `yolo2coco.py` converts a dataset from `yolo-style` **.txt** format to COCO format. Please use it as follows:
```shell
python tools/dataset_converters/yolo2coco.py /path/to/the/root/dir/of/your_dataset
```
Instructions:
1. `image_dir` is the root directory of the yolo-style dataset that you need to pass to the script; it should contain `images`, `labels`, and `classes.txt`. `classes.txt` is the class declaration of the current dataset, with one class name per line (a minimal sketch of these files is shown after these instructions). The structure of the root directory should be formatted as this example shows:
```bash
.
└── $ROOT_PATH
├── classes.txt
├── labels
│ ├── a.txt
│ ├── b.txt
│ └── ...
├── images
│ ├── a.jpg
│ ├── b.png
│ └── ...
└── ...
```
2. The script will automatically check whether `train.txt`, `val.txt`, and `test.txt` already exist under `image_dir`. If these files are found, the script will organize the dataset into the corresponding splits. Otherwise, the script will convert the whole dataset into a single file. The image paths in these files must be **ABSOLUTE** paths.
3. By default, the script creates a folder called `annotations` in the `image_dir` directory to store the converted JSON files. If `train.txt`, `val.txt`, and `test.txt` are not found, the output file is `result.json`. Otherwise, the corresponding JSON files are generated, named `train.json`, `val.json`, and `test.json`. The `annotations` folder may look similar to this:
```bash
.
└── $ROOT_PATH
├── annotations
│ ├── result.json
│ └── ...
├── classes.txt
├── labels
│ ├── a.txt
│ ├── b.txt
│ └── ...
├── images
│ ├── a.jpg
│ ├── b.png
│ └── ...
└── ...
```
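For reference, here is a minimal sketch of the files mentioned above, with hypothetical class names and paths; the standard yolo-style label line is `class_index x_center y_center width height`, with coordinates normalized to `[0, 1]`:
```text
# classes.txt: one class name per line
person
bicycle

# labels/a.txt: one object per line, "class_index x_center y_center width height" (normalized)
0 0.512 0.430 0.120 0.340
1 0.250 0.655 0.080 0.150

# train.txt (optional): one ABSOLUTE image path per line
/path/to/your_dataset/images/a.jpg
/path/to/your_dataset/images/b.png
```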
## Download Dataset
`tools/misc/download_dataset.py` supports downloading datasets such as `COCO`, `VOC`, `LVIS` and `Balloon`.
```shell
python tools/misc/download_dataset.py --dataset-name coco2017
python tools/misc/download_dataset.py --dataset-name voc2007
python tools/misc/download_dataset.py --dataset-name voc2012
python tools/misc/download_dataset.py --dataset-name lvis
python tools/misc/download_dataset.py --dataset-name balloon [--save-dir ${SAVE_DIR}] [--unzip]
```
## Convert Model
The six scripts under the `tools/model_converters` directory help users convert the keys of official pre-trained YOLO models to the MMYOLO format, so that the models can be fine-tuned with MMYOLO.
### YOLOv5
Take the conversion of `yolov5s.pt` as an example:
1. Clone the official YOLOv5 repository (currently the highest supported version is `v6.1`):
```shell
git clone -b v6.1 https://github.com/ultralytics/yolov5.git
cd yolov5
```
2. Download the official weight file:
```shell
wget https://github.com/ultralytics/yolov5/releases/download/v6.1/yolov5s.pt
```
3. Copy the file `tools/model_converters/yolov5_to_mmyolo.py` into the cloned YOLOv5 directory:
```shell
cp ${MMDET_YOLO_PATH}/tools/model_converters/yolov5_to_mmyolo.py yolov5_to_mmyolo.py
```
4. Run the conversion:
```shell
python yolov5_to_mmyolo.py --src ${WEIGHT_FILE_PATH} --dst mmyolov5.pt
```
The converted `mmyolov5.pt` can then be used by MMYOLO. The official YOLOv6 weights are converted in the same way.
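As a sketch only, assuming the YOLOv6 converter is named `tools/model_converters/yolov6_to_mmyolo.py` and accepts the same `--src`/`--dst` arguments (check the actual script name and options under `tools/model_converters`), a YOLOv6 conversion could look like:
```shell
# Download an official YOLOv6 weight file (URL and file name are illustrative; see the YOLOv6 release page)
wget https://github.com/meituan/YOLOv6/releases/download/0.1.0/yolov6s.pt
# Convert the keys to the MMYOLO format (script name and flags are assumptions, see the note above)
python tools/model_converters/yolov6_to_mmyolo.py --src yolov6s.pt --dst mmyolov6.pt
```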
### YOLOX
The conversion of the YOLOX model **does not** require downloading the official YOLOX code; just download the weights.
Take the conversion of `yolox_s.pth` as an example:
1. Download the official weight file:
```shell
wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth
```
2. Run the conversion:
```shell
python tools/model_converters/yolox_to_mmyolo.py --src yolox_s.pth --dst mmyolox.pt
```
The converted `mmyolox.pt` can be used by MMYOLO.
## Optimize anchor sizes
The script `tools/analysis_tools/optimize_anchors.py` supports three methods to optimize YOLO anchors: `k-means` anchor clustering, `differential_evolution`, and `v5-k-means`.
### k-means
The k-means method uses an IoU-based distance criterion. Run it as follows:
```shell
python tools/analysis_tools/optimize_anchors.py ${CONFIG} \
--algorithm k-means \
--input-shape ${INPUT_SHAPE [WIDTH HEIGHT]} \
--output-dir ${OUTPUT_DIR}
```
### differential_evolution
The differential_evolution method is based on the differential evolution algorithm and uses `avg_iou_cost` as the objective function to minimize. Run it as follows:
```shell
python tools/analysis_tools/optimize_anchors.py ${CONFIG} \
--algorithm differential_evolution \
--input-shape ${INPUT_SHAPE [WIDTH HEIGHT]} \
--output-dir ${OUTPUT_DIR}
```
### v5-k-means
The v5-k-means method uses the same clustering criterion as YOLOv5, which is based on shape matching. Run it as follows:
```shell
python tools/analysis_tools/optimize_anchors.py ${CONFIG} \
--algorithm v5-k-means \
--input-shape ${INPUT_SHAPE [WIDTH HEIGHT]} \
--prior_match_thr ${PRIOR_MATCH_THR} \
--output-dir ${OUTPUT_DIR}
```
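For instance, a concrete invocation of the v5-k-means template above might look like the following; the config file, the `640 640` input shape, and the `4.0` matching threshold are illustrative values, not recommendations:
```shell
python tools/analysis_tools/optimize_anchors.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--algorithm v5-k-means \
--input-shape 640 640 \
--prior_match_thr 4.0 \
--output-dir work_dir
```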
## Extract a subset of COCO
The COCO2017 training set contains 118K images and the validation set contains 5K images, which is a relatively large dataset. Loading the full JSON files during debugging or quick verification consumes extra resources and slows down startup.
The `extract_subcoco.py` script can extract a specified number of images. The user can use the `--num-img` parameter to obtain a COCO subset with that number of images.
Currently, only COCO2017 is supported. User-defined datasets in the standard COCO JSON format will be supported in the future.
The folder structure of the root path is as follows:
```text
├── root
│ ├── annotations
│ ├── train2017
│ ├── val2017
│ ├── test2017
```
1. Extract 10 training images and 10 validation images, sampling only from the 5K validation set.
```shell
python tools/misc/extract_subcoco.py ${ROOT} ${OUT_DIR} --num-img 10
```
2. Extract 20 training images using the training set and 20 validation images using the validation set.
```shell
python tools/misc/extract_subcoco.py ${ROOT} ${OUT_DIR} --num-img 20 --use-training-set
```
3. Set the global seed to 1 (no seed is set by default).
```shell
python tools/misc/extract_subcoco.py ${ROOT} ${OUT_DIR} --num-img 20 --use-training-set --seed 1
```