In our daily work and study, we often need to train models on custom datasets. Scenarios where an open-source dataset can be used directly for a production model are rare, so we usually need to carry out a series of operations on our own dataset to make sure the resulting model can be put into production and serve users.
```{SeeAlso}
The video of this document has been posted on Bilibili: [A nanny level tutorial for custom datasets from annotation to deployment](https://www.bilibili.com/video/BV1RG4y137i5)
```
```{Note}
All instructions in this document are run on Linux. They also work on Windows, with only slight differences in commands and operations.
```
We assume that you have already installed MMYOLO. If not, please refer to the document [GET STARTED](https://mmyolo.readthedocs.io/en/latest/get_started.html) for installation.
In this tutorial, we will walk through the whole process from annotating a custom dataset to final training, testing and deployment. The overall steps are as below:
08. Visualize the data processing part of the config: `tools/analysis_tools/browse_dataset.py`
09. Train: `tools/train.py`
10. Inference: `demo/image_demo.py`
11. Deployment
```{Note}
After obtaining the model weights and the mAP on the validation set, users need to deeply analyse the bad cases of incorrect predictions in order to optimize the model. MMYOLO will add this function in the future. Stay tuned.
```
Each step is described in detail below.
## 1. Prepare custom dataset
- If you don't have your own dataset, or want to use a small dataset to run through the whole process, you can use the 144-image `cat` dataset provided with this tutorial (the raw pictures of this dataset were supplied by @RangeKing and cleaned by @PeterH0323). This `cat` dataset will be used as the example for the rest of this tutorial.
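The download can be scripted; a minimal sketch, assuming MMYOLO's `tools/misc/download_dataset.py` helper and its `--dataset-name`/`--save-dir`/`--unzip`/`--delete` flags are available in your version:

```shell
python tools/misc/download_dataset.py --dataset-name cat \
                                      --save-dir ./data/cat \
                                      --unzip --delete
```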
After the download completes, the dataset is placed in the `./data/cat` dir with the following directory structure:
```shell
.
└── ./data/cat
    ├── images # image files
    │    ├── image1.jpg
    │    ├── image2.png
    │    └── ...
    ├── labels # labelme files
    │    ├── image1.json
    │    ├── image2.json
    │    └── ...
    ├── annotations # annotated files of COCO
    │    ├── annotations_all.json # all labels of COCO
    │    ├── trainval.json # 80% labels of the dataset
    │    └── test.json # 20% labels of the dataset
    └── class_with_id.txt # id + class_name file
```
This dataset can be trained directly. You can remove everything **outside** the `images` dir if you want to go through the whole process.
- If you already have a dataset, you can compose it into the following structure:
```shell
.
└── $DATA_ROOT
    └── images
         ├── image1.jpg
         ├── image2.png
         └── ...
```
## 2. Use the software of labelme to annotate
In general, there are two annotation methods:
- Software or algorithmic assistance + manual correction (recommended; it reduces cost and speeds up annotation)
- Only manual annotation
```{Note}
At present, we are also considering integrating third-party libraries to support algorithm-assisted annotation and manually refined annotation by calling the MMYOLO inference API through a GUI.
If you have any interest or ideas, please leave a comment in the issue or contact us directly!
```
### 2.1 Software or algorithmic assistance + manual correction
The principle is to use an existing model for inference and save the results as label files. Then you open the annotation software, load the generated label files, and only need to check whether each image is labeled correctly and whether any objects are missing. Compared with pure manual annotation, this【assistance + manual correction】workflow saves a lot of time and helps **reduce costs and speed up annotation**.
```{Note}
If the existing model doesn't cover the categories defined in your dataset (for example, a COCO pre-trained model), you can first manually annotate about 100 images to train an initial model, and then use it for software-assisted annotation.
```
The process is described below:
#### 2.1.1 Software or algorithmic assistance
MMYOLO provides the model inference script `demo/image_demo.py`. Set `--to-labelme` to generate labelme-format label files:
```shell
python demo/image_demo.py img \
                          config \
                          checkpoint \
                          [--out-dir OUT_DIR] \
                          [--device DEVICE] \
                          [--show] \
                          [--deploy] \
                          [--score-thr SCORE_THR] \
                          [--class-name CLASS_NAME] \
                          [--to-labelme]
```
These include:

- `img`: image path, which can be a dir, a file, or a URL;
- `config`: config file path of the model;
- `checkpoint`: weight file path of the model;
- `--out-dir`: dir where inference results are saved, default `./output`; if `--show` is set, the detection results are not saved;
- `--device`: computing device, e.g. `cuda:0` or `cpu`, default `cuda:0`;
- `--show`: whether to display the detection results, default `False`;
- `--deploy`: whether to switch to deploy mode;
- `--score-thr`: confidence threshold, default `0.3`;
- `--class-name`: only output detection results for the given class name(s);
- `--to-labelme`: whether to export label files in `labelme` format; it cannot be used together with `--show`.
For example:
Here, we'll use YOLOv5-s as an example to help us label the `cat` dataset we just downloaded. First, download the weights for YOLOv5-s, then run the assisted annotation.
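A sketch of both steps; the weight URL follows the MMYOLO model zoo naming and should be verified against the model zoo page, and the paths assume the `cat` dataset layout from step 1:

```shell
mkdir -p ./work_dirs
wget https://download.openmmlab.com/mmyolo/v0/yolov5/yolov5_s-v61_syncbn_fast_8xb16-300e_coco/yolov5_s-v61_syncbn_fast_8xb16-300e_coco_20220918_084700-86e02187.pth -P ./work_dirs

# run assisted annotation and export labelme-format labels for the cat class only
python demo/image_demo.py ./data/cat/images \
                          ./configs/yolov5/yolov5_s-v61_syncbn_fast_8xb16-300e_coco.py \
                          ./work_dirs/yolov5_s-v61_syncbn_fast_8xb16-300e_coco_20220918_084700-86e02187.pth \
                          --out-dir ./data/cat/labels \
                          --class-name cat \
                          --to-labelme
```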
#### 2.1.2 Manual annotation

Install and start labelme with a command like the one below, then check and correct the labels. If labelme fails to start, type `export QT_DEBUG_PLUGINS=1` on the command line to see which libraries are missing and install them.
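A sketch of installing and starting labelme, assuming the `--output`, `--autosave` and `--nodata` flags of your labelme version; `--output` points at the directory holding the label files generated in the previous step so that they are loaded automatically:

```shell
pip install labelme

labelme ./data/cat/images --output ./data/cat/labels --autosave --nodata
```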
### 2.2 Only manual annotation

The procedure is the same as in 【2.1.2 Manual annotation】, except that here you label from scratch: there are no pre-generated label files.
## 3. Convert the dataset into COCO format
### 3.1 Using scripts to convert
MMYOLO provides a script to convert labelme labels to COCO labels:
```shell
python tools/dataset_converters/labelme2coco.py --img-dir ${image dir path} \
                                                --labels-dir ${label dir location} \
                                                --out ${output COCO label json path} \
                                                [--class-id-txt ${class_with_id.txt path}]
```
These include:

- `--class-id-txt`: the `.txt` file that maps dataset `id` to `class_name`:
  - If it is not specified, the script will generate one automatically in the same directory as `--out` and save it as `class_with_id.txt`;
  - If it is specified, the script will read it but will not add to or overwrite it. The script also checks whether the `.txt` file covers all the classes found in the labels and reports an error if any are missing; in that case, please check the `.txt` file and add the new class and its `id`.
An example `.txt` file looks like this (`id` starts at `1`, just like COCO):
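A minimal illustration with hypothetical class names (`dog` and `fish` are placeholders); the format is one `id` and one `class_name` per line:

```text
1 cat
2 dog
3 fish
```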
For the `cat` dataset in this demo (note that we don't need to include the background class), we can see that the generated `class_with_id.txt` contains only one class:
```text
1 cat
```
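For reference, the conversion command for the `cat` dataset would look like the following sketch; the paths assume the directory layout from step 1:

```shell
python tools/dataset_converters/labelme2coco.py --img-dir ./data/cat/images \
                                                --labels-dir ./data/cat/labels \
                                                --out ./data/cat/annotations/annotations_all.json
```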
### 3.2 Check the converted COCO label
Using the following command, we can display the COCO label on the image, which will verify that there are no problems with the conversion:
```shell
python tools/analysis_tools/browse_coco_json.py --img-dir ${image dir path} \
                                                --ann-file ${COCO label json path}
```

```{SeeAlso}
See [Visualizing COCO label](https://mmyolo.readthedocs.io/en/latest/user_guides/useful_tools.html#coco) for more information on `tools/analysis_tools/browse_coco_json.py`.
```
## 4. Divide dataset into training set, validation set and test set
Usually, a custom dataset is one big folder full of images, and we need to split it into training, validation and test sets ourselves. If the amount of data is small, we can skip the validation set. The split script parameters are described below, followed by an example command:
- `--ratios`: ratio of the split. If only 2 values are given, the split is `trainval + test`; if 3 are given, the split is `train + val + test`. Two formats are supported - integers and decimals:
  - Integers: divide the dataset proportionally after normalization. Example: `--ratios 2 1 1` (the code will convert it to `0.5 0.25 0.25`) or `--ratios 3 1` (the code will convert it to `0.75 0.25`)
  - Decimals: divide the dataset proportionally. **If the sum does not add up to 1, the script performs an automatic normalization correction.** Example: `--ratios 0.8 0.1 0.1` or `--ratios 0.8 0.2`
- `--shuffle`: whether to shuffle the dataset before splitting.
- `--seed`: the random seed for the split. If not set, it will be generated automatically.
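A sketch of the split command, assuming the `tools/misc/coco_split.py` script and its `--json`/`--out-dir` flags; here the `cat` labels are split into `trainval + test` at an 8:2 ratio:

```shell
python tools/misc/coco_split.py --json ./data/cat/annotations/annotations_all.json \
                                --out-dir ./data/cat/annotations \
                                --ratios 0.8 0.2 \
                                --shuffle \
                                --seed 10
```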
## 5. Create a new config file based on the dataset
Make sure the dataset directory looks like this:
```shell
.
└── $DATA_ROOT
    ├── annotations
    │    ├── trainval.json # only divided into trainval + test according to the above command; if you split with 3 ratios, there will be train.json, val.json and test.json here
    │    └── test.json
    ├── images
    │    ├── image1.jpg
    │    ├── image2.png
    │    └── ...
    └── ...
```
Since this is a custom dataset, we need to create a new config file and add the information we want to change.
About naming the new config:
- This config inherits from `yolov5_s-v61_syncbn_fast_8xb16-300e_coco.py`;
- We will train the class `cat` from the dataset provided with this tutorial (if you are using your own dataset, you can define the class name of your own dataset);
- The GPU tested in this tutorial is 1 x 3080Ti with 12G video memory, and the computer memory is 32G. The maximum batch size for YOLOv5-s training is `batch size = 32` (see the Appendix for detailed machine information);
- Training epoch is `100 epoch`.
To sum up: you can name it `yolov5_s-v61_syncbn_fast_1xb32-100e_cat.py` and place it into the dir of `configs/custom_dataset`.
Create a new directory named `custom_dataset` inside the `configs` dir, and add a config file with the following content:
```python
max_epochs = 100  # maximum training epochs
data_root = './data/cat/'  # path to the root dir of the dataset

# load_from can specify a local path or a URL; setting a URL will download the weights automatically. Since the weights have already been downloaded above, we set a local path here
# since this tutorial is fine-tuning on the cat dataset, we need to use `load_from` to load the pre-trained model from MMYOLO. This allows for faster convergence and higher accuracy

# according to your GPU situation, modify the batch size; YOLOv5-s defaults to 8 cards x 16 bs
train_batch_size_per_gpu = 32
train_num_workers = 4  # recommended: train_num_workers = nGPU x 4

save_epoch_intervals = 2  # save weights every `save_epoch_intervals` epochs

# according to your GPU situation, modify the base_lr; the scaling rule is base_lr_default * (your_bs / default_bs)
base_lr = _base_.base_lr / 4
anchors = [  # the anchors have been updated according to the characteristics of the dataset. The generation of anchors will be explained in the following section.
    [(68, 69), (154, 91), (143, 162)],  # P3/8
    [(242, 160), (189, 287), (391, 207)],  # P4/16
    [(353, 337), (539, 341), (443, 432)]  # P5/32
]
class_name = ('cat', )  # according to the label information of class_with_id.txt, set the class_name
num_classes = len(class_name)
metainfo = dict(
    classes=class_name,
    palette=[(220, 20, 60)]  # the color of drawing, free to set
)
train_cfg = dict(
    max_epochs=max_epochs,
    val_begin=20,  # number of epochs to start validation. Here 20 is set because the accuracy of the first 20 epochs is not high and the test is not meaningful, so it is skipped
    val_interval=save_epoch_intervals  # the test evaluation is performed iteratively every val_interval round
)
model = dict(
    bbox_head=dict(
        head_module=dict(num_classes=num_classes),
        prior_generator=dict(base_sizes=anchors),

        # loss_cls is dynamically adjusted based on num_classes, but when num_classes = 1, loss_cls is always 0
        loss_cls=dict(loss_weight=0.5 *
                      (num_classes / 80 * 3 / _base_.num_det_layers))))
train_dataloader = dict(
    batch_size=train_batch_size_per_gpu,
    num_workers=train_num_workers,
    dataset=dict(
        _delete_=True,
        type='RepeatDataset',
        # if the dataset is too small, you can use RepeatDataset, which repeats the current dataset n times per epoch; here 5 is set
        times=5,
        dataset=dict(
            type=_base_.dataset_type,
            data_root=data_root,
            metainfo=metainfo,
            ann_file='annotations/trainval.json',
            data_prefix=dict(img='images/'),
            pipeline=_base_.train_pipeline)))

default_hooks = dict(
    # set how many epochs to save the model, and the maximum number of models to keep; `save_best` also saves the best model (recommended)
    checkpoint=dict(
        type='CheckpointHook',
        interval=save_epoch_intervals,
        max_keep_ckpts=5,
        save_best='auto'),
    param_scheduler=dict(max_epochs=max_epochs),
    # logger output interval
    logger=dict(type='LoggerHook', interval=10))
```
```{Note}
We have placed an identical config file at `projects/misc/custom_dataset/yolov5_s-v61_syncbn_fast_1xb32-100e_cat.py`. You can copy it to `configs/custom_dataset/yolov5_s-v61_syncbn_fast_1xb32-100e_cat.py` and start training directly.
```
## 6. Visual analysis of datasets
The script `tools/analysis_tools/dataset_analysis.py` will help you analyze your dataset visually. It can generate four types of analysis plots (an example command follows the list):
- A distribution plot showing categories and the number of bbox instances: `show_bbox_num`
- A distribution plot showing categories and the width and height of bbox instances: `show_bbox_wh`
- A distribution plot showing categories and the width/height ratio of bbox instances: `show_bbox_wh_ratio`
- A distribution plot showing categories and the area of bbox instances based on the area rule: `show_bbox_area`
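A sketch of running the analysis on the training set of our new config; `--out-dir` is assumed to control where the plots are saved, so check the script's `--help` in your version:

```shell
python tools/analysis_tools/dataset_analysis.py configs/custom_dataset/yolov5_s-v61_syncbn_fast_1xb32-100e_cat.py \
                                                --out-dir work_dirs/dataset_analysis_cat/train_dataset
```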
Because the `cat` dataset used in this tutorial is relatively small, we use RepeatDataset in the config, so the numbers shown are actually repeated five times. If you want a repeat-free analysis, you can temporarily change the `times` argument of RepeatDataset from `5` to `1`.
From the analysis output, we can conclude that the training set of the `cat` dataset used in this tutorial has the following characteristics:

- All objects fall into the `large object` category;
- The number of instances of the category `cat` is `655`;
- The width/height ratio of the bboxes is mostly concentrated in `1.0 ~ 1.11`, with a minimum ratio of `0.36` and a maximum of `2.9`;
- The bbox widths are mostly around `500 ~ 600`, and the heights around `500 ~ 600`.
```{SeeAlso}
See [Visualizing Dataset Analysis](https://mmyolo.readthedocs.io/en/latest/user_guides/useful_tools.html#id4) for more information on `tools/analysis_tools/dataset_analysis.py`
```
## 7. Optimize Anchor size
```{Warning}
This step only works for anchor-based models such as YOLOv5;
This step can be skipped for anchor-free models such as YOLOv6 and YOLOX.
```
The `tools/analysis_tools/optimize_anchors.py` script supports three anchor generation methods from YOLO series: `k-means`, `Differential Evolution` and `v5-k-means`.
In this tutorial, we will use YOLOv5 for training, with an input size of `640 x 640`, and `v5-k-means` to optimize anchor:
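A sketch of the anchor optimization command; the `--algorithm`, `--input-shape`, `--prior-match-thr` and `--out-dir` flags are taken from the useful-tools documentation and should be verified against your version:

```shell
python tools/analysis_tools/optimize_anchors.py configs/custom_dataset/yolov5_s-v61_syncbn_fast_1xb32-100e_cat.py \
                                                --algorithm v5-k-means \
                                                --input-shape 640 640 \
                                                --prior-match-thr 4.0 \
                                                --out-dir work_dirs/dataset_analysis_cat
```

The resulting anchors are what we filled into the `anchors` field of the config in step 5.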
Because this command uses the k-means clustering algorithm, there is some randomness related to the initialization, so the anchors obtained by each run will be slightly different; however, they are generated from the dataset you pass in, so this has no adverse effect.
```{SeeAlso}
See [Optimize Anchor Sizes](https://mmyolo.readthedocs.io/en/latest/user_guides/useful_tools.html#id8) for more information on `tools/analysis_tools/optimize_anchors.py`
```
## 8. Visualize the data processing part of the config
The script `tools/analysis_tools/browse_dataset.py` allows you to visualize the data processing part of config directly in the window, with the option to save the visualization to a specific directory.
Let's use the config file we just created, `configs/custom_dataset/yolov5_s-v61_syncbn_fast_1xb32-100e_cat.py`, to visualize the images. Each image will be displayed for `3` seconds, and the images are not saved:
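A sketch of that command; `--show-interval` is assumed to be the flag controlling how long each image stays on screen (check `--help` in your version):

```shell
python tools/analysis_tools/browse_dataset.py configs/custom_dataset/yolov5_s-v61_syncbn_fast_1xb32-100e_cat.py \
                                              --show-interval 3
```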
```{SeeAlso}
See [Visualizing Datasets](https://mmyolo.readthedocs.io/en/latest/user_guides/useful_tools.html#id3) for more information on `tools/analysis_tools/browse_dataset.py`
```
## 9. Train
Here are three points to explain:
1. Training visualization
2. YOLOv5 model training
3. Switching YOLO model training
### 9.1 Training visualization
If you want to use a browser to visualize the training process, MMYOLO currently offers two options: [wandb](https://wandb.ai/site) and [TensorBoard](https://tensorflow.google.cn/tensorboard). Pick one according to your own situation (we'll expand support for more visualization backends in the future).
#### 9.1.1 wandb
To use wandb visualization, you first need to register on the [wandb website](https://wandb.ai/site) and obtain your wandb API key at https://wandb.ai/settings.
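A sketch of the local setup using the standard wandb CLI; after logging in, you would also add a wandb backend to the `visualizer` setting of the config:

```shell
pip install wandb
# paste the API key from https://wandb.ai/settings when prompted
wandb login
```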
#### 9.1.2 TensorBoard

After running the training command, TensorBoard files will be generated in the visualization folder `work_dirs/yolov5_s-v61_syncbn_fast_1xb32-100e_cat/${TIMESTAMP}/vis_data`. We can then view the loss, learning rate and coco/bbox_mAP curves in the browser by running the following command:
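A sketch, assuming TensorBoard is installed (`pip install tensorboard`) and enabled as a visualization backend in the config:

```shell
tensorboard --logdir=work_dirs/yolov5_s-v61_syncbn_fast_1xb32-100e_cat
```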
### 9.2 YOLOv5 model training

Training is launched with `tools/train.py` on the config created in step 5. The following accuracy was obtained with `1 x 3080Ti` and `batch size = 32` after training for `100 epochs`, using the best-precision weights `work_dirs/yolov5_s-v61_syncbn_fast_1xb32-100e_cat/best_coco/bbox_mAP_epoch_98.pth` (see the Appendix for detailed machine information):
```shell
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.968
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 1.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
```

In general finetune best practice, it is recommended to freeze the backbone and scale the learning rate `lr` accordingly. However, in this tutorial we found that this approach falls short to some extent, probably because the `cat` category already exists in the COCO dataset and the `cat` dataset used in this tutorial is relatively small.
The following table shows the test accuracy on the `cat` dataset of the MMYOLO YOLOv5 pre-trained model `yolov5_s-v61_syncbn_fast_8xb16-300e_coco_20220918_084700-86e02187.pth` without finetuning. It can be seen that the mAP of the `cat` category is only `0.866`, which improves to `0.968` after finetuning, a gain of `10.2%`, which proves that the training was very successful:
For details on how to get the accuracy of the pre-trained weights, see the appendix【2. How to test the accuracy of dataset on pre-trained weights】
### 9.3 Switch other models in MMYOLO
MMYOLO integrates multiple YOLO algorithms, which makes switching between YOLO models very easy. There is no need to reacquaint with a new repo. You can easily switch between YOLO models by simply modifying the config file:
```python
max_epochs = 100  # maximum training epochs
data_root = './data/cat/'

# load_from can specify a local path or a URL; setting a URL will download the weights automatically. Since the weights have already been downloaded above, we set a local path here
# since this tutorial is fine-tuning on the cat dataset, we need to use `load_from` to load the pre-trained model from MMYOLO. This allows for faster convergence and higher accuracy

# according to your GPU situation, modify the batch size; YOLOv6-s defaults to 8 cards x 32 bs
train_batch_size_per_gpu = 32
train_num_workers = 4  # recommended: train_num_workers = nGPU x 4

save_epoch_intervals = 2  # save weights every `save_epoch_intervals` epochs

# according to your GPU situation, modify the base_lr; the scaling rule is base_lr_default * (your_bs / default_bs)
base_lr = _base_.base_lr / 8
class_name = ('cat', )  # according to the label information of class_with_id.txt, set the class_name
num_classes = len(class_name)
metainfo = dict(
    classes=class_name,
    palette=[(220, 20, 60)]  # the color of drawing, free to set
)
train_cfg = dict(
    max_epochs=max_epochs,
    val_begin=20,  # number of epochs to start validation. Here 20 is set because the accuracy of the first 20 epochs is not high and the test is not meaningful, so it is skipped
    val_interval=save_epoch_intervals  # the test evaluation is performed iteratively every val_interval round
)

default_hooks = dict(
    # set how many epochs to save the model, and the maximum number of models to keep; `save_best` also saves the best model (recommended)
    checkpoint=dict(
        type='CheckpointHook',
        interval=save_epoch_intervals,
        max_keep_ckpts=5,
        save_best='auto'),
    param_scheduler=dict(max_epochs=max_epochs),
    # logger output interval
    logger=dict(type='LoggerHook', interval=10))
custom_hooks = [
    dict(
        type='EMAHook',
        ema_type='ExpMomentumEMA',
        momentum=0.0001,
        update_buffers=True,
        strict_load=False,
        priority=49),
    dict(
        type='mmdet.PipelineSwitchHook',
        switch_epoch=max_epochs - _base_.num_last_epochs,
        switch_pipeline=_base_.train_pipeline_stage2)
]
```
```{Note}
Similarly, we have placed an identical config file at `projects/misc/custom_dataset/yolov6_s_syncbn_fast_1xb32-100e_cat.py`. You can copy it to `configs/custom_dataset/yolov6_s_syncbn_fast_1xb32-100e_cat.py` and start training directly.
Although the new config looks like a lot of content, much of it is duplicated. You can use a diff tool to see that most of the configuration is identical to `yolov5_s-v61_syncbn_fast_1xb32-100e_cat.py`; because the two configs inherit from different base configs, the common parts still need to be written out.
The above demonstrates how to switch models in MMYOLO. You can quickly compare the accuracy of different models and put the most accurate one into production. In our experiment, the best accuracy of YOLOv6 (`0.9870`) is `1.9%` higher than the best accuracy of YOLOv5 (`0.9680`), so we will use YOLOv6 for the following explanation.
## 10. Inference
Use the best model for inference. The best model path in the following command is `./work_dirs/yolov6_s_syncbn_fast_1xb32-100e_cat/best_coco/bbox_mAP_epoch_96.pth`; please replace it with the path of your own best model.
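A sketch of the inference command, reusing `demo/image_demo.py` from step 2; the output directory name below is arbitrary:

```shell
python demo/image_demo.py ./data/cat/images \
                          ./configs/custom_dataset/yolov6_s_syncbn_fast_1xb32-100e_cat.py \
                          ./work_dirs/yolov6_s_syncbn_fast_1xb32-100e_cat/best_coco/bbox_mAP_epoch_96.pth \
                          --out-dir ./data/cat/pred_images
```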
If the inference result is not ideal, here are two cases:
1. Model underfitting:
First, we need to determine whether the number of training epochs is insufficient, resulting in underfitting. If so, we need to change the `max_epochs` and `work_dir` parameters in the config file, or create a new config file named as above, and start the training again.
2. The dataset needs to be optimized:
If adding epochs still doesn't help, we can enlarge the dataset, and re-examine and refine the annotations before retraining.
## 11. Deployment
MMYOLO provides two deployment options:
1. [MMDeploy](https://github.com/open-mmlab/mmdeploy) framework for deployment
2. Using `projects/easydeploy` for deployment
### 11.1 MMDeploy framework for deployment
Considering the wide variety of deployment machines, a setup that works on a local machine often fails in production. Here, we recommend using Docker, so that the environment only needs to be built once and can be reused, saving the operation and maintenance time of rebuilding the environment for every production deployment.
In this part, we will introduce the following steps:
1. Building a Docker image
2. Creating a Docker container
3. Transforming TensorRT models
4. Deploying model and performing inference
```{SeeAlso}
If you are not familiar with Docker, you can refer to the MMDeploy [source manual installation](https://mmdeploy.readthedocs.io/en/latest/01-how-to-build/build_from_source.html) document and compile it locally instead. Once installed, you can skip to【11.1.3 Transforming TensorRT models】
You can read more about this in the MMDeploy official documentation [Using Docker Images](https://mmdeploy.readthedocs.io/en/latest/01-how-to-build/build_from_docker.html#docker)
```
#### 11.1.3 Transforming TensorRT models
The first step is to install MMYOLO and `pycuda` in a Docker container:
```shell
export MMYOLO_PATH=/root/workspace/mmyolo # path to MMYOLO inside the image, no need to modify
cd ${MMYOLO_PATH}
export MMYOLO_VERSION=$(python -c "import mmyolo.version as v; print(v.__version__)") # check the MMYOLO version used for training
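# --- the remaining steps are a sketch; verify package names and flags against your MMDeploy and MMYOLO versions ---
echo "Using MMYOLO ${MMYOLO_VERSION}"
mim install --no-cache-dir mmyolo==${MMYOLO_VERSION}   # install the same MMYOLO version inside the container
pip install --no-cache-dir pycuda                      # pycuda is required by the TensorRT deployment demo

# convert the trained model to a TensorRT engine with MMDeploy's conversion tool;
# ${DEPLOY_CFG} is a TensorRT deploy config (pick one from the MMDeploy / MMYOLO deploy configs),
# ${MMDEPLOY_DIR} is the MMDeploy directory inside the container, and ${WORK_DIR} is the export directory
python3 ${MMDEPLOY_DIR}/tools/deploy.py \
    ${DEPLOY_CFG} \
    ${MMYOLO_PATH}/configs/custom_dataset/yolov6_s_syncbn_fast_1xb32-100e_cat.py \
    ${MMYOLO_PATH}/work_dirs/yolov6_s_syncbn_fast_1xb32-100e_cat/best_coco/bbox_mAP_epoch_96.pth \
    ${MMYOLO_PATH}/data/cat/images/image1.jpg \
    --work-dir ${WORK_DIR} \
    --device cuda:0 \
    --dump-info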
```

Looking at the export path, you can see the following file structure:
```shell
$WORK_DIR
├── deploy.json
├── detail.json
├── end2end.engine
├── end2end.onnx
└── pipeline.json
```
```{SeeAlso}
For a detailed description of transforming models, see [How to Transform Models](https://mmdeploy.readthedocs.io/en/latest/02-how-to-run/convert_model.html)
```
#### 11.1.4 Deploying model and performing inference
We need to change the `data_root` in `${MMYOLO_PATH}/configs/custom_dataset/yolov6_s_syncbn_fast_1xb32-100e_cat.py` to the path in the Docker container:
```python
data_root = '/root/workspace/mmyolo/data/cat/' # absolute path of the dataset dir in the Docker container.
```

The speed test results are as follows. The average inference speed is `24.10 ms`, which is faster than PyTorch inference while also using much less GPU memory.

Note that the script `deploy_demo.py` does not yet support batch inference, and its pre-processing code still needs improvement, so it cannot fully reflect the achievable inference speed and only demonstrates the inference results. We will optimize it in the future.
After executing it, you can see the inference image results in `--out-dir`:
Because `configs/yolov5/yolov5_s-v61_syncbn_fast_8xb16-300e_coco.py` inherits from `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py`, you can mainly modify the `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` file instead.
Also, change the `category_id` in the `annotations` to the `id` corresponding to COCO, for example, `cat` is `17` in this example. Here are some of the results:
```json
"annotations": [
{
"iscrowd": 0,
"category_id": 17, # This "category_id" is changed to the id corresponding to COCO, for example, cat is 17