Note: Since this repository uses OpenMMLab 2.0, please create a new conda virtual environment to prevent conflicts with your existing OpenMMLab 1.0 repositories and projects.
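A minimal sketch of creating such an environment (the environment name and Python version here are assumptions; see [get_started](../get_started.md) for the officially supported versions):

```shell
conda create -n mmyolo python=3.8 -y
conda activate mmyolo
```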
# "-e" means install the project in editable mode, so any local modifications made to the code will take effect, eliminating the need to reinstall.
```
For more detailed information about environment configuration, please refer to [get_started](../get_started.md).
## Dataset Preparation
In this tutorial, we provide the balloon dataset, which is less than 40 MB, as the training dataset for MMYOLO.
```shell
python tools/misc/download_dataset.py --dataset-name balloon --save-dir data --unzip
python tools/dataset_converters/balloon2coco.py
```
After executing the above commands, the balloon dataset will be downloaded to the `data` folder and converted into the format we need. The `train.json` and `val.json` files are the annotation files in COCO format.
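To sanity-check the conversion, you can inspect the annotation file with a few lines of Python (a minimal sketch; the `data/balloon/train.json` path assumes the default output location of the commands above):

```python
import json

# Load the converted COCO-format annotation file.
with open('data/balloon/train.json') as f:
    coco = json.load(f)

# A COCO annotation file has three top-level lists:
# images, annotations, and categories.
print(len(coco['images']), 'images')
print(len(coco['annotations']), 'annotations')
print([c['name'] for c in coco['categories']])  # expected: ['balloon']
```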
Create a new configuration file called `yolov5_s-v61_syncbn_fast_1xb4-300e_balloon.py` in the `configs/yolov5` folder, and copy the following content into it:

```python
_base_ = './yolov5_s-v61_syncbn_fast_8xb16-300e_coco.py'

model = dict(bbox_head=dict(head_module=dict(num_classes=1)))

default_hooks = dict(logger=dict(interval=1))
```
The above configuration inherits from `./yolov5_s-v61_syncbn_fast_8xb16-300e_coco.py`, and `data_root`, `metainfo`, `train_dataloader`, `val_dataloader`, `num_classes`, and other settings are updated for the balloon dataset we are using, as sketched below.
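A rough sketch of those dataset-related overrides (illustrative, not the verbatim config; the batch size and worker count are assumptions matching the `1xb4` naming, i.e. 1 GPU with a batch size of 4):

```python
data_root = 'data/balloon/'

# The balloon dataset has a single class.
metainfo = {
    'classes': ('balloon', ),
    'palette': [(220, 20, 60)],
}

train_dataloader = dict(
    batch_size=4,  # "1xb4": 1 GPU, 4 images per batch
    num_workers=2,
    dataset=dict(
        data_root=data_root,
        metainfo=metainfo,
        data_prefix=dict(img='train/'),
        ann_file='train.json'))

val_dataloader = dict(
    dataset=dict(
        data_root=data_root,
        metainfo=metainfo,
        data_prefix=dict(img='val/'),
        ann_file='val.json'))

test_dataloader = val_dataloader

val_evaluator = dict(ann_file=data_root + 'val.json')
test_evaluator = val_evaluator
```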
We set the `interval` of the logger to 1 because the balloon dataset we chose is relatively small; if the `interval` were too large, we would never see the loss-related log output. Setting the logger `interval` to 1 ensures that a loss-related log is printed every iteration.
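Training can then be launched with MMYOLO's standard training script (a sketch; `tools/train.py` is the usual single-GPU entry point):

```shell
python tools/train.py configs/yolov5/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon.py
```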
After executing the training command above, the `work_dirs/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon` folder will be created automatically. Both the weights and the training configuration file will be saved in this folder.
If training is interrupted midway, add `--resume` to the end of the training command, and the program will automatically load the latest weight file from `work_dirs` to resume training.
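For example:

```shell
python tools/train.py configs/yolov5/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon.py --resume
```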
NOTICE: It is highly recommended to finetune from models pretrained on large datasets such as COCO, as this can significantly boost overall network performance.
In this example, finetuning the pretrained model outperforms training from scratch by a significant margin (over 30 mAP).
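In MMEngine-style configs, finetuning is enabled by pointing `load_from` at a pretrained checkpoint (the path below is a placeholder; substitute the actual COCO-pretrained YOLOv5-s weights from the model zoo):

```python
# Placeholder path: replace with the downloaded COCO-pretrained checkpoint.
load_from = 'checkpoints/yolov5_s-v61_syncbn_fast_8xb16-300e_coco.pth'
```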
For the `visualization` entry of `default_hooks` in `configs/yolov5/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon.py`, we set `draw` to `True` and `interval` to `2`:
```python
default_hooks = dict(
    logger=dict(interval=1),
    visualization=dict(draw=True, interval=2),
)
```
Re-run the training command. During validation, every `interval`-th image will have a composite of its annotation and prediction results saved to the `work_dirs/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon/{timestamp}/vis_data/vis_image` folder.
Re-run the training command and check data visualization results such as loss, learning rate, and coco/bbox_mAP via the web link printed on the command line.
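The web link mentioned above comes from an experiment-tracking visualizer backend. A minimal sketch, assuming WandB is installed and MMEngine's `WandbVisBackend` is used (swap in `TensorboardVisBackend` for the Tensorboard flow described next), appended to the config file:

```python
visualizer = dict(vis_backends=[
    dict(type='LocalVisBackend'),
    dict(type='WandbVisBackend'),
])
```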
Re-run the training command; a Tensorboard folder will be created in `work_dirs/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon/{timestamp}/vis_data`. You can view data visualization results such as loss, learning rate, and coco/bbox_mAP in the web link printed on the command line after launching Tensorboard with the following command:
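A sketch of that command, assuming Tensorboard is installed:

```shell
tensorboard --logdir=work_dirs/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon
```

Finally, the trained model can be evaluated and its predictions visualized with the standard test script (a sketch; the checkpoint filename is a placeholder for a weight file saved in `work_dirs`):

```shell
python tools/test.py configs/yolov5/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon.py \
    work_dirs/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon/epoch_300.pth \
    --show-dir show_results
```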
After running the above command, the inference result images will be automatically saved to the `work_dirs/yolov5_s-v61_syncbn_fast_1xb4-300e_balloon/{timestamp}/show_results` folder. The following is one of the result images: the left side is the ground-truth annotation, and the right side is the model inference result.