# YOLOv5

## Abstract
YOLOv5 is a family of object detection architectures and models pretrained on the COCO dataset. It represents Ultralytics' open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.


## Results and models

### COCO
| Backbone  | Arch | size | Mask Refine | SyncBN | AMP | Mem (GB) | box AP      | TTA box AP | Config | Download     |
| :-------: | :--: | :--: | :---------: | :----: | :-: | :------: | :---------: | :--------: | :----: | :----------: |
| YOLOv5-n  | P5   | 640  | No          | Yes    | Yes | 1.5      | 28.0        | 30.7       | config | model \| log |
| YOLOv5-n  | P5   | 640  | Yes         | Yes    | Yes | 1.5      | 28.0        |            | config | model \| log |
| YOLOv5u-n | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-s  | P5   | 640  | No          | Yes    | Yes | 2.7      | 37.7        | 40.2       | config | model \| log |
| YOLOv5-s  | P5   | 640  | Yes         | Yes    | Yes | 2.7      | 38.0 (+0.3) |            | config | model \| log |
| YOLOv5u-s | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-m  | P5   | 640  | No          | Yes    | Yes | 5.0      | 45.3        | 46.9       | config | model \| log |
| YOLOv5-m  | P5   | 640  | Yes         | Yes    | Yes | 5.0      | 45.3        |            | config | model \| log |
| YOLOv5u-m | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-l  | P5   | 640  | No          | Yes    | Yes | 8.1      | 48.8        | 49.9       | config | model \| log |
| YOLOv5-l  | P5   | 640  | Yes         | Yes    | Yes | 8.1      | 49.3 (+0.5) |            | config | model \| log |
| YOLOv5u-l | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-x  | P5   | 640  | No          | Yes    | Yes | 12.2     | 50.2        |            | config | model \| log |
| YOLOv5-x  | P5   | 640  | Yes         | Yes    | Yes | 12.2     | 50.9 (+0.7) |            | config | model \| log |
| YOLOv5u-x | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-n  | P6   | 1280 | No          | Yes    | Yes | 5.8      | 35.9        |            | config | model \| log |
| YOLOv5-s  | P6   | 1280 | No          | Yes    | Yes | 10.5     | 44.4        |            | config | model \| log |
| YOLOv5-m  | P6   | 1280 | No          | Yes    | Yes | 19.1     | 51.3        |            | config | model \| log |
| YOLOv5-l  | P6   | 1280 | No          | Yes    | Yes | 30.5     | 53.7        |            | config | model \| log |
**Note**:

- `fast` means that `YOLOv5DetDataPreprocessor` and `yolov5_collate` are used for data preprocessing, which is faster for training but less flexible for multitasking. The fast version configs are recommended if you only care about object detection; see the config sketch after this list.
- `detect` means that the network input is fixed to `640x640` and the post-processing thresholds are modified.
- `SyncBN` means SyncBN is used, and `AMP` indicates training with mixed precision.
- We use 8x A100 GPUs for training with a single-GPU batch size of 16. This is different from the official code.
- The performance is unstable and may fluctuate by about 0.4 mAP, and the highest-performing weight in `COCO` training of `YOLOv5` may not come from the last epoch.
- `TTA` means Test Time Augmentation: 3 multi-scale transformations are applied to the image, each followed by 2 flip transformations (flipped and not flipped). You only need to specify `--tta` when testing to enable it. See TTA for details.
- The performance of `Mask Refine` training is compared against the weights officially released by YOLOv5. `Mask Refine` means refining the bboxes by masks while loading annotations and transforming after `YOLOv5RandomAffine`; `Copy Paste` means using `YOLOv5CopyPaste`.
- `YOLOv5u` models use the same loss functions and split Detect head as `YOLOv8` models for improved performance, but require only 300 epochs.
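
As a rough illustration of the `fast` data path, the sketch below shows the two pieces a fast config wires together, following the usual mmengine config conventions; the field values are illustrative, not copied from a released config.

```python
# A minimal sketch of what the `fast` configs change, assuming the usual
# mmyolo/mmengine config conventions; values here are illustrative.
model = dict(
    data_preprocessor=dict(
        # Normalization and batch preparation happen on the GPU inside
        # the preprocessor instead of in the CPU-side data pipeline.
        type='YOLOv5DetDataPreprocessor',
        mean=[0.0, 0.0, 0.0],
        std=[255.0, 255.0, 255.0],
        bgr_to_rgb=True))

train_dataloader = dict(
    # yolov5_collate stacks images and concatenates instance labels into
    # the batched layout the preprocessor above expects.
    collate_fn=dict(type='yolov5_collate'))
```

This division of labor is what makes the fast configs quicker to train but harder to reuse for tasks beyond plain detection.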
### COCO Instance segmentation
| Backbone               | Arch | size | SyncBN | AMP | Mem (GB) | Box AP | Mask AP | Config | Download     |
| :--------------------: | :--: | :--: | :----: | :-: | :------: | :----: | :-----: | :----: | :----------: |
| YOLOv5-n               | P5   | 640  | Yes    | Yes | 3.3      | 27.9   | 23.7    | config | model \| log |
| YOLOv5-s               | P5   | 640  | Yes    | Yes | 4.8      | 38.1   | 32.0    | config | model \| log |
| YOLOv5-s (non-overlap) | P5   | 640  | Yes    | Yes | 4.8      | 38.0   | 32.1    | config | model \| log |
| YOLOv5-m               | P5   | 640  | Yes    | Yes | 7.3      | 45.1   | 37.3    | config | model \| log |
**Note**:

- `Non-overlap` refers to instance-level masks stored in the format `(num_instances, h, w)` instead of `(h, w)`. Storing masks in the overlap format consumes less RAM and GPU memory; see the NumPy sketch below.
- We found that the mAP of the N/S/M models is higher than the official version, while the L/X models are lower than the official version. We will resolve this issue as soon as possible.
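
For intuition, here is a small NumPy sketch of the two layouts; the array names are ours, not mmyolo identifiers.

```python
import numpy as np

# Toy example contrasting the two mask storage layouts described above.
num_instances, h, w = 3, 8, 8

# Non-overlap format: one boolean mask per instance, shape (num_instances, h, w).
non_overlap = np.zeros((num_instances, h, w), dtype=bool)
non_overlap[0, 0:4, 0:4] = True
non_overlap[1, 2:6, 2:6] = True
non_overlap[2, 5:8, 5:8] = True

# Overlap format: a single (h, w) map where each pixel stores the 1-based id
# of the instance that owns it (0 = background). Later instances overwrite
# earlier ones where they overlap, which is why this layout is compact.
overlap = np.zeros((h, w), dtype=np.uint8)
for idx in range(num_instances):
    overlap[non_overlap[idx]] = idx + 1

# Recovering per-instance masks (lossy wherever instances overlapped).
recovered = np.stack([overlap == idx + 1 for idx in range(num_instances)])
print(non_overlap.shape, overlap.shape, recovered.shape)
# (3, 8, 8) (8, 8) (3, 8, 8)
```

The single-map layout is smaller, but a pixel covered by several instances can only remember one owner, hence the separate non-overlap configs.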
### VOC
| Backbone | size | Batch size | AMP | Mem (GB) | box AP (COCO metric) | Config | Download     |
| :------: | :--: | :--------: | :-: | :------: | :------------------: | :----: | :----------: |
| YOLOv5-n | 512  | 64         | Yes | 3.5      | 51.2                 | config | model \| log |
| YOLOv5-s | 512  | 64         | Yes | 6.5      | 62.7                 | config | model \| log |
| YOLOv5-m | 512  | 64         | Yes | 12.0     | 70.1                 | config | model \| log |
| YOLOv5-l | 512  | 32         | Yes | 10.0     | 73.1                 | config | model \| log |
**Note**:

- Training on the VOC dataset requires a model pretrained on COCO; see the sketch after this list.
- The performance is unstable and may fluctuate by about 0.4 mAP.
- The official YOLOv5 uses the COCO metric when training on the VOC dataset.
- We converted the VOC test set to COCO format offline and reproduced the mAP results shown above. Support for evaluating with the COCO metric while training on VOC will be added in a later version.
- Hyperparameters are referenced from https://wandb.ai/glenn-jocher/YOLOv5_VOC_official.
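
Pointing a VOC config at a COCO-pretrained checkpoint goes through mmengine's standard `load_from` field; the path below is a hypothetical placeholder, not a released weight URL.

```python
# Load COCO-pretrained weights before fine-tuning on VOC via mmengine's
# standard `load_from` field. The path is a hypothetical placeholder.
load_from = 'checkpoints/yolov5_s-v61_syncbn_fast_8xb16-300e_coco.pth'  # placeholder
```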
### CrowdHuman

Since the `iscrowd` annotation of the COCO dataset is not equivalent to `ignore`, we use the CrowdHuman dataset to verify that the YOLOv5 ignore logic is correct.
| Backbone | size | SyncBN | AMP | Mem (GB) | ignore_iof_thr | box AP50 (CrowdHuman metric) | MR   | JI    | Config | Download |
| :------: | :--: | :----: | :-: | :------: | :------------: | :--------------------------: | :--: | :---: | :----: | :------: |
| YOLOv5-s | 640  | Yes    | Yes | 2.6      | -1             | 85.79                        | 48.7 | 75.33 | config |          |
| YOLOv5-s | 640  | Yes    | Yes | 2.6      | 0.5            | 86.17                        | 48.8 | 75.87 | config |          |
**Note**:

- An `ignore_iof_thr` of -1 indicates that the ignore tag is not considered. We tried `ignore_iof_thr` thresholds of 0.5, 0.8, and 0.9, and the results show that 0.5 performs best; a config sketch follows this list.
- The table above shows the checkpoint that performed best on the validation set. The best-performing checkpoints appear at around epoch 160+, which means there is no need to train for the full schedule.
- This is a very simple implementation that only replaces the COCO anchors with ones computed by the `tools/analysis_tools/optimize_anchors.py` script. We will adjust other parameters later to improve performance.
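
A hedged sketch of how the threshold might be switched in a config override: we assume here that `ignore_iof_thr` is exposed on the bbox head and that the base config name matches the released one, so verify both against the actual CrowdHuman configs before use.

```python
# Hypothetical override changing the ignore threshold. Both the base
# config name and the placement of `ignore_iof_thr` on the bbox head are
# assumptions to verify against the released CrowdHuman configs.
_base_ = './yolov5_s-v61_fast_8xb16-300e_crowdhuman.py'

model = dict(
    bbox_head=dict(
        ignore_iof_thr=0.5))  # -1 disables the ignore logic entirely
```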
## Citation
```latex
@software{glenn_jocher_2022_7002879,
author = {Glenn Jocher and
Ayush Chaurasia and
Alex Stoken and
Jirka Borovec and
NanoCode012 and
Yonghye Kwon and
TaoXie and
Kalen Michael and
Jiacong Fang and
imyhxy and
Lorna and
Colin Wong and
曾逸夫(Zeng Yifu) and
Abhiram V and
Diego Montes and
Zhiqiang Wang and
Cristi Fati and
Jebastin Nadar and
Laughing and
UnglvKitDe and
tkianai and
yxNONG and
Piotr Skalski and
Adam Hogan and
Max Strobel and
Mrinal Jain and
Lorenzo Mammana and
xylieong},
title = {{ultralytics/yolov5: v6.2 - YOLOv5 Classification
Models, Apple M1, Reproducibility, ClearML and
Deci.ai integrations}},
month = aug,
year = 2022,
publisher = {Zenodo},
version = {v6.2},
doi = {10.5281/zenodo.7002879},
url = {https://doi.org/10.5281/zenodo.7002879}
}
```