# YOLOv5

## Abstract
YOLOv5 is a family of object detection architectures and models pretrained on the COCO dataset. It represents Ultralytics' open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.


## Results and models

### COCO
| Backbone  | Arch | size | Mask Refine | SyncBN | AMP | Mem (GB) | box AP      | TTA box AP | Config | Download     |
| :-------: | :--: | :--: | :---------: | :----: | :-: | :------: | :---------: | :--------: | :----: | :----------: |
| YOLOv5-n  | P5   | 640  | No          | Yes    | Yes | 1.5      | 28.0        | 30.7       | config | model \| log |
| YOLOv5-n  | P5   | 640  | Yes         | Yes    | Yes | 1.5      | 28.0        |            | config | model \| log |
| YOLOv5u-n | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-s  | P5   | 640  | No          | Yes    | Yes | 2.7      | 37.7        | 40.2       | config | model \| log |
| YOLOv5-s  | P5   | 640  | Yes         | Yes    | Yes | 2.7      | 38.0 (+0.3) |            | config | model \| log |
| YOLOv5u-s | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-m  | P5   | 640  | No          | Yes    | Yes | 5.0      | 45.3        | 46.9       | config | model \| log |
| YOLOv5-m  | P5   | 640  | Yes         | Yes    | Yes | 5.0      | 45.3        |            | config | model \| log |
| YOLOv5u-m | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-l  | P5   | 640  | No          | Yes    | Yes | 8.1      | 48.8        | 49.9       | config | model \| log |
| YOLOv5-l  | P5   | 640  | Yes         | Yes    | Yes | 8.1      | 49.3 (+0.5) |            | config | model \| log |
| YOLOv5u-l | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-x  | P5   | 640  | No          | Yes    | Yes | 12.2     | 50.2        |            | config | model \| log |
| YOLOv5-x  | P5   | 640  | Yes         | Yes    | Yes | 12.2     | 50.9 (+0.7) |            | config | model \| log |
| YOLOv5u-x | P5   | 640  | Yes         | Yes    | Yes |          |             |            | config | model \| log |
| YOLOv5-n  | P6   | 1280 | No          | Yes    | Yes | 5.8      | 35.9        |            | config | model \| log |
| YOLOv5-s  | P6   | 1280 | No          | Yes    | Yes | 10.5     | 44.4        |            | config | model \| log |
| YOLOv5-m  | P6   | 1280 | No          | Yes    | Yes | 19.1     | 51.3        |            | config | model \| log |
| YOLOv5-l  | P6   | 1280 | No          | Yes    | Yes | 30.5     | 53.7        |            | config | model \| log |
**Note**:

- `fast` means that `YOLOv5DetDataPreprocessor` and `yolov5_collate` are used for data preprocessing, which is faster for training but less flexible for multitasking. The fast version configs are recommended if you only care about object detection; see the config sketch after this list.
- `detect` means that the network input is fixed to `640x640` and the post-processing thresholds are modified.
- `SyncBN` means SyncBN is used, and `AMP` indicates training with mixed precision.
- We use 8x A100 GPUs for training with a single-GPU batch size of 16. This is different from the official code.
- The performance is unstable and may fluctuate by about 0.4 mAP, and the highest-performing weight in `COCO` training of `YOLOv5` may not come from the last epoch.
- `TTA` means Test Time Augmentation: 3 multi-scale transformations are applied to the image, each followed by 2 flip transformations (flipped and not flipped). You only need to specify `--tta` when testing to enable it. See TTA for details.
- The performance of `Mask Refine` training is compared against the weights officially released by YOLOv5. `Mask Refine` means refining the bboxes by masks while loading annotations and transforming after `YOLOv5RandomAffine`; `Copy Paste` means using `YOLOv5CopyPaste`.
- `YOLOv5u` models use the same loss functions and split Detect head as `YOLOv8` models for improved performance, but require only 300 epochs.
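
As a rough illustration of the `fast` data path, the sketch below shows the two pieces a fast config wires together, following the usual mmengine config conventions; the field values are illustrative, not copied from a released config.

```python
# A minimal sketch of what the `fast` configs change, assuming the usual
# mmyolo/mmengine config conventions; values here are illustrative.
model = dict(
    data_preprocessor=dict(
        # Normalization and batch preparation happen on the GPU inside
        # the preprocessor instead of in the CPU-side data pipeline.
        type='YOLOv5DetDataPreprocessor',
        mean=[0.0, 0.0, 0.0],
        std=[255.0, 255.0, 255.0],
        bgr_to_rgb=True))

train_dataloader = dict(
    # yolov5_collate stacks images and concatenates instance labels into
    # the batched layout the preprocessor above expects.
    collate_fn=dict(type='yolov5_collate'))
```

This division of labor is what makes the fast configs quicker to train but harder to reuse for tasks beyond plain detection.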
### COCO Instance segmentation
| Backbone               | Arch | size | SyncBN | AMP | Mem (GB) | Box AP | Mask AP | Config | Download     |
| :--------------------: | :--: | :--: | :----: | :-: | :------: | :----: | :-----: | :----: | :----------: |
| YOLOv5-n               | P5   | 640  | Yes    | Yes | 3.3      | 27.9   | 23.7    | config | model \| log |
| YOLOv5-s               | P5   | 640  | Yes    | Yes | 4.8      | 38.1   | 32.0    | config | model \| log |
| YOLOv5-s (non-overlap) | P5   | 640  | Yes    | Yes | 4.8      | 38.0   | 32.1    | config | model \| log |
| YOLOv5-m               | P5   | 640  | Yes    | Yes | 7.3      | 45.1   | 37.3    | config | model \| log |
**Note**:

- `Non-overlap` refers to instance-level masks stored in the format `(num_instances, h, w)` instead of `(h, w)`. Storing masks in the overlap format consumes less RAM and GPU memory; see the NumPy sketch below.
- We found that the mAP of the N/S/M models is higher than the official version, while the L/X models are lower than the official version. We will resolve this issue as soon as possible.
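
For intuition, here is a small NumPy sketch of the two layouts; the array names are ours, not mmyolo identifiers.

```python
import numpy as np

# Toy example contrasting the two mask storage layouts described above.
num_instances, h, w = 3, 8, 8

# Non-overlap format: one boolean mask per instance, shape (num_instances, h, w).
non_overlap = np.zeros((num_instances, h, w), dtype=bool)
non_overlap[0, 0:4, 0:4] = True
non_overlap[1, 2:6, 2:6] = True
non_overlap[2, 5:8, 5:8] = True

# Overlap format: a single (h, w) map where each pixel stores the 1-based id
# of the instance that owns it (0 = background). Later instances overwrite
# earlier ones where they overlap, which is why this layout is compact.
overlap = np.zeros((h, w), dtype=np.uint8)
for idx in range(num_instances):
    overlap[non_overlap[idx]] = idx + 1

# Recovering per-instance masks (lossy wherever instances overlapped).
recovered = np.stack([overlap == idx + 1 for idx in range(num_instances)])
print(non_overlap.shape, overlap.shape, recovered.shape)
# (3, 8, 8) (8, 8) (3, 8, 8)
```

The single-map layout is smaller, but a pixel covered by several instances can only remember one owner, hence the separate non-overlap configs.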
### VOC
| Backbone | size | Batch size | AMP | Mem (GB) | box AP (COCO metric) | Config | Download     |
| :------: | :--: | :--------: | :-: | :------: | :------------------: | :----: | :----------: |
| YOLOv5-n | 512  | 64         | Yes | 3.5      | 51.2                 | config | model \| log |
| YOLOv5-s | 512  | 64         | Yes | 6.5      | 62.7                 | config | model \| log |
| YOLOv5-m | 512  | 64         | Yes | 12.0     | 70.1                 | config | model \| log |
| YOLOv5-l | 512  | 32         | Yes | 10.0     | 73.1                 | config | model \| log |
**Note**:

- Training on the VOC dataset requires a model pretrained on COCO; see the sketch after this list.
- The performance is unstable and may fluctuate by about 0.4 mAP.
- The official YOLOv5 uses the COCO metric when training on the VOC dataset.
- We converted the VOC test set to COCO format offline and reproduced the mAP results shown above. Support for evaluating with the COCO metric while training on VOC will be added in a later version.
- Hyperparameters are referenced from https://wandb.ai/glenn-jocher/YOLOv5_VOC_official.
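
Pointing a VOC config at a COCO-pretrained checkpoint goes through mmengine's standard `load_from` field; the path below is a hypothetical placeholder, not a released weight URL.

```python
# Load COCO-pretrained weights before fine-tuning on VOC via mmengine's
# standard `load_from` field. The path is a hypothetical placeholder.
load_from = 'checkpoints/yolov5_s-v61_syncbn_fast_8xb16-300e_coco.pth'  # placeholder
```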
### CrowdHuman

Since the `iscrowd` annotation of the COCO dataset is not equivalent to `ignore`, we use the CrowdHuman dataset to verify that the YOLOv5 ignore logic is correct.
| Backbone | size | SyncBN | AMP | Mem (GB) | ignore_iof_thr | box AP50 (CrowdHuman metric) | MR   | JI    | Config | Download |
| :------: | :--: | :----: | :-: | :------: | :------------: | :--------------------------: | :--: | :---: | :----: | :------: |
| YOLOv5-s | 640  | Yes    | Yes | 2.6      | -1             | 85.79                        | 48.7 | 75.33 | config |          |
| YOLOv5-s | 640  | Yes    | Yes | 2.6      | 0.5            | 86.17                        | 48.8 | 75.87 | config |          |
**Note**:

- An `ignore_iof_thr` of -1 indicates that the ignore tag is not considered. We tried `ignore_iof_thr` thresholds of 0.5, 0.8, and 0.9, and the results show that 0.5 performs best; a config sketch follows this list.
- The table above shows the checkpoint that performed best on the validation set. The best-performing checkpoints appear at around epoch 160+, which means there is no need to train for the full schedule.
- This is a very simple implementation that only replaces the COCO anchors with ones computed by the `tools/analysis_tools/optimize_anchors.py` script. We will adjust other parameters later to improve performance.
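
A hedged sketch of how the threshold might be switched in a config override: we assume here that `ignore_iof_thr` is exposed on the bbox head and that the base config name matches the released one, so verify both against the actual CrowdHuman configs before use.

```python
# Hypothetical override changing the ignore threshold. Both the base
# config name and the placement of `ignore_iof_thr` on the bbox head are
# assumptions to verify against the released CrowdHuman configs.
_base_ = './yolov5_s-v61_fast_8xb16-300e_crowdhuman.py'

model = dict(
    bbox_head=dict(
        ignore_iof_thr=0.5))  # -1 disables the ignore logic entirely
```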
## Citation
```latex
@software{glenn_jocher_2022_7002879,
author = {Glenn Jocher and
Ayush Chaurasia and
Alex Stoken and
Jirka Borovec and
NanoCode012 and
Yonghye Kwon and
TaoXie and
Kalen Michael and
Jiacong Fang and
imyhxy and
Lorna and
Colin Wong and
曾逸夫(Zeng Yifu) and
Abhiram V and
Diego Montes and
Zhiqiang Wang and
Cristi Fati and
Jebastin Nadar and
Laughing and
UnglvKitDe and
tkianai and
yxNONG and
Piotr Skalski and
Adam Hogan and
Max Strobel and
Mrinal Jain and
Lorenzo Mammana and
xylieong},
title = {{ultralytics/yolov5: v6.2 - YOLOv5 Classification
Models, Apple M1, Reproducibility, ClearML and
Deci.ai integrations}},
month = aug,
year = 2022,
publisher = {Zenodo},
version = {v6.2},
doi = {10.5281/zenodo.7002879},
url = {https://doi.org/10.5281/zenodo.7002879}
}
```