148 Commits

Author SHA1 Message Date
Glenn Jocher
e899d6e8fb
Fix for corrupt JPEGs auto-fix PR (#4560)
The auto-fix corrupt JPEGs PR (#4548) introduced a bug whereby the f.seek() operation read all of the bytes in the image, so the PIL image had nothing left to read on the .save() operation.

The fix is to re-open the image with PIL before saving.
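The failure mode can be illustrated without PIL. This is a minimal stdlib sketch, assuming the corruption check inspected the trailing two bytes of an open file object; `tail_is_eoi` and the stand-in payload are hypothetical and not taken from the PR:

```python
import io

JPEG_EOI = b"\xff\xd9"  # JPEG end-of-image marker


def tail_is_eoi(f):
    """Return True if the stream ends with the JPEG end-of-image marker."""
    f.seek(-2, io.SEEK_END)      # jump to the last two bytes
    return f.read() == JPEG_EOI  # reading leaves the position at end of stream


# Stand-in bytes with JPEG start/end markers; not a decodable image.
payload = b"\xff\xd8" + b"\x00" * 16 + JPEG_EOI
f = io.BytesIO(payload)

ok = tail_is_eoi(f)
leftover = f.read()   # nothing left to read: the position is at the end
f.seek(0)             # rewinding (or re-opening) makes the data readable again
recovered = f.read()
```

Any consumer that reuses the same handle after the check sees an exhausted stream, which is why re-opening before saving avoids the problem.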
2021-08-27 13:01:21 +02:00
Glenn Jocher
11f85e7e71
Auto-fix corrupt JPEGs (#4548)
* Autofix corrupt JPEGs

This PR automatically re-saves corrupt JPEGs and trains with the resaved images. WARNING: this will overwrite the existing corrupt JPEGs in a dataset and replace them with correct JPEGs, though the filesize may increase and the image contents may not be exactly the same due to lossy JPEG compression schemes. Results may vary by JPEG decoder and hardware.

Current behavior is to exclude corrupt JPEGs from training with a warning to the user, but many users have been complaining about large parts of their dataset being excluded from training.

* Clarify re-save reason
2021-08-26 15:51:04 +02:00
Huu Quan, CAP
1d65e8194d
Fix missing labels after albumentations (#4455)
* fix missing labels after augmentation

* Update datasets.py

Cleanup

Co-authored-by: Huu Quan <huuquan@HuuQuans-MacBook.local>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-08-18 12:07:09 +02:00
Jiacong Fang
808bcad3bb
Add TensorFlow and TFLite export (#1127)
* Add models/tf.py for TensorFlow and TFLite export

* Set auto=False for int8 calibration

* Update requirements.txt for TensorFlow and TFLite export

* Read anchors directly from PyTorch weights

* Add --tf-nms to append NMS in TensorFlow SavedModel and GraphDef export

* Remove check_anchor_order, check_file, set_logging from import

* Reformat code and optimize imports

* Autodownload model and check cfg

* update --source path, img-size to 320, single output

* Adjust representative_dataset

* Put representative dataset in tfl_int8 block

* detect.py TF inference

* weights to string

* weights to string

* cleanup tf.py

* Add --dynamic-batch-size

* Add xywh normalization to reduce calibration error

* Update requirements.txt

TensorFlow 2.3.1 -> 2.4.0 to avoid int8 quantization error

* Fix imports

Move C3 from models.experimental to models.common

* implement C3() and SiLU()

* Fix reshape dim to support dynamic batching

* Add epsilon argument in tf_BN, which is different between TF and PT

* Set stride to None if not using PyTorch, and do not warmup without PyTorch

* Add list support in check_img_size()

* Add list input support in detect.py

* sys.path.append('./') to run from yolov5/

* Add int8 quantization support for TensorFlow 2.5

* Add get_coco128.sh

* Remove --no-tfl-detect in models/tf.py (Use tf-android-tfl-detect branch for EdgeTPU)

* Update requirements.txt

* Replace torch.load() with attempt_load()

* Update requirements.txt

* Add --tf-raw-resize to set half_pixel_centers=False

* Add --agnostic-nms for TF class-agnostic NMS

* Cleanup after merge

* Cleanup2 after merge

* Cleanup3 after merge

* Add tf.py docstring with credit and usage

* pb saved_model and tflite use only one model in detect.py

* Add use cases in docstring of tf.py

* Remove redundant `stride` definition

* Remove keras direct import

* Fix `check_requirements(('tensorflow>=2.4.1',))`

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-08-17 13:18:16 +02:00
Glenn Jocher
24bea5e4b7
Standardize headers and docstrings (#4417)
* Implement new headers

* Reformat 1

* Reformat 2

* Reformat 3 - math

* Reformat 4 - yaml
2021-08-14 21:17:51 +02:00
Glenn Jocher
63e09fdc48
Remove encoding='ascii' (#4413)
* Remove `encoding='ascii'`

* Reinstate `encoding='ascii'` in emojis()
2021-08-14 13:47:20 +02:00
junji hashimoto
2d99063201
Feature python train.py --cache disk (#4049)
* Add cache-on-disk and cache-directory to cache images on disk

* Fix load_image with cache_on_disk

* Add no_cache flag for load_image

* Revert the parts ('logging' and a new line) that do not need to be modified

* Add the assertion for shapes of cached images

* Add a suffix string for cached images

* Fix boundary-error of letterbox for load_mosaic

* Add prefix as cache-key of cache-on-disk

* Update cache-function on disk

* Add psutil in requirements.txt

* Update train.py

* Cleanup1

* Cleanup2

* Skip existing npy

* Include re-space

* Export return character fix

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-08-02 18:47:24 +02:00
Glenn Jocher
5d66e48723
Train from --data path/to/dataset.zip feature (#4185)
* Train from `--data path/to/dataset.zip` feature

* Update dataset_stats()

* cleanup

* cleanup2
2021-07-28 02:04:10 +02:00
Glenn Jocher
0ad6301c96
Update script headers (#4163)
* Update download script headers

* cleanup

* bug fix attempt

* bug fix attempt2

* bug fix attempt3

* cleanup
2021-07-26 15:23:33 +02:00
Glenn Jocher
f7d8562060
val.py refactor (#4053)
* val.py refactor

* cleanup

* cleanup

* cleanup

* cleanup

* save after eval

* opt.imgsz bug fix

* wandb refactor

* dataloader to train_loader

* capitalize global variables

* runs/hub/exp to runs/detect/exp

* refactor wandb logging

* Refactor wandb operations (#4061)

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2021-07-19 10:43:01 +02:00
Glenn Jocher
b3dabdcc38
Update probability to p (#3980) 2021-07-12 15:54:43 +02:00
Glenn Jocher
80299a57e2
Numerical stability fix for Albumentations (#3958) 2021-07-10 19:50:53 +02:00
Glenn Jocher
443af8b25a
Cache v0.4 update (#3954) 2021-07-10 14:18:46 +02:00
Glenn Jocher
8c6f9e15bf
Update dataset_stats() for zipped datasets (#3926)
* Update `dataset_stats()` for zipped datasets

@KalenMike

* cleanup
2021-07-08 11:42:30 +02:00
Glenn Jocher
33202b7f0b
YOLOv5 + Albumentations integration (#3882)
* Albumentations integration

* ToGray p=0.01

* print confirmation

* create instance in dataloader init method

* improved version handling

* transform not defined fix

* assert string update

* create check_version()

* add spaces

* update class comment
2021-07-05 18:01:54 +02:00
Glenn Jocher
3c3f8fbd5d
Improved BGR2RGB speeds (#3880)
* Update BGR2RGB ops

* speed improvements

* cleanup
2021-07-04 20:12:32 +02:00
Glenn Jocher
9e8fb9fd0b
Create utils/augmentations.py (#3877)
* Create `utils/augmentations.py`

* cleanup
2021-07-04 18:14:04 +02:00
ketan-b
9d86b54eb3
Add multi-stream saving feature (#3864)
* Added the recording feature for multiple streams

Thanks for the very cool repo!!
I was trying to record multiple feeds at the same time, but the current version of the detector had only one video writer and one vid_path.
As a result the streams were not saved: each output was initialized with a single frame and the rest was never recorded.

Fix:
I made `vid_writer` and `vid_path` lists, and the index `i` from the loop over `pred` selects the writer for each stream.

I hope this helps, Thanks!

* Cleanup list lengths

* batch size variable

* Update datasets.py
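The per-stream bookkeeping described above can be sketched as follows. `DummyWriter` is a hypothetical stand-in for a video writer so the example runs without OpenCV, and `save_streams` is an illustrative name, not the function in detect.py:

```python
class DummyWriter:
    """Hypothetical stand-in for a video writer; keeps frames in memory."""

    def __init__(self, path):
        self.path = path
        self.frames = []

    def write(self, frame):
        self.frames.append(frame)

    def release(self):
        pass


def save_streams(batches, save_paths):
    """Write frame i of every batch to the writer owned by stream i."""
    n = len(save_paths)
    vid_path = [None] * n     # one output path per stream
    vid_writer = [None] * n   # one writer per stream
    for batch in batches:
        for i, frame in enumerate(batch):
            if vid_path[i] != save_paths[i]:  # first frame for this output
                if vid_writer[i] is not None:
                    vid_writer[i].release()
                vid_path[i] = save_paths[i]
                vid_writer[i] = DummyWriter(save_paths[i])
            vid_writer[i].write(frame)
    return vid_writer


writers = save_streams([["a0", "b0"], ["a1", "b1"]], ["cam0.mp4", "cam1.mp4"])
```

With a single shared writer and path, the second stream would reset the writer on every frame, which matches the one-frame outputs described in the report.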

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-07-04 12:55:57 +02:00
Valentin Aliferov
831773f5a2
Add EXIF rotation to YOLOv5 Hub inference (#3852)
* rotating an image according to its exif tag

* Update common.py

* Update datasets.py

* Update datasets.py

faster

* delete extraneous gpg file

* Update common.py
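As an illustration of rotating by EXIF tag, here is a pure-Python sketch on a nested list of pixels. It assumes only the three rotation-only orientation values (3, 6, 8) are handled; the mirrored values are left out for brevity, and the function names are hypothetical rather than those used in the PR:

```python
def rotate_cw(m):
    """Rotate a row-major pixel grid 90 degrees clockwise."""
    return [list(r) for r in zip(*m[::-1])]


def rotate_ccw(m):
    """Rotate a row-major pixel grid 90 degrees counter-clockwise."""
    return [list(r) for r in zip(*m)][::-1]


def rotate_180(m):
    """Rotate a row-major pixel grid 180 degrees."""
    return [row[::-1] for row in m[::-1]]


# EXIF Orientation values that are pure rotations.
EXIF_ROTATIONS = {3: rotate_180, 6: rotate_cw, 8: rotate_ccw}


def apply_exif_orientation(pixels, orientation):
    """Return pixels in display orientation; unknown values pass through."""
    fix = EXIF_ROTATIONS.get(orientation)
    return fix(pixels) if fix else pixels


upright = apply_exif_orientation([[1, 2], [3, 4]], 6)
```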

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-07-02 13:25:54 +02:00
Glenn Jocher
c6c88dc601
Copy-Paste augmentation for YOLOv5 (#3845)
* Copy-paste augmentation initial commit

* if any segments

* Add obscuration rejection

* Add copy_paste hyperparameter

* Update comments
2021-07-01 00:35:04 +02:00
Feras Oughali
7d6af69638
Fix LoadStreams() dataloader frame skip issue (#3833)
* Update datasets.py to read every 4th frame of streams

* Update datasets.py
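A sketch of reading every 4th frame, assuming a capture object with cheap `grab()` and more expensive `retrieve()` methods (as `cv2.VideoCapture` provides). `FakeCapture` and `read_every_nth` are hypothetical names used so the example runs without OpenCV:

```python
class FakeCapture:
    """Hypothetical stand-in for a video capture with grab()/retrieve()."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.pos = 0

    def grab(self):
        if self.pos >= self.n_frames:
            return False
        self.pos += 1
        return True

    def retrieve(self):
        return True, f"frame{self.pos}"


def read_every_nth(cap, n=4):
    """Grab every frame to stay in sync, but decode only every n-th one."""
    kept, count = [], 0
    while cap.grab():
        count += 1
        if count % n == 0:
            ok, frame = cap.retrieve()
            if ok:
                kept.append(frame)
    return kept


kept = read_every_nth(FakeCapture(10), n=4)
```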

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-06-30 12:11:29 +02:00
Glenn Jocher
3213d8713f
Fix for dataset_stats() with updated data.yaml (#3819)
@KalenMike
2021-06-29 12:44:59 +02:00
Glenn Jocher
f89941711c
NGA xView 2018 Dataset Auto-Download (#3775)
* update clip_coords for numpy

* uncomment

* cleanup

* Add autosplits

* fix

* cleanup
2021-06-26 00:49:05 +02:00
Yonghye Kwon
374957317a
Add xyxy2xywhn() (#3765)
* Edit Comments for numpy2torch tensor process

Edit Comments for numpy2torch tensor process

* add xyxy2xywhn

add xyxy2xywhn

* add xyxy2xywhn

* formatting

* pass arguments

pass arguments

* edit comment for xyxy2xywhn()

edit comment for xyxy2xywhn()

* cleanup datasets.py
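For reference, the conversion the function name describes (corner boxes in pixels to normalized center/width/height) can be sketched in plain Python. This single-box version with default `w=640, h=640` is an illustration, not the repository implementation, which operates on arrays:

```python
def xyxy2xywhn(box, w=640, h=640):
    """Convert [x1, y1, x2, y2] in pixels to normalized [x, y, w, h].

    x, y is the box center; all outputs are divided by image width/height.
    """
    x1, y1, x2, y2 = box
    return [
        ((x1 + x2) / 2) / w,  # center x
        ((y1 + y2) / 2) / h,  # center y
        (x2 - x1) / w,        # width
        (y2 - y1) / h,        # height
    ]


norm = xyxy2xywhn([160, 120, 480, 360], w=640, h=480)
```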

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-06-25 11:47:46 +02:00
Yonghye Kwon
417a2f425c
Edit comment (#3759)
edit comment
2021-06-24 15:57:27 +02:00
Glenn Jocher
9ac7d388a9
Backwards compatible cache version checks (#3730) 2021-06-22 13:50:47 +02:00
Glenn Jocher
b83e1a4adc
Fix img2label_paths() order (#3720)
* Fix `img2label_paths()` order

* fix, 1
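A minimal sketch of mapping image paths to label paths, assuming the usual layout where an `images` directory has a sibling `labels` directory and labels use a `.txt` extension. Replacing only the last `images` component is an assumption about what the ordering fix addresses:

```python
import os


def img2label_paths(img_paths):
    """Swap the last /images/ path component for /labels/ and use .txt."""
    sa = os.sep + "images" + os.sep
    sb = os.sep + "labels" + os.sep
    return [sb.join(p.rsplit(sa, 1)).rsplit(".", 1)[0] + ".txt" for p in img_paths]


img = os.sep.join(["datasets", "coco128", "images", "train2017", "0001.jpg"])
labels = img2label_paths([img])
```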
2021-06-21 22:50:56 +02:00
Glenn Jocher
fad27c0046
Update DDP for torch.distributed.run with gloo backend (#3680)
* Update DDP for `torch.distributed.run`

* Add LOCAL_RANK

* remove opt.local_rank

* backend="gloo|nccl"

* print

* print

* debug

* debug

* os.getenv

* gloo

* gloo

* gloo

* cleanup

* fix getenv

* cleanup

* cleanup destroy

* try nccl

* return opt

* add --local_rank

* add timeout

* add init_method

* gloo

* move destroy

* move destroy

* move print(opt) under if RANK

* destroy only RANK 0

* move destroy inside train()

* restore destroy outside train()

* update print(opt)

* cleanup

* nccl

* gloo with 60 second timeout

* update namespace printing
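With `torch.distributed.run`, process placement arrives through environment variables rather than a `--local_rank` argument. A small sketch of reading them, assuming defaults of -1 / -1 / 1 for a non-distributed run (`ddp_env` is a hypothetical helper name):

```python
import os


def ddp_env(environ=os.environ):
    """Read the launcher-provided rank variables, with single-process defaults."""
    local_rank = int(environ.get("LOCAL_RANK", -1))  # GPU index on this node
    rank = int(environ.get("RANK", -1))              # global process index
    world_size = int(environ.get("WORLD_SIZE", 1))   # total number of processes
    return local_rank, rank, world_size


single = ddp_env({})
launched = ddp_env({"LOCAL_RANK": "1", "RANK": "1", "WORLD_SIZE": "2"})
```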
2021-06-19 16:30:25 +02:00
Mai Thanh Minh
bf209f6fe9
Skip HSV augmentation when hyperparameters are [0, 0, 0] (#3686)
* Create shortcircuit in augment_hsv when hyperparameters are zero

* implement faster opt-in
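The short-circuit can be sketched as a guard that returns before any random draw or color conversion. `hsv_gains` is a hypothetical helper that only computes the per-channel multipliers, and the uniform draw in [-1, 1] is an assumption for illustration:

```python
import random


def hsv_gains(hgain, sgain, vgain, rng=random):
    """Return random H, S, V multipliers, or None when all gains are zero."""
    if not (hgain or sgain or vgain):
        return None  # nothing to augment: skip the work entirely
    return [1 + rng.uniform(-1, 1) * g for g in (hgain, sgain, vgain)]


skipped = hsv_gains(0, 0, 0)
gains = hsv_gains(0.015, 0.7, 0.4, rng=random.Random(0))
```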

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-06-19 11:51:21 +02:00
Glenn Jocher
814806c61d
Update cache check (#3691)
Swapped order of operations for faster first per f527704cd3 (r52362419)
2021-06-19 11:22:09 +02:00
Glenn Jocher
f527704cd3
Cache v0.3: improved corrupt image/label reporting (#3676)
* Cache v0.3: improved corrupt image/label reporting

Fix for https://github.com/ultralytics/yolov5/issues/3656#issuecomment-863660899

* cleanup
2021-06-18 10:21:47 +02:00
Glenn Jocher
9b6dba6207
Update dataset_stats() to list of dicts (#3657)
* Update `dataset_stats()` to list of dicts

@KalenMike

* Update datasets.py
2021-06-17 13:59:52 +02:00
xiaowk5516
d808855f77
Assert non-premature end of JPEG images (#3638)
* premature end of JPEG images

* PEP8 reformat

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-06-16 13:31:26 +02:00
Glenn Jocher
6c0e1d9fd7
Update verify_image_label() (#3635) 2021-06-16 11:12:15 +02:00
Glenn Jocher
7d3686a686
Update check_file() (#3622)
* Update `check_file()`

* Update datasets.py
2021-06-15 13:21:04 +02:00
Glenn Jocher
7a565f130a
Update dataset_stats() (#3593)
@KalenMike this is a PR to add image filenames and labels to our stats dictionary and to save the dictionary to JSON. The save location is next to the train labels.cache file. The single JSON contains all stats for the entire dataset.

Usage example:
```python
from utils.datasets import dataset_stats  # explicit import instead of wildcard

dataset_stats('coco128.yaml', verbose=True)
```
2021-06-12 13:26:41 +02:00
Glenn Jocher
958ab92dc1
Remove opt from create_dataloader() (#3552) 2021-06-09 13:14:56 +02:00
Glenn Jocher
1b5edb6f8e
Update dataset_stats() for HUB (#3536)
* Update `dataset_stats()` for HUB 

Cleanup of b6fdd2e

* autodownload flag

* Update general.py

* cleanup
2021-06-09 10:56:11 +02:00
Glenn Jocher
b6fdd2e5e5
Create dataset_stats() for HUB 2021-06-08 23:09:45 +02:00
Glenn Jocher
8d52c1c5c5
Update datasets.py (#3531)
Minor updates to https://github.com/ultralytics/yolov5/pull/3505, inplace accumulation.
2021-06-08 18:36:40 +02:00
Dean Mark
28bff22df8
Use multi-threading in cache_labels (#3505)
* Use multi threading in cache_labels

* PEP8 reformat

* Add num_threads

* changed ThreadPool.imap_unordered to Pool.imap_unordered

* Remove inplace additions

* Update datasets.py

refactor initial desc
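A sketch of the parallel verification pattern with `imap_unordered`. `ThreadPool` is used here only to keep the example self-contained; the list above notes the PR switched to `Pool.imap_unordered`. `verify_label` and the in-memory label strings are hypothetical:

```python
from multiprocessing.pool import ThreadPool


def verify_label(item):
    """Count rows in one label file and check each row has 5 columns."""
    name, text = item
    rows = [line.split() for line in text.splitlines() if line.strip()]
    ok = all(len(r) == 5 for r in rows)
    return name, len(rows), ok


items = [
    ("a.txt", "0 0.5 0.5 0.2 0.2\n"),
    ("b.txt", "0 0.5 0.5\n"),  # malformed: only 3 columns
    ("c.txt", ""),             # empty label file
]

results = {}
with ThreadPool(4) as pool:
    # Results arrive in completion order, so key them by file name.
    for name, n_rows, ok in pool.imap_unordered(verify_label, items):
        results[name] = (n_rows, ok)
```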

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-06-08 18:00:21 +02:00
Yonghye Kwon
c37f072ba7
Faster HSV augmentation (#3462)
remove datatype conversion process that can be skipped
2021-06-04 20:02:20 +02:00
Glenn Jocher
8e3b4a0bf3
Update MixUp augmentation alpha=beta=32.0 (#3455)
Per VOC empirical results https://github.com/ultralytics/yolov5/issues/3380#issuecomment-853001307 by @developer0hye
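To show what alpha=beta=32.0 implies, the sketch below draws mixing ratios from a Beta(32, 32) distribution and blends two flat pixel lists. The `mixup` helper and the list-based "images" are illustrative assumptions, not the repository code:

```python
import random


def mixup(pixels1, pixels2, alpha=32.0, beta=32.0, rng=random):
    """Blend two equally sized pixel lists with a Beta-distributed ratio."""
    r = rng.betavariate(alpha, beta)
    return [a * r + b * (1 - r) for a, b in zip(pixels1, pixels2)], r


rng = random.Random(0)
ratios = [rng.betavariate(32.0, 32.0) for _ in range(1000)]
mean_ratio = sum(ratios) / len(ratios)

mixed, r = mixup([0.0, 100.0], [100.0, 0.0], rng=rng)
```

A symmetric Beta with large parameters concentrates the ratio near 0.5, so both images contribute roughly equally to each blend.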
2021-06-04 12:47:53 +02:00
Glenn Jocher
fdbe527dc0
Revert "cv2.imread(img, -1) for IMREAD_UNCHANGED (#3379)" (#3395)
This reverts commit 21a9607e00f1365b21d8c4bd81bdbf5fc0efea24.
2021-05-31 10:39:00 +02:00
tudoulei
21a9607e00
cv2.imread(img, -1) for IMREAD_UNCHANGED (#3379)
* Update datasets.py

* comment

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-05-29 21:12:01 +02:00
Glenn Jocher
4d4a2b0520
Ignore blank lines in *.txt labels (#3366)
Fix for https://github.com/ultralytics/yolov5/issues/958#issuecomment-849512083
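A minimal sketch of parsing label text while skipping blank lines, assuming one whitespace-separated row per object (`parse_label_text` is a hypothetical name):

```python
def parse_label_text(text):
    """Split label text into rows of string fields, ignoring blank lines."""
    return [line.split() for line in text.strip().splitlines() if line.strip()]


rows = parse_label_text("0 0.5 0.5 0.2 0.2\n\n1 0.1 0.1 0.05 0.05\n   \n")
```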
2021-05-27 14:31:26 +02:00
Glenn Jocher
c6b5bfca85
Updated cache v0.2 with hashlib (#3350)
* Update cache v0.2 to include parent hash

Possible fix for https://github.com/ultralytics/yolov5/issues/3349

* Update datasets.py
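One way to read "include parent hash" is that the cache key folds in the file paths, and therefore their parent directories, in addition to file sizes. The sketch below follows that assumption with `hashlib`; it is not confirmed as the exact repository code:

```python
import hashlib
import os


def get_hash(paths):
    """Hash the total size of existing files together with the path strings."""
    size = sum(os.path.getsize(p) for p in paths if os.path.exists(p))
    h = hashlib.md5(str(size).encode())  # start from the combined size
    h.update("".join(paths).encode())    # fold in the paths themselves
    return h.hexdigest()


# Same file name under different parent directories gives different keys.
key_a = get_hash([os.path.join("no_such_dir_a", "x.jpg")])
key_b = get_hash([os.path.join("no_such_dir_b", "x.jpg")])
```

Under this scheme, moving or renaming a dataset directory changes the key and invalidates a stale cache even when the file sizes are unchanged.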
2021-05-26 14:26:52 +02:00
Glenn Jocher
0e2f2cbb51
Update LoadStreams init fallbacks (#3295) 2021-05-23 14:55:42 +02:00
Glenn Jocher
683cefead4
YouTube stream ending fix (#3277)
* YouTube stream ending fix

Properly terminates YouTube streams on video end. Should resolve issues #2769 and #3220.

* Update datasets.py
2021-05-21 16:51:07 +02:00
Glenn Jocher
13a1c72699
Update datasets.py (#3216) 2021-05-17 22:24:26 +02:00