279 Commits

Author SHA1 Message Date
Glenn Jocher
19d03a955c
Remove DDP process group timeout (#4422) 2021-08-15 18:32:41 +02:00
Glenn Jocher
24bea5e4b7
Standardize headers and docstrings (#4417)
* Implement new headers

* Reformat 1

* Reformat 2

* Reformat 3 - math

* Reformat 4 - yaml
2021-08-14 21:17:51 +02:00
Glenn Jocher
ce7deec440
int(mlc) (#4385) 2021-08-11 17:32:13 +02:00
Glenn Jocher
86c7150cfd
Update newline (#4308) 2021-08-04 17:41:38 +02:00
Glenn Jocher
e78aeac973
Evolve in CSV format (#4307)
* Update evolution to CSV format

* Update

* Update

* Update

* Update

* Update

* reset args

* reset args

* reset args

* plot_results() fix

* Cleanup

* Cleanup2
2021-08-04 17:13:38 +02:00
junji hashimoto
2d99063201
Feature `python train.py --cache disk` (#4049)
* Add cache-on-disk and cache-directory to cache images on disk

* Fix load_image with cache_on_disk

* Add no_cache flag for load_image

* Revert the parts ('logging' and a newline) that do not need to be modified

* Add the assertion for shapes of cached images

* Add a suffix string for cached images

* Fix boundary-error of letterbox for load_mosaic

* Add prefix as cache-key of cache-on-disk

* Update cache-function on disk

* Add psutil in requirements.txt

* Update train.py

* Cleanup1

* Cleanup2

* Skip existing npy

* Include re-space

* Export return character fix

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-08-02 18:47:24 +02:00
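The cache-on-disk feature above can be sketched in a few lines. This is a hedged, minimal illustration of the idea (the function name and signature are hypothetical, not the repository's actual dataloader helpers): decode an image once, save the array as `.npy`, and reuse the cached copy on later epochs, skipping existing `.npy` files as the commit notes.

```python
from pathlib import Path
import numpy as np

def load_image_cached(path, cache_dir, read_fn):
    """Return the image array for `path`, reusing a cached .npy copy if present."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    npy = cache_dir / (Path(path).stem + ".npy")
    if npy.exists():          # "Skip existing npy", per the commit notes
        return np.load(npy)
    im = read_fn(path)        # e.g. cv2.imread in the real dataloader
    np.save(npy, im)          # cache the decoded array to disk
    return im
```

On the second call for the same image, the decoder is never invoked; only the cached array is loaded, which is the whole point of trading disk space for per-epoch decode time.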
Kalen Michael
b74929c910
Add train.py and val.py callbacks (#4220)
* added callbacks

* Update callbacks.py

* Update train.py

* Update val.py

* Fix CamelCase, add staticmethod

* Refactor logger into callbacks

* Cleanup

* New callback on_val_image_end()

* Add curves and results images to TensorBoard

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-08-01 00:18:07 +02:00
IneovaAI
bceb57b910
Add `python train.py --freeze N` argument (#4238)
* Add freeze as an argument

I train on different platforms, and sometimes I want to freeze some layers. Currently I have to go into the code to change this and also keep track of how many layers I froze on which platform. Please add the number of layers to freeze as an argument in future versions. Thanks.
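The request above amounts to name-prefix matching over the model's parameters. A minimal sketch of that logic (plain dicts stand in for `torch` parameters here; the real code iterates `model.named_parameters()` and sets `requires_grad`): parameters under layers `model.0.` through `model.{N-1}.` are frozen, the rest stay trainable.

```python
def freeze_layers(param_names, n):
    """Return {name: requires_grad}, freezing the first n model layers."""
    prefixes = [f"model.{i}." for i in range(n)]
    return {name: not any(name.startswith(p) for p in prefixes)
            for name in param_names}

names = ["model.0.conv.weight", "model.1.bn.bias", "model.2.conv.weight"]
flags = freeze_layers(names, 2)  # freeze layers 0 and 1, keep layer 2 trainable
```

In the real training loop, a frozen parameter simply gets `v.requires_grad = False` before the optimizer is built, so no gradients are computed or applied for it.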

* Update train.py

* Update train.py

* Cleanup

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-07-30 17:39:48 +02:00
Glenn Jocher
8d3c3ef45c
Fix weight decay comment (#4228) 2021-07-30 01:35:39 +02:00
Glenn Jocher
c2c958c350
Explicit requirements.txt location (#4225) 2021-07-29 17:29:39 +02:00
Glenn Jocher
b60b62e874
PyCharm reformat (#4209)
* PyCharm reformat

* YAML reformat

* Markdown reformat
2021-07-28 23:35:14 +02:00
Ayush Chaurasia
e88e8f7a98
W&B: Restructure code to support the new dataset_check() feature (#4197)
* Improve docstrings and run names

* default wandb login prompt with timeout

* return key

* Update api_key check logic

* Properly support zipped dataset feature

* update docstring

* Revert tutorial change

* extend changes to log_dataset

* add run name

* bug fix

* bug fix

* Update comment

* fix import check

* remove unused import

* Hardcode .yaml file extension

* reduce code

* Reformat using pycharm

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-07-28 17:40:08 +02:00
Glenn Jocher
5d66e48723
Train from --data path/to/dataset.zip feature (#4185)
* Train from `--data path/to/dataset.zip` feature

* Update dataset_stats()

* cleanup

* cleanup2
2021-07-28 02:04:10 +02:00
Glenn Jocher
0ad6301c96
Update script headers (#4163)
* Update download script headers

* cleanup

* bug fix attempt

* bug fix attempt2

* bug fix attempt3

* cleanup
2021-07-26 15:23:33 +02:00
Glenn Jocher
96e36a7c91
New CSV Logger (#4148)
* New CSV Logger

* cleanup

* move batch plots into Logger

* rename comment

* Remove total loss from progress bar

* mloss :-1 bug fix

* Update plot_results()

* Update plot_results()

* plot_results bug fix
2021-07-25 19:06:37 +02:00
Glenn Jocher
efe60b5681
Refactor train.py and val.py loggers (#4137)
* Update loggers

* Config

* Update val.py

* cleanup

* fix1

* fix2

* fix3 and reformat

* format sweep.py

* Logger() class

* cleanup

* cleanup2

* wandb package import fix

* wandb package import fix2

* txt fix

* fix4

* fix5

* fix6

* drop wandb into utils/loggers

* fix 7

* rename loggers/wandb_logging to loggers/wandb

* Update message

* Update message

* Update message

* cleanup

* Fix x axis bug

* fix rank 0 issue

* cleanup
2021-07-25 01:18:39 +02:00
Glenn Jocher
63dd65e7ed
Update train.py (#4136)
* Refactor train.py

* Update imports

* Update imports

* Update optimizer

* cleanup
2021-07-24 16:11:39 +02:00
Glenn Jocher
2c073cd207
Add train.py `--img-size` floor (#4099) 2021-07-21 16:50:47 +02:00
Glenn Jocher
f7d8562060
val.py refactor (#4053)
* val.py refactor

* cleanup

* cleanup

* cleanup

* cleanup

* save after eval

* opt.imgsz bug fix

* wandb refactor

* dataloader to train_loader

* capitalize global variables

* runs/hub/exp to runs/detect/exp

* refactor wandb logging

* Refactor wandb operations (#4061)

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2021-07-19 10:43:01 +02:00
Glenn Jocher
951922c735
Add --sync-bn known issue (#4032)
* Add `--sync-bn` known issue

* Update train.py
2021-07-17 13:07:19 +02:00
Glenn Jocher
720aaa65c8
Rename test.py to val.py (#4000) 2021-07-14 15:43:54 +02:00
Eldar Kurtic
e7888af94c
Fix inconsistent NMS IoU value for COCO (#3934)
Evaluation of 'best' and 'last' models will use the same params as the evaluation during the training phase. 
This PR fixes https://github.com/ultralytics/yolov5/issues/3907
2021-07-08 15:29:02 +02:00
Glenn Jocher
8930e22cce
Evolution commented hyp['anchors'] fix (#3887)
Fix for `KeyError: 'anchors'` error when starting hyperparameter evolution:
```bash
python train.py --evolve
```

```
Traceback (most recent call last):
  File "E:\yolov5\train.py", line 623, in <module>
    hyp[k] = max(hyp[k], v[1])  # lower limit
KeyError: 'anchors'
```
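The failure happens because the evolution loop clamps every key in its meta table against the loaded hyp dict, and `anchors` may be commented out in the YAML. A hedged sketch of the fix (the function and the meta layout are illustrative, not the repository's exact code): skip keys absent from `hyp` instead of indexing them.

```python
def clamp_hyp(hyp, meta):
    """Clamp evolved hyperparameters to their limits.

    meta maps key -> (mutation_scale, lower, upper)."""
    for k, (_, lower, upper) in meta.items():
        if k not in hyp:          # e.g. 'anchors' commented out in the YAML
            continue
        hyp[k] = max(hyp[k], lower)   # lower limit
        hyp[k] = min(hyp[k], upper)   # upper limit
    return hyp
```

With the membership check in place, a commented-out hyperparameter is simply left out of evolution rather than raising `KeyError`.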
2021-07-05 12:48:27 +02:00
san-soucie
d3e9d69850
--evolve 300 generations CLI argument (#3863)
* evolve command accepts argument for number of generations

* evolve generations argument used in evolve for loop

* evolve argument boolean fixes

* default to 300 evolve generations

* Update train.py

Co-authored-by: John San Soucie <jsansoucie@whoi.edu>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-07-04 12:14:35 +02:00
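The boolean-to-integer flag change above maps naturally onto argparse's optional-value pattern. A minimal sketch (flag semantics hedged; check train.py for the exact definition): `nargs='?'` with `const=300` makes a bare `--evolve` default to 300 generations while still accepting an explicit count.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--evolve", type=int, nargs="?", const=300,
                    help="evolve hyperparameters for x generations")

assert parser.parse_args([]).evolve is None             # flag absent
assert parser.parse_args(["--evolve"]).evolve == 300    # bare flag -> 300
assert parser.parse_args(["--evolve", "50"]).evolve == 50
```

This keeps backward compatibility: existing `--evolve` invocations still work, and the generation count becomes tunable without editing the for-loop bound.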
Glenn Jocher
c6c88dc601
Copy-Paste augmentation for YOLOv5 (#3845)
* Copy-paste augmentation initial commit

* if any segments

* Add obscuration rejection

* Add copy_paste hyperparameter

* Update comments
2021-07-01 00:35:04 +02:00
yellowdolphin
3974d725b6
Fix warmup accumulate (#3722)
* gradient accumulation during warmup in train.py

Context:
`accumulate` is the number of batches/gradients accumulated before calling the next optimizer.step().
During warmup, it is ramped up from 1 to the final value nbs / batch_size. 
Although I have not seen this in other libraries, I like the idea. During warmup, as gradients are large, overly large steps are more of an issue than gradient noise due to small steps.

The bug:
The condition to perform the opt step is wrong:
`if ni % accumulate == 0:`
This produces irregular step sizes if `accumulate` is not constant. It becomes relevant when `batch_size` is small and `accumulate` changes many times during warmup.

This demo also shows the proposed solution, to use a ">=" condition instead:
https://colab.research.google.com/drive/1MA2z2eCXYB_BC5UZqgXueqL_y1Tz_XVq?usp=sharing

Further, I propose not restricting the number of warmup iterations to >= 1000: if the user changes hyp['warmup_epochs'], this causes unexpected behavior, and it makes evolution unstable if this parameter were to be optimized.
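The bug and the fix can be reproduced in a small stand-alone simulation (illustrative, not the repository's code): with a ramping `accumulate`, the modulo test fires at irregular intervals, while tracking the last optimizer step and using a `>=` comparison keeps the spacing consistent with the current `accumulate` value.

```python
def steps_modulo(accumulates):
    """Optimizer-step indices under the buggy `ni % accumulate == 0` test."""
    return [ni for ni, acc in enumerate(accumulates) if ni % acc == 0]

def steps_tracked(accumulates):
    """Step indices under the proposed `ni - last_opt_step >= accumulate`."""
    last_opt_step, steps = -1, []
    for ni, acc in enumerate(accumulates):
        if ni - last_opt_step >= acc:
            steps.append(ni)
            last_opt_step = ni
    return steps

ramp = [1, 1, 2, 2, 3, 3, 4, 4, 4, 4]  # `accumulate` ramping up during warmup
```

Running both on `ramp` shows the modulo variant stepping in a burst early and then stalling for many batches, while the tracked variant's gaps grow smoothly with the ramp.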

* replace last_opt_step tracking by do_step(ni)

* add docstrings

* move down nw

* Update train.py

* revert math import move

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-06-28 12:25:13 +02:00
Glenn Jocher
92d49fde35
Update seeds for single-GPU reproducibility (#3789)
For seed=0 on single-GPU.
2021-06-26 15:42:40 +02:00
Piotr Skalski
09246a5a33
fix/incorrect_fitness_import (#3770) 2021-06-25 16:16:18 +02:00
Glenn Jocher
f2d97ebb25
Remove DDP MultiHeadAttention fix (#3768) 2021-06-25 12:52:05 +02:00
Glenn Jocher
f79d7479da
Add optional dataset.yaml path attribute (#3753)
* Add optional dataset.yaml `path` attribute

@KalenMike

* pass locals to python scripts

* handle lists

* update coco128.yaml

* Capitalize first letter

* add test key

* finalize GlobalWheat2020.yaml

* finalize objects365.yaml

* finalize SKU-110K.yaml

* finalize SKU-110K.yaml

* finalize VisDrone.yaml

* NoneType fix

* update download comment

* voc to VOC

* update

* update VOC.yaml

* update VOC.yaml

* remove dashes

* delete get_voc.sh

* force coco and coco128 to ../datasets

* Capitalize Argoverse_HD.yaml

* Capitalize Objects365.yaml

* update Argoverse_HD.yaml

* coco segments fix

* VOC single-thread

* update Argoverse_HD.yaml

* update data_dict in test handling

* create root
2021-06-25 01:25:03 +02:00
Glenn Jocher
ae4261c774
Force non-zero hyp evolution weights `w` (#3748)
Fix for https://github.com/ultralytics/yolov5/issues/3741
2021-06-23 12:56:22 +02:00
Glenn Jocher
fdc22398fa
Create data/hyps directory (#3747) 2021-06-23 12:49:38 +02:00
Glenn Jocher
1f69d12591
Update 4 main ops for paths and .run() (#3715)
* Add yolov5/ to path

* rename functions to run()

* cleanup

* rename fix

* CI fix

* cleanup find models/export.py
2021-06-21 17:25:04 +02:00
Ayush Chaurasia
75c0ff43af
W&B: Don't resume transfer learning runs (#3604)
* Allow config change

* Allow val change in wandb config

* Don't resume transfer learning runs

* Add entity in log dataset
2021-06-21 14:00:25 +02:00
Glenn Jocher
e8810a53e8
Update DDP backend if dist.is_nccl_available() (#3705) 2021-06-20 17:15:42 +02:00
Glenn Jocher
fbf41e0913
Add train.run() method (#3700)
* Update train.py explicit arguments

* Update train.py

* Add run method
2021-06-20 15:06:58 +02:00
Glenn Jocher
c1af67dcd4
Add torch DP warning (#3698) 2021-06-19 19:50:46 +02:00
Glenn Jocher
b3e2f4e08d
Eliminate total_batch_size variable (#3697)
* Eliminate `total_batch_size` variable

* cleanup

* Update train.py
2021-06-19 19:14:59 +02:00
Glenn Jocher
fad27c0046
Update DDP for torch.distributed.run with gloo backend (#3680)
* Update DDP for `torch.distributed.run`

* Add LOCAL_RANK

* remove opt.local_rank

* backend="gloo|nccl"

* print

* print

* debug

* debug

* os.getenv

* gloo

* gloo

* gloo

* cleanup

* fix getenv

* cleanup

* cleanup destroy

* try nccl

* return opt

* add --local_rank

* add timeout

* add init_method

* gloo

* move destroy

* move destroy

* move print(opt) under if RANK

* destroy only RANK 0

* move destroy inside train()

* restore destroy outside train()

* update print(opt)

* cleanup

* nccl

* gloo with 60 second timeout

* update namespace printing
2021-06-19 16:30:25 +02:00
lb-desupervised
bfb2276b1d
Slightly modify CLI execution (#3687)
* Slightly modify CLI execution

This simple change makes it easier to run the primary functions of this repo (train/detect/test) from within Python. An object representing `opt` can be constructed and fed to the `main` function of each of these modules, rather than having to call the lower-level functions directly or run the module as a script.
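A hedged sketch of the pattern this PR enables (the `main` body and the opt fields here are illustrative stand-ins, not the actual train.py internals): build an `argparse.Namespace` programmatically and pass it straight to the module's `main`, instead of shelling out to `python train.py ...`.

```python
import argparse

def main(opt):
    """Stand-in for a module-level main(opt); the real one launches training."""
    return f"training {opt.weights} for {opt.epochs} epochs"

# Construct the same object argparse would have produced from the CLI.
opt = argparse.Namespace(weights="yolov5s.pt", epochs=3, batch_size=16)
msg = main(opt)
```

The design choice is that `Namespace` is duck-typed: any object with the expected attributes works, so scripts, notebooks, and sweep frameworks can all drive the same entry point.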

* Update export.py

Add CLI parsing update for more convenient module usage within Python.

Co-authored-by: Lewis Belcher <lb@desupervised.io>
2021-06-19 12:06:59 +02:00
Glenn Jocher
2296f1546f
Update WORLD_SIZE and RANK retrieval (#3670) 2021-06-17 23:24:30 +02:00
Glenn Jocher
045d5d8629
Update TensorBoard (#3669) 2021-06-17 22:12:42 +02:00
Glenn Jocher
fa201f968e
Update train(hyp, *args) to accept hyp file or dict (#3668) 2021-06-17 22:03:25 +02:00
Glenn Jocher
6d6e2ca65f
Update train.py (#3667) 2021-06-17 21:32:39 +02:00
Wei Quan
4c5d9bff80
Fix incorrect end epoch comment (#3612) 2021-06-15 11:24:56 +02:00
Glenn Jocher
4984cf54be
train.py GPU memory fix (#3590)
* train.py GPU memory fix

* ema

* cuda

* cuda

* zeros input

* to device

* batch index 0
2021-06-11 20:24:03 +02:00
Glenn Jocher
4695ca8314
Refactoring cleanup (#3565)
* Refactoring cleanup

* Update test.py
2021-06-09 22:50:27 +02:00
Glenn Jocher
5948f20a3d
Update test.py profiling (#3555)
* Update test.py profiling

* half_precision to half

* inplace
2021-06-09 16:25:17 +02:00
Glenn Jocher
63157d214d
Remove is_coco argument from test() (#3553) 2021-06-09 15:09:51 +02:00
Glenn Jocher
958ab92dc1
Remove `opt` from `create_dataloader()` (#3552) 2021-06-09 13:14:56 +02:00