273 Commits

Author SHA1 Message Date
Kalen Michael
b74929c910
Add train.py and val.py callbacks (#4220)
* added callbacks

* Update callbacks.py

* Update train.py

* Update val.py

* Fix CamlCase add staticmethod

* Refactor logger into callbacks

* Cleanup

* New callback on_val_image_end()

* Add curves and results images to TensorBoard

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-08-01 00:18:07 +02:00
IneovaAI
bceb57b910
Add python train.py --freeze N argument (#4238)
* Add freeze as an argument

I train on different platforms and sometimes I want to freeze some layers. I have to go into the code and change it and also keep track of how many layers I froze on what platform. Please add the number of layers to freeze as an argument in future versions thanks.

* Update train.py

* Update train.py

* Cleanup

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-07-30 17:39:48 +02:00
Glenn Jocher
8d3c3ef45c
Fix weight decay comment (#4228) 2021-07-30 01:35:39 +02:00
Glenn Jocher
c2c958c350
Explicit requirements.txt location (#4225) 2021-07-29 17:29:39 +02:00
Glenn Jocher
b60b62e874
PyCharm reformat (#4209)
* PyCharm reformat

* YAML reformat

* Markdown reformat
2021-07-28 23:35:14 +02:00
Ayush Chaurasia
e88e8f7a98
W&B: Restructure code to support the new dataset_check() feature (#4197)
* Improve docstrings and run names

* default wandb login prompt with timeout

* return key

* Update api_key check logic

* Properly support zipped dataset feature

* update docstring

* Revert tuorial change

* extend changes to log_dataset

* add run name

* bug fix

* bug fix

* Update comment

* fix import check

* remove unused import

* Hardcore .yaml file extension

* reduce code

* Reformat using pycharm

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-07-28 17:40:08 +02:00
Glenn Jocher
5d66e48723
Train from --data path/to/dataset.zip feature (#4185)
* Train from `--data path/to/dataset.zip` feature

* Update dataset_stats()

* cleanup

* cleanup2
2021-07-28 02:04:10 +02:00
Glenn Jocher
0ad6301c96
Update script headers (#4163)
* Update download script headers

* cleanup

* bug fix attempt

* bug fix attempt2

* bug fix attempt3

* cleanup
2021-07-26 15:23:33 +02:00
Glenn Jocher
96e36a7c91
New CSV Logger (#4148)
* New CSV Logger

* cleanup

* move batch plots into Logger

* rename comment

* Remove total loss from progress bar

* mloss :-1 bug fix

* Update plot_results()

* Update plot_results()

* plot_results bug fix
2021-07-25 19:06:37 +02:00
Glenn Jocher
efe60b5681
Refactor train.py and val.py loggers (#4137)
* Update loggers

* Config

* Update val.py

* cleanup

* fix1

* fix2

* fix3 and reformat

* format sweep.py

* Logger() class

* cleanup

* cleanup2

* wandb package import fix

* wandb package import fix2

* txt fix

* fix4

* fix5

* fix6

* drop wandb into utils/loggers

* fix 7

* rename loggers/wandb_logging to loggers/wandb

* Update message

* Update message

* Update message

* cleanup

* Fix x axis bug

* fix rank 0 issue

* cleanup
2021-07-25 01:18:39 +02:00
Glenn Jocher
63dd65e7ed
Update train.py (#4136)
* Refactor train.py

* Update imports

* Update imports

* Update optimizer

* cleanup
2021-07-24 16:11:39 +02:00
Glenn Jocher
2c073cd207
Add train.py `--img-size floor (#4099) 2021-07-21 16:50:47 +02:00
Glenn Jocher
f7d8562060
val.py refactor (#4053)
* val.py refactor

* cleanup

* cleanup

* cleanup

* cleanup

* save after eval

* opt.imgsz bug fix

* wandb refactor

* dataloader to train_loader

* capitalize global variables

* runs/hub/exp to runs/detect/exp

* refactor wandb logging

* Refactor wandb operations (#4061)

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2021-07-19 10:43:01 +02:00
Glenn Jocher
951922c735
Add --sync-bn known issue (#4032)
* Add `--sync-bn` known issue

* Update train.py
2021-07-17 13:07:19 +02:00
Glenn Jocher
720aaa65c8
Rename test.py to val.py (#4000) 2021-07-14 15:43:54 +02:00
Eldar Kurtic
e7888af94c
Fix inconsistent NMS IoU value for COCO (#3934)
Evaluation of 'best' and 'last' models will use the same params as the evaluation during the training phase. 
This PR fixes https://github.com/ultralytics/yolov5/issues/3907
2021-07-08 15:29:02 +02:00
Glenn Jocher
8930e22cce
Evolution commented hyp['anchors'] fix (#3887)
Fix for `KeyError: 'anchors'` error when start hyperparameter evolution:
```bash
python train.py --evolve
```

```bash
Traceback (most recent call last):
  File "E:\yolov5\train.py", line 623, in <module>
    hyp[k] = max(hyp[k], v[1])  # lower limit
KeyError: 'anchors'
```
2021-07-05 12:48:27 +02:00
san-soucie
d3e9d69850
--evolve 300 generations CLI argument (#3863)
* evolve command accepts argument for number of generations

* evolve generations argument used in evolve for loop

* evolve argument boolean fixes

* default to 300 evolve generations

* Update train.py

Co-authored-by: John San Soucie <jsansoucie@whoi.edu>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-07-04 12:14:35 +02:00
Glenn Jocher
c6c88dc601
Copy-Paste augmentation for YOLOv5 (#3845)
* Copy-paste augmentation initial commit

* if any segments

* Add obscuration rejection

* Add copy_paste hyperparameter

* Update comments
2021-07-01 00:35:04 +02:00
yellowdolphin
3974d725b6
Fix warmup accumulate (#3722)
* gradient accumulation during warmup in train.py

Context:
`accumulate` is the number of batches/gradients accumulated before calling the next optimizer.step().
During warmup, it is ramped up from 1 to the final value nbs / batch_size. 
Although I have not seen this in other libraries, I like the idea. During warmup, as grads are large, too large steps are more of on issue than gradient noise due to small steps.

The bug:
The condition to perform the opt step is wrong
> if ni % accumulate == 0:
This produces irregular step sizes if `accumulate` is not constant. It becomes relevant when batch_size is small and `accumulate` changes many times during warmup.

This demo also shows the proposed solution, to use a ">=" condition instead:
https://colab.research.google.com/drive/1MA2z2eCXYB_BC5UZqgXueqL_y1Tz_XVq?usp=sharing

Further, I propose not to restrict the number of warmup iterations to >= 1000. If the user changes hyp['warmup_epochs'], this causes unexpected behavior. Also, it makes evolution unstable if this parameter was to be optimized.

* replace last_opt_step tracking by do_step(ni)

* add docstrings

* move down nw

* Update train.py

* revert math import move

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2021-06-28 12:25:13 +02:00
Glenn Jocher
92d49fde35
Update seeds for single-GPU reproducibility (#3789)
For seed=0 on single-GPU.
2021-06-26 15:42:40 +02:00
Piotr Skalski
09246a5a33
fix/incorrect_fitness_import (#3770) 2021-06-25 16:16:18 +02:00
Glenn Jocher
f2d97ebb25
Remove DDP MultiHeadAttention fix (#3768) 2021-06-25 12:52:05 +02:00
Glenn Jocher
f79d7479da
Add optional dataset.yaml path attribute (#3753)
* Add optional dataset.yaml `path` attribute

@KalenMike

* pass locals to python scripts

* handle lists

* update coco128.yaml

* Capitalize first letter

* add test key

* finalize GlobalWheat2020.yaml

* finalize objects365.yaml

* finalize SKU-110K.yaml

* finalize SKU-110K.yaml

* finalize VisDrone.yaml

* NoneType fix

* update download comment

* voc to VOC

* update

* update VOC.yaml

* update VOC.yaml

* remove dashes

* delete get_voc.sh

* force coco and coco128 to ../datasets

* Capitalize Argoverse_HD.yaml

* Capitalize Objects365.yaml

* update Argoverse_HD.yaml

* coco segments fix

* VOC single-thread

* update Argoverse_HD.yaml

* update data_dict in test handling

* create root
2021-06-25 01:25:03 +02:00
Glenn Jocher
ae4261c774
Force non-zero hyp evolution weights w (#3748)
Fix for https://github.com/ultralytics/yolov5/issues/3741
2021-06-23 12:56:22 +02:00
Glenn Jocher
fdc22398fa
Create data/hyps directory (#3747) 2021-06-23 12:49:38 +02:00
Glenn Jocher
1f69d12591
Update 4 main ops for paths and .run() (#3715)
* Add yolov5/ to path

* rename functions to run()

* cleanup

* rename fix

* CI fix

* cleanup find models/export.py
2021-06-21 17:25:04 +02:00
Ayush Chaurasia
75c0ff43af
[x]W&B: Don't resume transfer learning runs (#3604)
* Allow config cahnge

* Allow val change in wandb config

* Don't resume transfer learning runs

* Add entity in log dataset
2021-06-21 14:00:25 +02:00
Glenn Jocher
e8810a53e8
Update DDP backend if dist.is_nccl_available() (#3705) 2021-06-20 17:15:42 +02:00
Glenn Jocher
fbf41e0913
Add train.run() method (#3700)
* Update train.py explicit arguments

* Update train.py

* Add run method
2021-06-20 15:06:58 +02:00
Glenn Jocher
c1af67dcd4
Add torch DP warning (#3698) 2021-06-19 19:50:46 +02:00
Glenn Jocher
b3e2f4e08d
Eliminate total_batch_size variable (#3697)
* Eliminate `total_batch_size` variable

* cleanup

* Update train.py
2021-06-19 19:14:59 +02:00
Glenn Jocher
fad27c0046
Update DDP for torch.distributed.run with gloo backend (#3680)
* Update DDP for `torch.distributed.run`

* Add LOCAL_RANK

* remove opt.local_rank

* backend="gloo|nccl"

* print

* print

* debug

* debug

* os.getenv

* gloo

* gloo

* gloo

* cleanup

* fix getenv

* cleanup

* cleanup destroy

* try nccl

* return opt

* add --local_rank

* add timeout

* add init_method

* gloo

* move destroy

* move destroy

* move print(opt) under if RANK

* destroy only RANK 0

* move destroy inside train()

* restore destroy outside train()

* update print(opt)

* cleanup

* nccl

* gloo with 60 second timeout

* update namespace printing
2021-06-19 16:30:25 +02:00
lb-desupervised
bfb2276b1d
Slightly modify CLI execution (#3687)
* Slightly modify CLI execution

This simple change makes it easier to run the primary functions of this
repo (train/detect/test) from within Python. An object which represents
`opt` can be constructed and fed to the `main` function of each of these
modules, rather than having to call the lower level functions directly,
or run the module as a script.

* Update export.py

Add CLI parsing update for more convenient module usage within Python.

Co-authored-by: Lewis Belcher <lb@desupervised.io>
2021-06-19 12:06:59 +02:00
Glenn Jocher
2296f1546f
Update WORLD_SIZE and RANK retrieval (#3670) 2021-06-17 23:24:30 +02:00
Glenn Jocher
045d5d8629
Update TensorBoard (#3669) 2021-06-17 22:12:42 +02:00
Glenn Jocher
fa201f968e
Update train(hyp, *args) to accept hyp file or dict (#3668) 2021-06-17 22:03:25 +02:00
Glenn Jocher
6d6e2ca65f
Update train.py (#3667) 2021-06-17 21:32:39 +02:00
Wei Quan
4c5d9bff80
Fix incorrect end epoch comment (#3612) 2021-06-15 11:24:56 +02:00
Glenn Jocher
4984cf54be
train.py GPU memory fix (#3590)
* train.py GPU memory fix

* ema

* cuda

* cuda

* zeros input

* to device

* batch index 0
2021-06-11 20:24:03 +02:00
Glenn Jocher
4695ca8314
Refactoring cleanup (#3565)
* Refactoring cleanup

* Update test.py
2021-06-09 22:50:27 +02:00
Glenn Jocher
5948f20a3d
Update test.py profiling (#3555)
* Update test.py profiling

* half_precision to half

* inplace
2021-06-09 16:25:17 +02:00
Glenn Jocher
63157d214d
Remove is_coco argument from test() (#3553) 2021-06-09 15:09:51 +02:00
Glenn Jocher
958ab92dc1
Remove opt from create_dataloader()` (#3552) 2021-06-09 13:14:56 +02:00
Glenn Jocher
ef0b5c9d29
On-demand pycocotools pip install (#3547) 2021-06-09 11:22:21 +02:00
Glenn Jocher
f3c3d2ce5d
Merge develop branch into master (#3518)
* update ci-testing.yml (#3322)

* update ci-testing.yml

* update greetings.yml

* bring back os matrix

* update ci-testing.yml (#3322)

* update ci-testing.yml

* update greetings.yml

* bring back os matrix

* Enable direct `--weights URL` definition (#3373)

* Enable direct `--weights URL` definition

@KalenMike this PR will enable direct --weights URL definition. Example use case:
```
python train.py --weights https://storage.googleapis.com/bucket/dir/model.pt
```

* cleanup

* bug fixes

* weights = attempt_download(weights)

* Update experimental.py

* Update hubconf.py

* return bug fix

* comment mirror

* min_bytes

* Update tutorial.ipynb (#3368)

add Open in Kaggle badge

* `cv2.imread(img, -1)` for IMREAD_UNCHANGED (#3379)

* Update datasets.py

* comment

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>

* COCO evolution fix (#3388)

* COCO evolution fix

* cleanup

* update print

* print fix

* Create `is_pip()` function (#3391)

Returns `True` if file is part of pip package. Useful for contextual behavior modification.

```python
def is_pip():
    # Is file in a pip package?
    return 'site-packages' in Path(__file__).absolute().parts
```

* Revert "`cv2.imread(img, -1)` for IMREAD_UNCHANGED (#3379)" (#3395)

This reverts commit 21a9607e00f1365b21d8c4bd81bdbf5fc0efea24.

* Update FLOPs description (#3422)

* Update README.md

* Changing FLOPS to FLOPs.

Co-authored-by: BuildTools <unconfigured@null.spigotmc.org>

* Parse URL authentication (#3424)

* Parse URL authentication

* urllib.parse.unquote()

* improved error handling

* improved error handling

* remove %3F

* update check_file()

* Add FLOPs title to table (#3453)

* Suppress jit trace warning + graph once (#3454)

* Suppress jit trace warning + graph once

Suppress harmless jit trace warning on TensorBoard add_graph call. Also fix multiple add_graph() calls bug, now only on batch 0.

* Update train.py

* Update MixUp augmentation `alpha=beta=32.0` (#3455)

Per VOC empirical results https://github.com/ultralytics/yolov5/issues/3380#issuecomment-853001307 by @developer0hye

* Add `timeout()` class (#3460)

* Add `timeout()` class

* rearrange order

* Faster HSV augmentation (#3462)

remove datatype conversion process that can be skipped

* Add `check_git_status()` 5 second timeout (#3464)

* Add check_git_status() 5 second timeout

This should prevent the SSH Git bug that we were discussing @KalenMike

* cleanup

* replace timeout with check_output built-in timeout

* Improved `check_requirements()` offline-handling (#3466)

Improve robustness of `check_requirements()` function to offline environments (do not attempt pip installs when offline).

* Add `output_names` argument for ONNX export with dynamic axes (#3456)

* Add output names & dynamic axes for onnx export

Add output_names and dynamic_axes names for all outputs in torch.onnx.export. The first four outputs of the model will have names output0, output1, output2, output3

* use first output only + cleanup

Co-authored-by: Samridha Shrestha <samridha.shrestha@g42.ai>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>

* Revert FP16 `test.py` and `detect.py` inference to FP32 default (#3423)

* fixed inference bug ,while use half precision

* replace --use-half with --half

* replace space and PEP8 in detect.py

* PEP8 detect.py

* update --half help comment

* Update test.py

* revert space

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>

* Add additional links/resources to stale.yml message (#3467)

* Update stale.yml

* cleanup

* Update stale.yml

* reformat

* Update stale.yml HUB URL (#3468)

* Stale `github.actor` bug fix (#3483)

* Explicit `model.eval()` call `if opt.train=False` (#3475)

* call model.eval() when opt.train is False

call model.eval() when opt.train is False

* single-line if statement

* cleanup

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>

* check_requirements() exclude `opencv-python` (#3495)

Fix for 3rd party or contrib versions of installed OpenCV as in https://github.com/ultralytics/yolov5/issues/3494.

* Earlier `assert` for cpu and half option (#3508)

* early assert for cpu and half option

early assert for cpu and half option

* Modified comment

Modified comment

* Update tutorial.ipynb (#3510)

* Reduce test.py results spacing (#3511)

* Update README.md (#3512)

* Update README.md

Minor modifications

* 850 width

* Update greetings.yml

revert greeting change as PRs will now merge to master.

Co-authored-by: Piotr Skalski <SkalskiP@users.noreply.github.com>
Co-authored-by: SkalskiP <piotr.skalski92@gmail.com>
Co-authored-by: Peretz Cohen <pizzaz93@users.noreply.github.com>
Co-authored-by: tudoulei <34886368+tudoulei@users.noreply.github.com>
Co-authored-by: chocosaj <chocosaj@users.noreply.github.com>
Co-authored-by: BuildTools <unconfigured@null.spigotmc.org>
Co-authored-by: Yonghye Kwon <developer.0hye@gmail.com>
Co-authored-by: Sam_S <SamSamhuns@users.noreply.github.com>
Co-authored-by: Samridha Shrestha <samridha.shrestha@g42.ai>
Co-authored-by: edificewang <609552430@qq.com>
2021-06-08 10:22:10 +02:00
Glenn Jocher
aad99b63d6
TensorBoard DP/DDP graph fix (#3325) 2021-05-25 11:45:24 +02:00
Charles Frye
19100ba007
Improves docs and handling of entities and resuming by WandbLogger (#3264)
* adds latest tag to match wandb defaults

* adds entity handling, 'last' tag

* fixes bug causing finished runs to resume

* removes redundant "last" tag for wandb artifact
2021-05-21 23:42:53 +02:00
Glenn Jocher
dd7f0b7e05
Fix TypeError: 'PosixPath' object is not iterable (#3285) 2021-05-21 23:35:31 +02:00
Glenn Jocher
10d56d784e
Assert --image-weights not combined with DDP (#3275) 2021-05-21 14:46:42 +02:00