183 Commits

Author SHA1 Message Date
Zirunis
4ed93fce93
Fix LR scheduler help in train.py
The default is, and always has been, the cosine scheduler, yet the help text states that the default is the step scheduler. Whatever was originally intended, the default should remain cosine for backwards compatibility, so I changed the help text to reflect that. (See the sketch after this entry.)
2024-07-22 23:04:00 +02:00
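For context, a minimal sketch of what a corrected definition could look like, assuming the scheduler is selected via an argparse flag such as `--sched` (the exact flag name and help wording in train.py may differ):

```python
import argparse

parser = argparse.ArgumentParser(description='Training')
# The default scheduler is cosine; the help string should say so, rather than
# claiming the default is the step scheduler.
parser.add_argument('--sched', type=str, default='cosine', metavar='SCHEDULER',
                    help='LR scheduler (default: "cosine")')
```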
Tianyi Wang
d3ce5a8665
Avoid zero division error 2024-07-15 12:45:46 +10:00
Ross Wightman
e25bbfceec Fix #2097 a small typo in train.py 2024-04-10 09:40:14 -07:00
Ross Wightman
5a58f4d3dc Remove test MESA support, no signal that it's helpful so far 2024-02-10 14:38:01 -08:00
Ross Wightman
c7ac37693d Add device arg to validate() calls in train.py 2024-02-04 10:14:57 -08:00
Ross Wightman
bee0471f91 forward() pass through for ema model, flag for ema warmup, comment about warmup 2024-02-03 16:24:45 -08:00
Ross Wightman
5e4a4b2adc Merge branch 'device_flex' into mesa_ema 2024-02-02 09:45:30 -08:00
Ross Wightman
dd84ef2cd5 ModelEmaV3 and MESA experiments 2024-02-02 09:45:04 -08:00
Ross Wightman
809a9e14e2 Pass train-crop-mode to create_loader/transforms from train.py args 2024-01-24 16:19:02 -08:00
Ross Wightman
a48ab818f5 Improving device flexibility in train. Fix #2081 2024-01-20 15:10:20 -08:00
lorenzbaraldi
8c663c4b86 Fixed index out of range in case of resume 2024-01-12 23:33:32 -08:00
Ross Wightman
c50004db79 Allow training w/o validation split set 2024-01-08 09:38:42 -08:00
Ross Wightman
be0944edae Significant transforms, dataset, dataloading enhancements. 2024-01-08 09:38:42 -08:00
Ross Wightman
b5a4fa9c3b Add pos_weight and support for summing over classes to BCE impl in train scripts 2023-12-30 12:13:06 -08:00
Ross Wightman
f2fdd97e9f Add parsable json results output for train.py, tweak --pretrained-path to force head adaptation 2023-12-22 11:18:25 -08:00
Ross Wightman
60b170b200 Add --pretrained-path arg to train script to allow passing local checkpoint as pretrained. Add missing/unexpected keys log. 2023-12-11 12:10:29 -08:00
Ross Wightman
a83e9f2d3b forward & backward in same no_sync context, slightly easier to read than splitting 2023-04-20 08:14:05 -07:00
Ross Wightman
4cd7fb88b2 clip gradients with update 2023-04-19 23:36:20 -07:00
Ross Wightman
df81d8d85b Cleanup gradient accumulation, fix a few issues, a few other small cleanups in related code. 2023-04-19 23:11:00 -07:00
Ross Wightman
ab7ca62a6e Merge branch 'main' of github.com:rwightman/pytorch-image-models into wip-voidbag-accumulate-grad 2023-04-19 11:08:12 -07:00
Ross Wightman
ec6cca4b37 Add head-init-scale and head-init-bias args that work for all models, fix #1718 2023-04-14 17:59:23 -07:00
Ross Wightman
43e6143bef Fix #1712 broken support for AMP w/ PyTorch < 1.10. Disable loss scaler for bfloat16 2023-03-11 15:26:09 -08:00
Taeksang Kim
7f29a46d44 Add gradient accumulation option to train.py
option: iters-to-accum (iterations to accumulate)

Gradient accumulation improves training throughput (samples/s). It reduces how often gradients need to be shared between nodes, which can help when the network is the bottleneck. (See the sketch after this entry.)

Signed-off-by: Taeksang Kim <voidbag@puzzle-ai.com>
2023-02-06 09:24:48 +09:00
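A rough sketch of the gradient accumulation idea this commit describes, not the actual train.py implementation; names such as `accum_steps` are placeholders:

```python
def train_one_epoch(model, loader, optimizer, loss_fn, accum_steps=4):
    """Accumulate gradients over `accum_steps` micro-batches before each optimizer step."""
    optimizer.zero_grad()
    for step, (images, targets) in enumerate(loader):
        # Scale the loss so the accumulated gradient matches one large batch.
        loss = loss_fn(model(images), targets) / accum_steps
        loss.backward()
        if (step + 1) % accum_steps == 0:
            # In a distributed setup, wrapping the non-final backward passes in
            # DDP's no_sync() would also defer the gradient all-reduce to this point.
            optimizer.step()
            optimizer.zero_grad()
```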
Fredo Guan
81ca323751
Davit update formatting and fix grad checkpointing (#7)
Fixed head to gap->norm->fc as per ConvNeXt, along with an option for norm->gap->fc (see the sketch after this entry).
Failed tests were due to the CLIP ConvNeXt models; DaViT tests passed.
2023-01-15 14:34:56 -08:00
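As a hedged illustration of the gap->norm->fc ordering mentioned above (a generic sketch, not DaViT's actual head code; `HeadGapNormFc` is a made-up name):

```python
import torch.nn as nn

class HeadGapNormFc(nn.Module):
    """Classifier head in gap -> norm -> fc order (ConvNeXt-style)."""
    def __init__(self, in_chs: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)       # global average pool (gap)
        self.norm = nn.LayerNorm(in_chs)          # normalize pooled features
        self.fc = nn.Linear(in_chs, num_classes)  # final classifier (fc)

    def forward(self, x):             # x: (B, C, H, W)
        x = self.pool(x).flatten(1)   # -> (B, C)
        return self.fc(self.norm(x))
```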
Ross Wightman
d5e7d6b27e Merge remote-tracking branch 'origin/main' into refactor-imports 2022-12-09 14:49:44 -08:00
Lorenzo Baraldi
3d6bc42aa1 Put validation loss under amp_autocast
Moved the loss evaluation under amp_autocast, so the loss function does not operate directly on float16 outputs (see the sketch after this entry).
2022-12-09 12:03:23 +01:00
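Roughly what this looks like, assuming a PyTorch-style autocast context (a simplified sketch, not the exact validate() code in train.py):

```python
import torch

def validate_step(model, images, targets, criterion):
    # Evaluate both the forward pass and the loss inside autocast, so the
    # criterion runs under AMP rather than on raw float16 model outputs.
    with torch.no_grad(), torch.autocast(device_type='cuda', dtype=torch.float16):
        output = model(images)
        loss = criterion(output, targets)
    return output, loss.float()
```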
Ross Wightman
927f031293 Major module / path restructure, timm.models.layers -> timm.layers, add _ prefix to all non model modules in timm.models 2022-12-06 15:00:06 -08:00
Ross Wightman
dbe7531aa3 Update scripts to support torch.compile(). Make --results_file arg more consistent across benchmark/validate/inference. Fix #1570 2022-12-05 10:21:34 -08:00
Ross Wightman
9da7e3a799 Add crop_mode for pretrained config / image transforms. Add support for dynamo compilation to benchmark/train/validate 2022-12-05 10:21:34 -08:00
Ross Wightman
4714a4910e
Merge pull request #1525 from TianyiFranklinWang/main
✏️ fix typo
2022-11-03 20:55:43 -07:00
klae01
ddd6361904
Update train.py
fix typo args.in_chanes
2022-11-01 16:55:05 +09:00
NPU-Franklin
9152b10478
✏️ fix typo 2022-10-30 08:49:40 +08:00
hova88
29baf32327 fix typo: missing back quote 2022-10-28 09:30:51 +08:00
Simon Schrodi
aceb79e002 Fix typo 2022-10-17 22:06:17 +02:00
Ross Wightman
285771972e Change --amp flags, no more --apex-amp and --native-amp, add --amp-impl to select apex, and --amp-dtype to allow bfloat16 AMP dtype 2022-10-07 15:27:25 -07:00
Ross Wightman
b1b024dfed Scheduler update, add v2 factory method, support scheduling on updates instead of just epochs. Add LR to summary csv. Add lr_base scaling calculations to train script. Fix #1168 2022-10-07 10:43:04 -07:00
Ross Wightman
b8c8550841 Data improvements. Improve train support for in_chans != 3. Add wds dataset support from bits_and_tpu branch w/ fixes and tweaks. TFDS tweaks. 2022-09-29 16:42:58 -07:00
Ross Wightman
87939e6fab Refactor device handling in scripts, distributed init to be less 'cuda' centric. More device args passed through where needed. 2022-09-23 16:08:59 -07:00
Ross Wightman
ff6a919cf5 Add --fast-norm arg to benchmark.py, train.py, validate.py 2022-08-25 17:20:46 -07:00
Xiao Wang
11060f84c5 make train.py compatible with torchrun 2022-07-07 14:44:55 -07:00
Ross Wightman
a29fba307d disable dist_bn when sync_bn active 2022-06-24 21:30:17 -07:00
Ross Wightman
879df47c0a Support BatchNormAct2d for sync-bn use. Fix #1254 2022-06-24 14:51:26 -07:00
Ross Wightman
037e5e6c09 Fix #1309, move wandb init after distributed init, only init on rank == 0 process 2022-06-21 12:32:40 -07:00
Jakub Kaczmarzyk
9e12530433 use utils namespace instead of function/class names
This fixes buggy behavior introduced by
https://github.com/rwightman/pytorch-image-models/pull/1266.

Related to https://github.com/rwightman/pytorch-image-models/pull/1273.
2022-06-12 22:39:41 -07:00
Xiao Wang
ca991c1fa5 add --aot-autograd 2022-06-07 18:01:52 -07:00
Ross Wightman
fd360ac951
Merge pull request #1266 from kaczmarj/enh/no-star-imports
ENH: replace star imports with imported names in train.py
2022-05-20 08:55:07 -07:00
Jakub Kaczmarzyk
ce5578bc3a replace star imports with imported names 2022-05-18 11:04:10 -04:00
Jakub Kaczmarzyk
dcad288fd6 use argparse groups to group arguments 2022-05-18 10:27:33 -04:00
Jakub Kaczmarzyk
e1e4c9bbae rm whitespace 2022-05-18 10:17:02 -04:00
han
a16171335b fix: change milestones to decay-milestones
- change argparser option `milestones` to `decay-milestones`
2022-05-10 07:57:19 +09:00