Ross Wightman | afdf11d9ae | Add caution to Adan. Add decoupled decay option to LAMB. | 2024-12-05 13:50:30 -08:00
Ross Wightman | 303f7691a1 | Add cautious MARS, improve test reliability by skipping grad diff for first step | 2024-12-02 11:29:02 -08:00
Ross Wightman | 82e8677690 | Make LaProp weight decay match typical PyTorch 'decoupled' behaviour where it's scaled by LR | 2024-11-29 16:44:43 -08:00
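The 'decoupled' behaviour referenced in the LaProp commit can be sketched framework-free: the decay shrinks the weight directly, scaled by the learning rate, instead of being folded into the gradient. The function name and plain-float form below are illustrative, not timm's actual LaProp code.

```python
def decoupled_weight_decay_step(param, grad, lr, weight_decay):
    # Decoupled ("AdamW-style") decay: shrink the weight directly,
    # scaled by lr, instead of adding weight_decay * param to the gradient.
    param = param * (1.0 - lr * weight_decay)
    # ... then apply the optimizer's own update from `grad` (plain SGD here).
    return param - lr * grad
```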
Ross Wightman | 886eb77938 | Update README, missed small discrepancy in adafactor min dim update | 2024-11-29 10:57:47 -08:00
Ross Wightman | e3e434bbc4 | To be technically correct, need to check the in-place `_` version of the op | 2024-11-28 15:11:58 -08:00
Ross Wightman | 7c32d3bd82 | Work around `_foreach_maximum` issue; need scalar `other` support | 2024-11-28 15:11:58 -08:00
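The workaround above reflects a general pattern: when a fused list op lacks a scalar `other` overload, fall back to an elementwise maximum per tensor. A plain-Python stand-in (nested lists play the role of tensor lists; nothing here is timm's actual code):

```python
def maximum_scalar_fallback(tensor_lists, other):
    # Fallback when a fused foreach-maximum op lacks a scalar `other`
    # overload: apply an elementwise max per tensor instead.
    return [[max(v, other) for v in t] for t in tensor_lists]
```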
Ross Wightman | 7cf683628f | Cautious optimizer implementation plus some typing cleanup | 2024-11-28 15:11:58 -08:00
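The cautious-optimizer idea referenced here masks out update components whose sign disagrees with the gradient, then rescales to keep the average update magnitude. A plain-float sketch of that masking step (not timm's actual implementation):

```python
def cautious_update(update, grad):
    # Zero components of the update that disagree in sign with the
    # gradient, then rescale by the inverse of the surviving fraction.
    mask = [1.0 if u * g > 0 else 0.0 for u, g in zip(update, grad)]
    mean = sum(mask) / len(mask)
    scale = 1.0 / max(mean, 1e-3)  # avoid div-by-zero when all masked
    return [u * m * scale for u, m in zip(update, mask)]
```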
Ross Wightman | 4f64ec4e14 | Add guard around 'somewhat' newer torch RAdam / NAdam imports | 2024-11-26 15:10:15 -08:00
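The guard pattern here is a defensive import: optimizers that only exist in newer torch releases are imported in a `try` block, with a flag recording whether they are available. A sketch (the flag name is illustrative):

```python
# Guarded import: torch.optim.RAdam / NAdam only exist in newer releases,
# so probe for them and let the caller fall back to a local implementation.
try:
    from torch.optim import NAdam, RAdam
    HAS_TORCH_RADAM_NADAM = True
except ImportError:
    NAdam = RAdam = None  # caller should use a local fallback
    HAS_TORCH_RADAM_NADAM = False
```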
Ross Wightman | 1ab02a11a1 | Update Adan with newer implementation (from the original source) that includes a multi-tensor fn | 2024-11-26 15:10:15 -08:00
Ross Wightman | a024ab3170 | Replace RAdam & NAdam implementations with the torch.optim versions; rename legacy AdamW, NAdam, RAdam implementations in timm. Update optimizer factory & tests. | 2024-11-26 15:10:15 -08:00
Ross Wightman | 7b54eab807 | Add MARS and LaProp implementations, simplified from the originals | 2024-11-26 15:10:15 -08:00
Ross Wightman | e5aea357b1 | Update ADOPT to include clipping for stability; separate weight decay so params don't decay if the update isn't taken on the first step | 2024-11-26 15:10:15 -08:00
Ross Wightman | e35ea733ab | Fix compiler check for ADOPT so it doesn't fail with `.is_compiling()` on torch >= 2 but older than recent releases | 2024-11-13 11:24:01 -08:00
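The compiler-check fix boils down to probing for `torch.compiler.is_compiling()` rather than calling it unconditionally, since the function is missing on some torch >= 2 releases. A hedged sketch of that probe (helper name is illustrative):

```python
def is_compiling():
    # torch.compiler.is_compiling() only exists in recent torch releases;
    # probe for it so torch >= 2 but older builds don't crash.
    try:
        import torch
    except ImportError:
        return False  # no torch at all: certainly not compiling
    compiler = getattr(torch, "compiler", None)
    fn = getattr(compiler, "is_compiling", None) if compiler else None
    return bool(fn()) if fn is not None else False
```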
Ross Wightman | 0b5264a108 | Add missing optimizers in `__init__.py`; add `bind_defaults=False` for unit tests | 2024-11-13 10:50:46 -08:00
Ross Wightman | d0161f303a | Small optimizer factory tweak; default `bind_defaults=True` for `get_optimizer_class` | 2024-11-13 10:45:48 -08:00
Ross Wightman | 8b9b6824ae | Minor changes; `has_eps=False` was missing for bnb Lion | 2024-11-12 20:49:01 -08:00
Ross Wightman | 61305cc26a | Fix ADOPT descriptions | 2024-11-12 20:49:01 -08:00
Ross Wightman | dde990785e | More fixes for new factory & tests; add back Adahessian | 2024-11-12 20:49:01 -08:00
Ross Wightman | 45490ac52f | Post-merge fix for references to old param groups helper fn locations | 2024-11-12 20:49:01 -08:00
Ross Wightman | 53657a31b7 | Try to fix documentation build; add better docstrings to the public optimizer API | 2024-11-12 20:49:01 -08:00
Ross Wightman | ee5f6e76bb | A bit of an optimizer overhaul: added an improved factory, `list_optimizers`, a class helper, and info classes with descriptions and arg configs | 2024-11-12 20:49:01 -08:00
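The factory overhaul described above pairs a registry of per-optimizer info records with lookup helpers; `bind_defaults` controls whether registered default kwargs are baked into the returned class. A minimal self-contained sketch in that spirit (all names and fields here are hypothetical, not timm's exact API):

```python
from dataclasses import dataclass, field
from functools import partial
from typing import Any, Callable, Dict, List

@dataclass
class OptimInfo:
    # Hypothetical info record: name, class, description, default kwargs.
    name: str
    opt_class: Callable
    description: str = ""
    defaults: Dict[str, Any] = field(default_factory=dict)

_registry: Dict[str, OptimInfo] = {}

def register_optimizer(info: OptimInfo) -> None:
    _registry[info.name.lower()] = info

def list_optimizers(filter_substr: str = "") -> List[str]:
    return sorted(n for n in _registry if filter_substr in n)

def get_optimizer_class(name: str, bind_defaults: bool = True):
    # With bind_defaults=True, registered defaults are pre-bound via partial.
    info = _registry[name.lower()]
    if bind_defaults and info.defaults:
        return partial(info.opt_class, **info.defaults)
    return info.opt_class
```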
Ross Wightman | c1cf8c52b9 | Update adafactor comments / attribution | 2024-11-12 20:49:01 -08:00
Ross Wightman | 94e0560aba | Remove an indent level in `init_group` for ADOPT; update optimizer tests, ADOPT failing Rosenbrock | 2024-11-12 20:49:01 -08:00
Ross Wightman | ff136b8d3a | Fix ADOPT on older PyTorch (tested back to 1.13) | 2024-11-12 20:49:01 -08:00
Ross Wightman | 79abc25f55 | Add ADOPT optimizer | 2024-11-12 20:49:01 -08:00
Ross Wightman | 36a45e5d94 | Improve row/col dim var name | 2024-11-12 20:49:01 -08:00
Ross Wightman | e7b0480381 | Clean up original adafactor implementation; add row/col dim heuristic that works with both conv and linear layers | 2024-11-12 20:49:01 -08:00
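A row/col heuristic of the kind described above picks two axes to factor the second-moment estimate over; choosing the two largest axes (with a minimum-size cutoff) works for both 2D linear weights and 4D conv weights. A sketch loosely following that idea (function name and defaults are illustrative):

```python
def factored_dims(shape, factored=True, min_dim_size_to_factor=32):
    # Return the (row, col) axis indices to factor over, or None if the
    # tensor is too small / low-rank to be worth factoring. Picking the
    # two largest axes handles linear (2D) and conv (4D) weights alike.
    if not factored or len(shape) < 2:
        return None
    sorted_dims = sorted(range(len(shape)), key=lambda i: shape[i])
    if shape[sorted_dims[-2]] < min_dim_size_to_factor:
        return None
    return int(sorted_dims[-2]), int(sorted_dims[-1])
```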
Ross Wightman | 1409ce2dbe | Change eps defaults in adafactor_bv again after some checking | 2024-11-12 20:49:01 -08:00
Ross Wightman | 9d8ccd2ba7 | A bit of LARS/LAMB cleanup: torch.where supports scalars properly now; make LAMB grad clipping optional and clean it up a bit | 2024-11-12 20:49:01 -08:00
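Optional grad clipping of the kind made toggleable for LAMB here is typically global-norm clipping that is skipped when no max norm is set. A plain-float sketch of that pattern (not timm's actual LAMB code):

```python
import math

def maybe_clip_grads(grads, max_norm=None):
    # Optional global-norm clipping over a list of flat gradient lists;
    # a no-op when max_norm is None or the norm is already small enough.
    if max_norm is None:
        return grads
    total = math.sqrt(sum(g * g for t in grads for g in t))
    if total <= max_norm:
        return grads
    scale = max_norm / (total + 1e-6)
    return [[g * scale for g in t] for t in grads]
```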
Ross Wightman | 7cfaeced67 | Change adafactor_bv epsilon default | 2024-11-12 20:49:01 -08:00
Ross Wightman | 0b5ae49251 | Remove adafactor_bv numpy dep; hack fix for loading optimizer state w/ half-precision momentum (need a better one) | 2024-11-12 20:49:01 -08:00
Ross Wightman | 19090ea966 | Need to init momentum with the correct dtype | 2024-11-12 20:49:01 -08:00
Ross Wightman | 484a88f4b4 | Remove unused beta2 fn; make eps grad^2 handling the same across factorized and non-factorized cases | 2024-11-12 20:49:01 -08:00
Ross Wightman | 7c16adca83 | An implementation of adafactor as per the big vision (scaling ViT) changes | 2024-11-12 20:49:01 -08:00
Ross Wightman | 711c5dee6d | Update SGDW for older PyTorch | 2023-12-11 12:10:29 -08:00
Ross Wightman | 17a47c0e35 | Add SGDW optimizer | 2023-12-11 12:10:29 -08:00
alec.tu | 942726db31 | Import Lion in `__init__.py` | 2023-07-27 09:26:57 +08:00
Ross Wightman | 2d597b126d | Missed extra NAdam algo step for the capturable path | 2023-06-13 20:51:31 -07:00
Ross Wightman | 4790c0fa16 | Missed nadamw.py | 2023-06-13 20:45:58 -07:00
Ross Wightman | dab0360e00 | Add NAdamW based on the MLCommons algorithm; added multi-tensor step | 2023-06-13 20:45:17 -07:00
Ross Wightman | 700aebcdc4 | Fix PyTorch 2.0 breakage for the Lookahead optimizer adapter | 2023-06-02 08:39:07 -07:00
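The Lookahead adapter mentioned above wraps an inner "fast" optimizer and keeps a second set of "slow" weights: every k fast steps, the slow weights are interpolated toward the fast ones and the fast weights are reset to them. A plain-float sketch of that bookkeeping (function shape is illustrative, not timm's adapter API):

```python
def lookahead_step(slow, fast, step_count, k=6, alpha=0.5):
    # One Lookahead bookkeeping step: every k fast-optimizer steps,
    # move slow weights toward fast by alpha and sync fast back to slow.
    step_count += 1
    if step_count % k == 0:
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
        fast = list(slow)
    return slow, fast, step_count
```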
Ross Wightman | 7cea88e2c4 | Pop eps for Lion optimizer | 2023-05-21 15:20:03 -07:00
Ross Wightman | e3363a7159 | Support bitsandbytes optimizers in the factory | 2023-05-09 11:33:51 -07:00
Ross Wightman | f35d6ea57b | Add multi-tensor (foreach) version of Lion in the style of the upcoming PyTorch 2.0 optimizers | 2023-02-16 15:48:00 -08:00
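Lion's per-parameter update is the sign of an interpolation between momentum and gradient, with decoupled weight decay; the foreach version batches this across tensors. A single-parameter plain-float sketch of the algorithm (not the multi-tensor implementation itself):

```python
def lion_step(param, grad, exp_avg, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    # Lion: update direction is the sign of a beta1-interpolation of
    # momentum and gradient; momentum itself is updated with beta2 after.
    sign = lambda x: (x > 0) - (x < 0)
    update = sign(beta1 * exp_avg + (1 - beta1) * grad)
    param = param * (1 - lr * wd) - lr * update  # decoupled weight decay
    exp_avg = beta2 * exp_avg + (1 - beta2) * grad
    return param, exp_avg
```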
Ross Wightman | 709d5e0d9d | Add Lion optimizer | 2023-02-14 23:55:05 -08:00
alec.tu | 74d6afb4cd | Add Adan to `__init__.py` | 2022-12-15 11:37:29 +08:00
Ross Wightman | 927f031293 | Major module / path restructure: timm.models.layers -> timm.layers; add `_` prefix to all non-model modules in timm.models | 2022-12-06 15:00:06 -08:00
Ross Wightman | b1b024dfed | Scheduler update: add v2 factory method, support scheduling on updates instead of just epochs. Add LR to summary csv. Add lr_base scaling calculations to train script. Fix #1168 | 2022-10-07 10:43:04 -07:00
Ross Wightman | 2a296412be | Add Adan optimizer | 2022-09-23 16:05:52 -07:00
Ross Wightman | 33e30f8c8b | Remove layer-decay print | 2022-09-18 21:33:03 -07:00