Alina Imtiaz
|
165c3dea98
|
Add CODE_OF_CONDUCT.md and CITATION.cff files
|
2024-11-17 11:43:39 -08:00 |
Antoine Broyelle
|
74196aceda
|
Add py.typed file as recommended by PEP 561
|
2024-11-14 11:26:00 -08:00 |
Ross Wightman
|
e35ea733ab
|
Fix compiler check for adopt so it doesn't fail for torch >= 2 but less than recent with .is_compiling()
|
2024-11-13 11:24:01 -08:00 |
Ross Wightman
|
0b5264a108
|
Missing optimizers in __init__.py, add bind_defaults=False for unit tests
|
2024-11-13 10:50:46 -08:00 |
Ross Wightman
|
d0161f303a
|
Small optim factory tweak. default bind_defaults=True for get_optimizer_class
|
2024-11-13 10:45:48 -08:00 |
Ross Wightman
|
ef062eefe3
|
Update README.md
|
2024-11-13 10:21:51 -08:00 |
Ross Wightman
|
3bef09f831
|
Tweak a few docstrings
|
2024-11-13 10:12:31 -08:00 |
Ross Wightman
|
015ac30a91
|
Update README.md
|
2024-11-13 08:20:20 -08:00 |
Ross Wightman
|
8b9b6824ae
|
Minor changes, has_eps=False missing for bnb lion
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
61305cc26a
|
Fix adopt descriptions
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
ce42cc4846
|
Another doc class typo
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
dde990785e
|
More fixes for new factory & tests, add back adahessian
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
45490ac52f
|
Post merge fix reference of old param groups helper fn locations
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
53657a31b7
|
Try to fix documentation build, add better docstrings to public optimizer api
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
ee5f6e76bb
|
A bit of an optimizer overhaul, added an improved factory, list_optimizers, class helper and add info classes with descriptions, arg configs
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
c1cf8c52b9
|
Update adafactor comments / attrib
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
94e0560aba
|
Remove an indent level in init_group for adopt, update optim tests, adopt failing rosenbrock
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
ff136b8d3a
|
Fix ADOPT on older PyTorch (tested back to 1.13)
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
79abc25f55
|
Add ADOPT optimizer
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
36a45e5d94
|
Improve row/col dim var name
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
e7b0480381
|
Cleanup original adafactor impl, add row/col dim heuristic that works with both conv and linear layers
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
1409ce2dbe
|
Change eps defaults in adafactor_bv again after some checking
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
9d8ccd2ba7
|
A bit of lars/lamb cleanup, torch.where supports scalars properly now, make lamb grad clipping optional, clean it up a bit
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
7cfaeced67
|
Change adafactor_bv epsilon default
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
0b5ae49251
|
Remove adafactorbv numpy dep, hack fix for loading optimizer state w/ half prec momentum (need better one)
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
19090ea966
|
Need to init momentum with correct dtype
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
484a88f4b4
|
Remove unused beta2 fn, make eps grad^2 handling same across factorized and non-factorized cases
|
2024-11-12 20:49:01 -08:00 |
Ross Wightman
|
7c16adca83
|
An impl of adafactor as per big vision (scaling vit) changes
|
2024-11-12 20:49:01 -08:00 |
mrT23
|
e31e5d2d64
|
imports
|
2024-11-12 07:53:39 -08:00 |
Tal
|
68d5a64e45
|
extend existing unittests
|
2024-11-12 07:53:39 -08:00 |
Ross Wightman
|
9f5c279bad
|
Update log to describe scheduling behaviour diff w/ warmup_prefix
|
2024-11-08 11:01:11 -08:00 |
Ross Wightman
|
363b043c13
|
Extend train epoch schedule by warmup_epochs if warmup_prefix enabled, allows schedule to reach end w/ prefix enabled
|
2024-11-08 11:01:11 -08:00 |
Augustin Godinot
|
2dff16fa58
|
Add --dataset-trust-remote-code to the train.py and validate.py scripts
|
2024-11-08 18:15:10 +01:00 |
Augustin Godinot
|
7f0c1b1f30
|
Add trust_remote_code argument to ReaderHfds
|
2024-11-08 08:16:36 -08:00 |
Wojtek Jasiński
|
eb94efb218
|
fix pos embed dynamic resampling for eva
|
2024-11-06 16:03:27 -08:00 |
Wojtek Jasiński
|
3c7822c621
|
fix pos embed dynamic resampling for deit
|
2024-11-06 16:03:27 -08:00 |
Wojtek Jasiński
|
3ae3f44288
|
Fix positional embedding resampling for non-square inputs in ViT
|
2024-11-06 16:03:27 -08:00 |
Josua Rieder
|
51ac8d2efb
|
fix typo in train.py: bathes > batches
|
2024-11-05 08:53:55 -08:00 |
Josua Rieder
|
7e5477acf5
|
Replace deprecated positional argument with --data-dir
|
2024-11-05 08:53:36 -08:00 |
Ross Wightman
|
d4dde48dd5
|
Missed first_conv from resnet18d
|
2024-10-31 19:29:53 -07:00 |
Ross Wightman
|
e6263bf64d
|
Add resnet and resnet-v2 18/34 weights trained with mnv4 small based recipe
|
2024-10-31 16:39:35 -07:00 |
Ross Wightman
|
f5b58e31a2
|
Allow non train mode for wds reader to operate w/o sample count, exhaust iterator
|
2024-10-31 16:39:35 -07:00 |
Ross Wightman
|
f689c850b9
|
One more small c&p issue
|
2024-10-23 21:51:09 -07:00 |
Ross Wightman
|
baa7242dd3
|
Fix c&p error, slight reformat
|
2024-10-23 21:51:09 -07:00 |
Ross Wightman
|
1b5cae681c
|
Update some clip pretrained weights to point to new hub locations, add a few missing weights
|
2024-10-23 21:51:09 -07:00 |
Ross Wightman
|
310ffa32c5
|
Update version.py
dev version 1.0.12.dev0
|
2024-10-19 09:56:17 -07:00 |
Ross Wightman
|
c93567280f
|
Update README.md
|
2024-10-19 08:23:54 -07:00 |
Ross Wightman
|
5081b53e48
|
Merge pull request #2308 from huggingface/device_amp_cleanup
Cleanup some amp related behaviour to better support different (non-cuda) devices
|
2024-10-19 08:19:27 -07:00 |
Ross Wightman
|
c3992d5c4c
|
Remove extra space
|
2024-10-18 14:54:16 -07:00 |
Ross Wightman
|
015fbe457a
|
Merge branch 'MengqingCao-npu_support' into device_amp_cleanup
|
2024-10-18 14:50:44 -07:00 |