2544 Commits

Author SHA1 Message Date
Ross Wightman
94e0560aba Remove an indent level in init_group for adopt, update optim tests, adopt failing rosenbrock 2024-11-12 20:49:01 -08:00
Ross Wightman
ff136b8d3a Fix ADOPT on older PyTorch (tested back to 1.13) 2024-11-12 20:49:01 -08:00
Ross Wightman
79abc25f55 Add ADOPT optimizer 2024-11-12 20:49:01 -08:00
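The three ADOPT commits above add the ADOPT optimizer (Taniguchi et al., 2024). A minimal single-scalar sketch of the basic update rule, for illustration only: the key difference from Adam is that the gradient is normalized by the *previous* second-moment estimate before entering the momentum buffer, which is what removes the convergence dependence on beta2. Function name and structure here are ours; timm's real implementation handles tensors, weight decay, and a clipped variant.

```python
import math

def adopt_step(param, grad, m, v, step, lr=1e-3, beta1=0.9, beta2=0.9999, eps=1e-6):
    """One ADOPT step for a single scalar parameter (illustrative sketch).

    step 0 only initializes the second-moment estimate; from step 1 on,
    the gradient is divided by sqrt(v_{t-1}) before the momentum update,
    and v is refreshed with the raw squared gradient afterwards.
    """
    if step == 0:
        return param, m, grad * grad
    denom = max(math.sqrt(v), eps)
    m = beta1 * m + (1.0 - beta1) * (grad / denom)
    param = param - lr * m
    v = beta2 * v + (1.0 - beta2) * grad * grad
    return param, m, v
```

Running this on f(x) = x^2 from x = 1 converges toward 0, since each normalized gradient has roughly unit scale regardless of the raw gradient magnitude.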
Ross Wightman
36a45e5d94 Improve row/col dim var name 2024-11-12 20:49:01 -08:00
Ross Wightman
e7b0480381 Cleanup original adafactor impl, add row/col dim heuristic that works with both conv and linear layers 2024-11-12 20:49:01 -08:00
Ross Wightman
1409ce2dbe Change eps defaults in adafactor_bv again after some checking 2024-11-12 20:49:01 -08:00
Ross Wightman
9d8ccd2ba7 A bit of lars/lamb cleanup, torch.where supports scalars properly now, make lamb grad clipping optional, clean it up a bit 2024-11-12 20:49:01 -08:00
Ross Wightman
7cfaeced67 Change adafactor_bv epsilon default 2024-11-12 20:49:01 -08:00
Ross Wightman
0b5ae49251 Remove adafactorbv numpy dep, hack fix for loading optimizer state w/ half prec momentum (need better one) 2024-11-12 20:49:01 -08:00
Ross Wightman
19090ea966 Need to init momentum with correct dtype 2024-11-12 20:49:01 -08:00
Ross Wightman
484a88f4b4 Remove unused beta2 fn, make eps grad^2 handling same across factorized and non-factorized cases 2024-11-12 20:49:01 -08:00
Ross Wightman
7c16adca83 An impl of adafactor as per big vision (scaling vit) changes 2024-11-12 20:49:01 -08:00
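The adafactor_bv commits above port the big-vision (scaling ViT) Adafactor variant, whose central trick is the factored second moment: instead of an O(rows*cols) buffer of squared gradients, it keeps per-row and per-column running means and reconstructs the full estimate on the fly. A sketch of just the reconstruction, on plain nested lists; the function name is ours, not timm's internal API.

```python
def factored_vhat(row_avg, col_avg):
    """Rebuild the full second-moment estimate from its row/col factors.

    v_hat[i][j] = row_avg[i] * col_avg[j] / mean(row_avg)

    If the true squared-gradient matrix is rank one, this reconstruction
    is exact; otherwise it is the closest rank-one fit Adafactor tracks.
    """
    row_mean = sum(row_avg) / len(row_avg)
    return [[r * c / row_mean for c in col_avg] for r in row_avg]
```

The "row/col dim heuristic" commits above concern which two tensor dimensions play the row/col roles, so the same factorization applies to both conv and linear weights.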
mrT23
e31e5d2d64 imports 2024-11-12 07:53:39 -08:00
Tal
68d5a64e45 extend existing unittests 2024-11-12 07:53:39 -08:00
Ross Wightman
9f5c279bad Update log to describe scheduling behaviour diff w/ warmup_prefix 2024-11-08 11:01:11 -08:00
Ross Wightman
363b043c13 Extend train epoch schedule by warmup_epochs if warmup_prefix enabled, allows schedule to reach end w/ prefix enabled 2024-11-08 11:01:11 -08:00
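The scheduling behaviour the two commits above describe can be illustrated with a toy cosine schedule (a simplified model of timm's scheduler, not its actual code): with warmup_prefix, the decay phase starts *after* warmup rather than overlapping it, so the loop must run warmup_epochs extra epochs for the cosine to reach its end value.

```python
import math

def cosine_lr(epoch, total_epochs, warmup_epochs, base_lr, warmup_prefix=True):
    """Toy cosine schedule with linear warmup (illustrative only).

    With warmup_prefix=True the cosine spans
    [warmup_epochs, warmup_epochs + total_epochs]; without it, warmup
    overlaps the decay window [0, total_epochs].
    """
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    if warmup_prefix:
        t = (epoch - warmup_epochs) / total_epochs
    else:
        t = epoch / total_epochs
    t = min(t, 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))
```

Stopping a prefix-style run at total_epochs leaves the learning rate above its final value, which is exactly why the train epoch count is extended by warmup_epochs.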
Augustin Godinot
7f0c1b1f30 Add trust_remote_code argument to ReaderHfds 2024-11-08 08:16:36 -08:00
Wojtek Jasiński
eb94efb218 fix pos embed dynamic resampling for eva 2024-11-06 16:03:27 -08:00
Wojtek Jasiński
3c7822c621 fix pos embed dynamic resampling for deit 2024-11-06 16:03:27 -08:00
Wojtek Jasiński
3ae3f44288 Fix positional embedding resampling for non-square inputs in ViT 2024-11-06 16:03:27 -08:00
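The three fixes above concern resampling the learned position-embedding grid when a ViT/DeiT/EVA model sees a different (possibly non-square) input size. The core operation is a bilinear resize of an (old_h, old_w) grid to (new_h, new_w); timm does this per-channel with torch interpolation, but a dependency-free sketch of the align-corners bilinear case (our own helper, for illustration) looks like:

```python
def resample_grid(grid, new_h, new_w):
    """Bilinearly resample a 2-D grid (list of lists of floats) to
    (new_h, new_w), align-corners style. new_h and new_w may differ,
    which is the non-square case the fix above addresses."""
    old_h, old_w = len(grid), len(grid[0])
    out = []
    for i in range(new_h):
        y = i * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(y); y1 = min(y0 + 1, old_h - 1); wy = y - y0
        row = []
        for j in range(new_w):
            x = j * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(x); x1 = min(x0 + 1, old_w - 1); wx = x - x0
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bot = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bot * wy)
        out.append(row)
    return out
```

The bug class being fixed is treating the token sequence as a square grid; tracking height and width separately, as above, handles rectangular inputs.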
Josua Rieder
51ac8d2efb fix typo in train.py: bathes > batches 2024-11-05 08:53:55 -08:00
Josua Rieder
7e5477acf5 Replace deprecated positional argument with --data-dir 2024-11-05 08:53:36 -08:00
Ross Wightman
d4dde48dd5 Missed first_conv from resnet18d 2024-10-31 19:29:53 -07:00
Ross Wightman
e6263bf64d Add resnet and resnet-v2 18/34 weights trained with mnv4 small based recipe 2024-10-31 16:39:35 -07:00
Ross Wightman
f5b58e31a2 Allow non train mode for wds reader to operate w/o sample count, exhaust iterator 2024-10-31 16:39:35 -07:00
Ross Wightman
f689c850b9 One more small c&p issue 2024-10-23 21:51:09 -07:00
Ross Wightman
baa7242dd3 Fix c&p error, slight reformat 2024-10-23 21:51:09 -07:00
Ross Wightman
1b5cae681c Update some clip pretrained weights to point to new hub locations, add a few missing weights 2024-10-23 21:51:09 -07:00
Ross Wightman
310ffa32c5
Update version.py
dev version 1.0.12.dev0
2024-10-19 09:56:17 -07:00
Ross Wightman
c93567280f
Update README.md 2024-10-19 08:23:54 -07:00
Ross Wightman
5081b53e48
Merge pull request #2308 from huggingface/device_amp_cleanup
Cleanup some amp related behaviour to better support different (non-cuda) devices
2024-10-19 08:19:27 -07:00
Ross Wightman
c3992d5c4c Remove extra space 2024-10-18 14:54:16 -07:00
Ross Wightman
015fbe457a Merge branch 'MengqingCao-npu_support' into device_amp_cleanup 2024-10-18 14:50:44 -07:00
Ross Wightman
81b59faf77 Merge branch 'npu_support' of github.com:MengqingCao/pytorch-image-models into MengqingCao-npu_support 2024-10-18 14:50:00 -07:00
Ross Wightman
1766a01f96 Cleanup some amp related behaviour to better support different (non-cuda) devices 2024-10-18 13:54:16 -07:00
MengqingCao
37c731ca37 fix device check 2024-10-17 12:38:02 +00:00
Ross Wightman
a852318b63
Merge pull request #2305 from NightMachinery/patch-2
mambaout.py: fixed bug
2024-10-16 14:39:43 -07:00
Feraidoon Mehri
ca20e102fe
mambaout.py: fixed bug 2024-10-17 01:03:28 +03:30
Ross Wightman
8cb2548962 Version 1.0.11 v1.0.11 2024-10-16 14:14:44 -07:00
Ross Wightman
65e8e9ca12
Merge pull request #2304 from huggingface/intern300m
Add intern300m vit w/ converted timm weights. Fix #2300
2024-10-16 14:00:22 -07:00
Ross Wightman
89dffc5ff0 Another small fix for original mambaout models, no classifier nn.Linear when num_classes=0 on init 2024-10-16 12:36:36 -07:00
Ross Wightman
fad4538801 Elevate import deprecation warnings from DeprecationWarning to FutureWarning so messages are now seen 2024-10-16 11:30:01 -07:00
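The commit above matters because CPython's default warning filters ignore DeprecationWarning outside `__main__`, while FutureWarning is always shown, so deprecated-import messages were silently swallowed. A small demonstration of that default-filter difference (the helper name is ours):

```python
import warnings

def visible_by_default(category):
    """Return True if a warning of this category survives filters that
    mimic CPython's defaults, which ignore DeprecationWarning raised
    outside __main__ but let FutureWarning through."""
    with warnings.catch_warnings(record=True) as rec:
        warnings.resetwarnings()
        warnings.filterwarnings('ignore', category=DeprecationWarning)
        warnings.warn('deprecated import path', category)
        return len(rec) == 1
```

Elevating library deprecation notices to FutureWarning is a common pattern precisely because end users rarely run with `-W` flags that would surface DeprecationWarning.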
Ross Wightman
a1f379e712 Add intern300m vit w/ converted timm weights. Fix #2300 2024-10-16 10:29:06 -07:00
MengqingCao
234f975787 add npu support 2024-10-16 07:13:45 +00:00
Ross Wightman
60f517c883 Fix wrong name in __all__ for models._registry 2024-10-15 07:39:46 -07:00
Ross Wightman
b4a9a166c3 Version 1.0.10 v1.0.10 2024-10-14 21:40:30 -07:00
Ross Wightman
c3052fa19e
Merge pull request #2298 from huggingface/preact_resnet18
Add resnet18/18d pre-act model configs for potential training.
2024-10-14 19:39:04 -07:00
Ross Wightman
abdf33145c Add 34/34d pre-act resnet variants 2024-10-14 13:23:50 -07:00
Ross Wightman
c82ce86f8f Add 384x384 mambaout_base_plus model weights 2024-10-14 12:28:57 -07:00
Ross Wightman
2703d155c8
Update README.md 2024-10-11 16:59:06 -07:00