Commit Graph

2575 Commits (a024ab3170c5d62c2ae5b28b60e980d5976abf97)

Author SHA1 Message Date
Ross Wightman a024ab3170 Replace radam & nadam impl with torch.optim ver, rename legacy adamw, nadam, radam impl in timm. Update optim factory & tests. 2024-11-26 15:10:15 -08:00
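After this change the plain names should resolve to the PyTorch implementations through the optimizer factory. A minimal sketch using timm's `create_optimizer_v2` entry point; that 'radam' now maps to `torch.optim.RAdam` is inferred from the commit message, and the exact legacy names are not shown here:

```python
import torch
import torch.nn as nn
from timm.optim import create_optimizer_v2

model = nn.Linear(8, 2)

# 'radam' / 'nadam' should now resolve to the torch.optim implementations,
# with the old timm code kept under renamed legacy entries.
opt = create_optimizer_v2(model, opt='radam', lr=1e-3)
print(type(opt))  # expected after this change: torch.optim.RAdam
```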
Ross Wightman 7b54eab807 Add MARS and LaProp impl, simplified from originals 2024-11-26 15:10:15 -08:00
Ross Wightman e5aea357b1 Update Adopt to include clipping for stability, separate wd so no param decay if update not taken on first step 2024-11-26 15:10:15 -08:00
Ross Wightman 444c506ce3 Merge pull request #2346 from JohannesTheo/patch-1: Update timm torchvision resnet weight urls to the updated urls in torchvision 2024-11-26 11:15:17 -08:00
Johannes 093a234d01 Update torchvision resnet legacy weight urls in resnet.py 2024-11-26 15:53:54 +01:00
Ross Wightman 2fcf73e580 Add mini imagenet info files 2024-11-25 10:53:28 -08:00
Ross Wightman 900d2b508d add mnv4 conv_medium in12k -> in1k ft 2024-11-22 16:31:45 -08:00
Ross Wightman 6bcbdbfe41 CS3-DarkNet Small (Focus) w/ RA4 recipe. Fix #2122 2024-11-22 16:31:45 -08:00
Sina Hajimiri 3a6cc4fb17 Improve wandb logging 2024-11-20 21:04:07 -08:00
Ross Wightman 620cb4f3cb Improve the parsable results dump at end of train, stop excessive output, only display top-10. 2024-11-20 16:47:06 -08:00
Ross Wightman 36b5d1adaa In dist training, update loss running avg every step, only sync on log updates / final. 2024-11-20 16:47:06 -08:00
Ross Wightman ae0737f5d0 Typo 2024-11-17 13:54:50 -08:00
Ross Wightman 84049d7f1e Missed input_size pretrained_cfg metadata for v2 34d @ 384 2024-11-17 12:44:08 -08:00
Ross Wightman b7a4b49ae6 Add some 384x384 small model weights, 3 variants of mnv4 conv medium on in12k pretrain, and resnetv2-34d on in1k 2024-11-17 12:14:39 -08:00
Alina facae65947 Update CODE_OF_CONDUCT.md 2024-11-17 11:43:39 -08:00
Alina Imtiaz 165c3dea98 Add CODE_OF_CONDUCT.md and CITATION.cff files 2024-11-17 11:43:39 -08:00
Antoine Broyelle 74196aceda Add py.typed file as recommended by PEP 561 2024-11-14 11:26:00 -08:00
Ross Wightman e35ea733ab Fix compiler check for adopt so it doesn't fail on torch >= 2 versions that predate .is_compiling() 2024-11-13 11:24:01 -08:00
Ross Wightman 0b5264a108 Missing optimizers in __init__.py, add bind_defaults=False for unit tests 2024-11-13 10:50:46 -08:00
Ross Wightman d0161f303a Small optim factory tweak. default bind_defaults=True for get_optimizer_class 2024-11-13 10:45:48 -08:00
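A sketch of the `get_optimizer_class` behavior the two commits above describe; the exact shape of the bound-defaults return value is an assumption:

```python
import torch.nn as nn
from timm.optim import get_optimizer_class

model = nn.Linear(8, 2)

# With bind_defaults=True (now the default) the registry's default arguments
# are pre-bound to the returned class; bind_defaults=False yields the bare
# optimizer class, which is what the unit tests use.
opt_cls = get_optimizer_class('adamw', bind_defaults=False)
optimizer = opt_cls(model.parameters(), lr=1e-3)
```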
Ross Wightman ef062eefe3 Update README.md 2024-11-13 10:21:51 -08:00
Ross Wightman 3bef09f831 Tweak a few docstrings 2024-11-13 10:12:31 -08:00
Ross Wightman 015ac30a91 Update README.md 2024-11-13 08:20:20 -08:00
Ross Wightman 8b9b6824ae Minor changes, has_eps=False missing for bnb lion 2024-11-12 20:49:01 -08:00
Ross Wightman 61305cc26a Fix adopt descriptions 2024-11-12 20:49:01 -08:00
Ross Wightman ce42cc4846 Another doc class typo 2024-11-12 20:49:01 -08:00
Ross Wightman dde990785e More fixes for new factory & tests, add back adahessian 2024-11-12 20:49:01 -08:00
Ross Wightman 45490ac52f Post merge fix reference of old param groups helper fn locations 2024-11-12 20:49:01 -08:00
Ross Wightman 53657a31b7 Try to fix documentation build, add better docstrings to public optimizer api 2024-11-12 20:49:01 -08:00
Ross Wightman ee5f6e76bb A bit of an optimizer overhaul, added an improved factory, list_optimizers, class helper and add info classes with descriptions, arg configs 2024-11-12 20:49:01 -08:00
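A hedged sketch of the factory surface this overhaul names; `list_optimizers` is taken from the commit message, and the per-optimizer info records with descriptions are assumed to back it:

```python
from timm.optim import list_optimizers

# Enumerate the registered optimizer names; per the commit message each
# registry entry carries an info record with a description and arg config.
for name in list_optimizers():
    print(name)
```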
Ross Wightman c1cf8c52b9 Update adafactor comments / attrib 2024-11-12 20:49:01 -08:00
Ross Wightman 94e0560aba Remove an indent level in init_group for adopt, update optim tests, adopt failing rosenbrock 2024-11-12 20:49:01 -08:00
Ross Wightman ff136b8d3a Fix ADOPT on older PyTorch (tested back to 1.13) 2024-11-12 20:49:01 -08:00
Ross Wightman 79abc25f55 Add ADOPT optimizer 2024-11-12 20:49:01 -08:00
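ADOPT can then be selected through the factory; a minimal sketch, assuming it registers under the name 'adopt':

```python
import torch.nn as nn
from timm.optim import create_optimizer_v2

model = nn.Linear(8, 2)
# 'adopt' (name assumed) selects the newly added ADOPT optimizer
optimizer = create_optimizer_v2(model, opt='adopt', lr=1e-3, weight_decay=0.05)
```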
Ross Wightman 36a45e5d94 Improve row/col dim var name 2024-11-12 20:49:01 -08:00
Ross Wightman e7b0480381 Cleanup original adafactor impl, add row/col dim heuristic that works with both conv and linear layers 2024-11-12 20:49:01 -08:00
Ross Wightman 1409ce2dbe Change eps defaults in adafactor_bv again after some checking 2024-11-12 20:49:01 -08:00
Ross Wightman 9d8ccd2ba7 A bit of lars/lamb cleanup, torch.where supports scalars properly now, make lamb grad clipping optional, clean it up a bit 2024-11-12 20:49:01 -08:00
Ross Wightman 7cfaeced67 Change adafactor_bv epsilon default 2024-11-12 20:49:01 -08:00
Ross Wightman 0b5ae49251 Remove adafactorbv numpy dep, hack fix for loading optimizer state w/ half prec momentum (need better one) 2024-11-12 20:49:01 -08:00
Ross Wightman 19090ea966 Need to init momentum with correct dtype 2024-11-12 20:49:01 -08:00
Ross Wightman 484a88f4b4 Remove unused beta2 fn, make eps grad^2 handling same across factorized and non-factorized cases 2024-11-12 20:49:01 -08:00
Ross Wightman 7c16adca83 An impl of adafactor as per big vision (scaling vit) changes 2024-11-12 20:49:01 -08:00
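A sketch of selecting this big vision style Adafactor through the factory; the registered name 'adafactorbv' is an assumption based on the module naming in nearby commits:

```python
import torch.nn as nn
from timm.optim import create_optimizer_v2

model = nn.Linear(8, 2)
# 'adafactorbv' (name assumed) selects the big vision variant rather than
# the original timm adafactor implementation
optimizer = create_optimizer_v2(model, opt='adafactorbv', lr=1e-3)
```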
mrT23 e31e5d2d64 imports 2024-11-12 07:53:39 -08:00
Tal 68d5a64e45 extend existing unittests 2024-11-12 07:53:39 -08:00
Ross Wightman 9f5c279bad Update log to describe scheduling behaviour diff w/ warmup_prefix 2024-11-08 11:01:11 -08:00
Ross Wightman 363b043c13 Extend train epoch schedule by warmup_epochs if warmup_prefix enabled, allows schedule to reach end w/ prefix enabled 2024-11-08 11:01:11 -08:00
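The distinction matters because with `warmup_prefix` the warmup is prepended to the schedule rather than counted inside `t_initial`. A sketch with timm's `CosineLRScheduler`; argument values are illustrative:

```python
import torch
import torch.nn as nn
from timm.scheduler import CosineLRScheduler

model = nn.Linear(8, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# warmup_prefix=True prepends warmup_t epochs, so the decay phase still
# spans the full t_initial; per the commit above, the train script now
# extends the epoch count by warmup_epochs so the schedule reaches its end.
scheduler = CosineLRScheduler(
    optimizer,
    t_initial=100,
    warmup_t=5,
    warmup_lr_init=1e-5,
    warmup_prefix=True,
)
```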
Augustin Godinot 7f0c1b1f30 Add trust_remote_code argument to ReaderHfds 2024-11-08 08:16:36 -08:00
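A sketch of passing the new argument through the dataset factory, which routes 'hfds/' names to ReaderHfds; the dataset id is hypothetical and the kwarg pass-through is an assumption:

```python
from timm.data import create_dataset

# The 'hfds/' prefix selects the Hugging Face datasets reader (ReaderHfds);
# trust_remote_code should be forwarded to datasets.load_dataset.
ds = create_dataset(
    'hfds/some-org/some-dataset',  # hypothetical dataset id
    root=None,
    split='train',
    trust_remote_code=True,
)
```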
Wojtek Jasiński eb94efb218 fix pos embed dynamic resampling for eva 2024-11-06 16:03:27 -08:00
Wojtek Jasiński 3c7822c621 fix pos embed dynamic resampling for deit 2024-11-06 16:03:27 -08:00
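These two fixes are exercised whenever an EVA or DeiT model is built at a non-native resolution, which forces the position embedding to be resampled. A minimal sketch; the model name is illustrative:

```python
import timm

# Building at img_size != 224 resamples the position embedding from the
# native 224 grid, the code path these commits fix for deit and eva.
model = timm.create_model('deit3_small_patch16_224', pretrained=False, img_size=384)
```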