Ross Wightman
|
de35fd87f5
|
Add SimpleNorm to create_norm factory
|
2024-12-30 19:24:21 -08:00 |
Ross Wightman
|
d5375ca769
|
Use torch F.rms_norm when possible, select fast vs normal paths appropriately and test with torchscript
|
2024-12-30 19:24:21 -08:00 |
Ross Wightman
|
5f12a25114
|
Add bias arg to Vitamin GeGLU
|
2024-12-30 19:24:21 -08:00 |
Ross Wightman
|
5804d92e4b
|
Switch aimv2 to use packed SwiGLU
|
2024-12-30 19:24:21 -08:00 |
Ross Wightman
|
15406a939e
|
Fixing RmsNorm to fix #2380 and noticed with aimv2 when comparing outputs. Still some work to do, need to look at AMP / fast mode behaviour, dispatch to torch when possible. Add SimpleNorm for 'LayerNorm w/o centering and bias'
|
2024-12-30 19:24:21 -08:00 |
Ross Wightman
|
a648a04834
|
Supporting aimv2 encoders
|
2024-12-30 19:24:21 -08:00 |
ariG23498
|
3a6661ac78
|
fix broken image link
|
2024-12-30 07:38:31 -08:00 |
Ross Wightman
|
790decc89b
|
Add more pali(2) weights. Switch rest of models adapting open_clip weights to their own weight instances.
|
2024-12-27 14:00:41 -08:00 |
Ross Wightman
|
01cf0f72af
|
Add support for tag, license customization through push_to_hub
|
2024-12-27 14:00:41 -08:00 |
Ross Wightman
|
b12ecbd614
|
Move siglip timm weights to own repos
|
2024-12-27 14:00:41 -08:00 |
Ross Wightman
|
6fb7aaf37d
|
Switching to timm specific weight instances for open_clip image encoders to facilitate hf-hub: use in timm and new transformers TimmWrapper
|
2024-12-27 14:00:41 -08:00 |
Ross Wightman
|
364c567dd2
|
Merge pull request #2357 from huggingface/more_opt_stuff
Add caution to Adan. Add decouple decay option to LAMB.
|
2024-12-27 12:54:02 -08:00 |
Ross Wightman
|
a02b1a8e79
|
Merge pull request #2369 from brianhou0208/fix_reduction
Fix feature_info.reduction
|
2024-12-18 16:51:53 -08:00 |
Ryan
|
ab0a70dfff
|
fix feature_info.reduction
|
2024-12-18 21:12:40 +08:00 |
Ross Wightman
|
ea231079f5
|
Merge pull request #2361 from huggingface/grodino-dataset_trust_remote
Dataset trust remote tweaks
|
2024-12-06 12:06:56 -08:00 |
Ross Wightman
|
7573096eb8
|
Make sure trust_remote code only passed to HF datasets. Improve some docstrings.
|
2024-12-06 11:40:04 -08:00 |
Ross Wightman
|
95d903fd87
|
Merge branch 'main' of github.com:grodino/pytorch-image-models into grodino-dataset_trust_remote
|
2024-12-06 11:14:26 -08:00 |
Ross Wightman
|
9eee47de52
|
Back to dev version
|
2024-12-06 10:44:41 -08:00 |
Álvaro Justen (@turicas)
|
9383f2880d
|
Add cache_dir example
|
2024-12-06 10:39:13 -08:00 |
Ross Wightman
|
d1e9a8622a
|
Rename inception_next_atto pretrained str
|
2024-12-06 10:36:47 -08:00 |
Weihao Yu
|
0576175d85
|
Add inception_next_atto
|
2024-12-06 10:36:47 -08:00 |
Ross Wightman
|
7ab2b938e5
|
More tweaks to docstrings for hub/builder
|
2024-12-06 10:25:06 -08:00 |
Ross Wightman
|
dc1bb05e8e
|
Punch cache_dir through model factory / builder / pretrain helpers. Improve some annotations in related code.
|
2024-12-06 10:25:06 -08:00 |
Ross Wightman
|
afdf11d9ae
|
Add caution to Adan. Add decouple decay option to LAMB.
|
2024-12-05 13:50:30 -08:00 |
Ross Wightman
|
553ded5c6b
|
Version 1.0.12
|
2024-12-03 10:34:52 -08:00 |
Ross Wightman
|
464885e135
|
See if we can avoid some model / layer pickle issues with the aa attr in ConvNormAct
|
2024-12-03 08:02:55 -08:00 |
Ross Wightman
|
5fe5f9d488
|
Add a different mnv4 conv-small weight
|
2024-12-02 16:14:37 -08:00 |
Ross Wightman
|
303f7691a1
|
Add cautious mars, improve test reliability by skipping grad diff for first step
|
2024-12-02 11:29:02 -08:00 |
Ross Wightman
|
82e8677690
|
Make LaProp weight decay match typical PyTorch 'decoupled' behaviour where it's scaled by LR
|
2024-11-29 16:44:43 -08:00 |
Ross Wightman
|
886eb77938
|
Update README, missed small discrep in adafactor min dim update
|
2024-11-29 10:57:47 -08:00 |
Ross Wightman
|
e3e434bbc4
|
To be technically correct, need to check the in-place _ ver of op
|
2024-11-28 15:11:58 -08:00 |
Ross Wightman
|
7c32d3bd82
|
Work around _foreach_maximum issue, need scalar other support
|
2024-11-28 15:11:58 -08:00 |
Ross Wightman
|
7cf683628f
|
Cautious optimizer impl plus some typing cleanup.
|
2024-11-28 15:11:58 -08:00 |
Ross Wightman
|
aeb1ed7a15
|
Keep basic optim test LR range closer to before w/ updated code
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
7a165fcb62
|
Remove rogue import, thanks IDE :/
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
73d10ab482
|
Update tests, need handling for radamw with older PyTorch, need to back-off basic test LR in mars?
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
09bc21774e
|
Update optimizers.mdx
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
4f64ec4e14
|
Add guard around 'somewhat' newer torch RAdam / NAdam imports
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
0903d98162
|
Reduce tolerance on model inference 'owl' test, pillow output varies a lot, was failing locally
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
1ab02a11a1
|
Update Adan with newer impl (from original source) that includes multi-tensor fn
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
a024ab3170
|
Replace radam & nadam impl with torch.optim ver, rename legacy adamw, nadam, radam impl in timm. Update optim factory & tests.
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
7b54eab807
|
Add MARS and LaProp impl, simplified from originals
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
e5aea357b1
|
Update Adopt to include clipping for stability, separate wd so no param decay if update not taken on first step
|
2024-11-26 15:10:15 -08:00 |
Ross Wightman
|
444c506ce3
|
Merge pull request #2346 from JohannesTheo/patch-1
Update timm torchvision resnet weight urls to the updated urls in torchvision
|
2024-11-26 11:15:17 -08:00 |
Johannes
|
093a234d01
|
Update torchvision resnet legacy weight urls in resnet.py
|
2024-11-26 15:53:54 +01:00 |
Ross Wightman
|
2fcf73e580
|
Add mini imagenet info files
|
2024-11-25 10:53:28 -08:00 |
Ross Wightman
|
900d2b508d
|
add mnv4 conv_medium in12k -> in1k ft
|
2024-11-22 16:31:45 -08:00 |
Ross Wightman
|
6bcbdbfe41
|
CS3-DarkNet Small (Focus) w/ RA4 recipe. Fix #2122
|
2024-11-22 16:31:45 -08:00 |
Sina Hajimiri
|
3a6cc4fb17
|
Improve wandb logging
|
2024-11-20 21:04:07 -08:00 |
Ross Wightman
|
620cb4f3cb
|
Improve the parsable results dump at end of train, stop excessive output, only display top-10.
|
2024-11-20 16:47:06 -08:00 |