1769 Commits

Author SHA1 Message Date
Ross Wightman
5f85f8eefa Fix comment, add 'stochastic weight decay' idea because why not 2025-01-30 15:44:02 -08:00
Ross Wightman
5940cc167f Change start/end args 2025-01-30 15:44:02 -08:00
Ross Wightman
3be8b1abe4 Change flattening behaviour in Kron 2025-01-30 15:44:02 -08:00
Ross Wightman
b3a83b81d6 Prep Kron for merge, add detail to attributions note, README. 2025-01-27 21:02:26 -08:00
Ross Wightman
67ef6f0a92 Move opt_einsum import back out of class __init__ 2025-01-27 21:02:26 -08:00
Ross Wightman
9ab5464e4d More additions to Kron 2025-01-27 21:02:26 -08:00
Ross Wightman
5f10450235 Some more kron work. Figured out why some tests fail, implemented a deterministic rng state load but too slow so skipping some tests for now. 2025-01-27 21:02:26 -08:00
Ross Wightman
cd21e80d03 Fiddling with Kron (PSGD) 2025-01-27 21:02:26 -08:00
Adam J. Stewart
d81da93c16 Use import alias 2025-01-22 10:27:17 -08:00
Adam J. Stewart
4de1abf837 timm: add __all__ to __init__ 2025-01-22 10:27:17 -08:00
Ryan
17eabaad17 Fix RDNet forward call 2025-01-21 11:52:05 -08:00
Ryan
80a4877376 Fix self.reset_classifier num_classes update 2025-01-21 11:52:05 -08:00
Collin McCarthy
84631cb5c6 Add missing training flag to convert_sync_batchnorm 2025-01-21 11:51:55 -08:00
Ross Wightman
5d535d7a2d Version 1.0.14, update README & changelog 2025-01-19 13:53:09 -08:00
Ross Wightman
aa333079da Tweak so150m2 def 2025-01-19 13:40:53 -08:00
Josua Rieder
8d81fdf3d9 Fix typos 2025-01-19 13:39:40 -08:00
Ross Wightman
3677f67902 Add the 256x256 in1k ft of the so150m, add an alternate so150m def 2025-01-18 15:51:57 -08:00
Ross Wightman
2a84d68d02 Add some so150m vit w/ sbb recipe weights, and a ese_vovnet57b model with RA4 recipe 2025-01-18 15:51:57 -08:00
Ross Wightman
9265d54a3a LeViT safetensors load is broken by conversion code that wasn't deactivated 2025-01-16 11:37:00 -08:00
Ross Wightman
21e75a9d25 Update version.py. Back to dev version 2025-01-16 11:23:17 -08:00
Adam J. Stewart
6d21eb0d37 VGG ConvMlp: fix layer defaults/types 2025-01-15 12:11:56 +01:00
Adam J. Stewart
f5c4d5cbb7 Add missing imports 2025-01-11 15:13:16 +01:00
Adam J. Stewart
19aaea3c8f Fix nn.Module type hints 2025-01-11 15:09:21 +01:00
Ross Wightman
47811bc05a Update README, bump version to 1.0.13 non-dev 2025-01-09 09:33:59 -08:00
Ross Wightman
deb9895600 Update checkpoint save to fix old hard-link + fuse issue I ran into again... fix #340 2025-01-08 15:36:58 -08:00
Ross Wightman
92f610c982 Add half-precision (bfloat16, float16) support to train & validate scripts. Should push dtype handling into model factory / pretrained load at some point... 2025-01-07 10:25:14 -08:00
Ross Wightman
155f6e7fea Update README, few minor fixups. 2025-01-06 13:09:15 -08:00
Ross Wightman
2b251fb291 Wrap torch checkpoint() fn to default use_reentrant flag to False and allow env var override 2025-01-06 11:28:39 -08:00
Ross Wightman
131518c15c Add comments to MLP layers re expected layouts 2025-01-02 09:41:35 -08:00
Louis Lac
2d5277e858 Merge branch 'main' into fix-mqa-v2 2025-01-02 00:11:22 +01:00
Louis Lac
2d734d9058 Fixed unfused attn2d scale 2025-01-01 12:34:07 -08:00
Louis Lac
6171e756d3 Fix MQA V2 scale and out shape 2025-01-01 15:37:28 +01:00
Ross Wightman
e846b2cf28 Add 384x384 in12k pretrain and finetune for convnext_nano 2024-12-31 13:16:43 -08:00
Ross Wightman
b0068ba5d0 Switch hf hub entries for new aimv2 / dfn weights to point to timm locations. Undo forced device for SDR linspace, part of another change. 2024-12-30 19:24:21 -08:00
Ross Wightman
1bf84b35c3 Update tests for aimv2 filtering 2024-12-30 19:24:21 -08:00
Ross Wightman
b33418713a Add (almost) full set of aimv2 model instances. Switch back to unpacked SwiGLU. Verify correctness. Add DFN L/14 39B weight. 2024-12-30 19:24:21 -08:00
Ross Wightman
de35fd87f5 Add SimpleNorm to create_norm factory 2024-12-30 19:24:21 -08:00
Ross Wightman
d5375ca769 Use torch F.rms_norm when possible, select fast vs normal paths appropriately and test with torchscript 2024-12-30 19:24:21 -08:00
Ross Wightman
5f12a25114 Add bias arg to Vitamin GeGLU 2024-12-30 19:24:21 -08:00
Ross Wightman
5804d92e4b Switch aimv2 to use packed SwiGLU 2024-12-30 19:24:21 -08:00
Ross Wightman
15406a939e Fixing RmsNorm to fix #2380 and noticed with aimv2 when comparing outputs. Still some work to do, need to look at AMP / fast mode behaviour, dispatch to torch when possible. Add SimpleNorm for 'LayerNorm w/o centering and bias' 2024-12-30 19:24:21 -08:00
Ross Wightman
a648a04834 Supporting aimv2 encoders 2024-12-30 19:24:21 -08:00
Ross Wightman
790decc89b Add more pali(2) weights. Switch rest of models adapting open_clip weights to their own weight instances. 2024-12-27 14:00:41 -08:00
Ross Wightman
01cf0f72af Add support for tag, license customization through push_to_hub 2024-12-27 14:00:41 -08:00
Ross Wightman
b12ecbd614 Move siglip timm weights to own repos 2024-12-27 14:00:41 -08:00
Ross Wightman
6fb7aaf37d Switching to timm specific weight instances for open_clip image encoders to facilitate hf-hub: use in timm and new transformers TimmWrapper 2024-12-27 14:00:41 -08:00
Ross Wightman
364c567dd2 Merge pull request #2357 from huggingface/more_opt_stuff. Add caution to Adan. Add decouple decay option to LAMB. 2024-12-27 12:54:02 -08:00
Ryan
ab0a70dfff fix feature_info.reduction 2024-12-18 21:12:40 +08:00
Ross Wightman
7573096eb8 Make sure trust_remote_code is only passed to HF datasets. Improve some docstrings. 2024-12-06 11:40:04 -08:00
Ross Wightman
9eee47de52 Back to dev version 2024-12-06 10:44:41 -08:00