Default Branch

a22366e3ce · Merge pull request #2503 from huggingface/beit3_remap_clean · Updated 2025-05-31 07:40:28 +08:00

Branches

eb84e4b571 · Switch hf hub entries for new aimv2 / dfn weights to point to timm locations. Undo forced device for SDR linspace, part of another change. · Updated 2024-12-31 09:06:34 +08:00

156 behind · 10 ahead

5cf022f228 · Add more pali(2) weights. Switch rest of models adapting open_clip weights to their own weight instances. · Updated 2024-12-28 04:05:22 +08:00

162 behind · 4 ahead

d285526dc9 · Lazy loader for TF, more LAB fiddling · Updated 2024-12-24 05:24:11 +08:00

164 behind · 3 ahead

7573096eb8 · Make sure trust_remote code only passed to HF datasets. Improve some docstrings. · Updated 2024-12-07 03:40:04 +08:00

165 behind · 0 ahead · Included

9cec2f17cd · Merge pull request #2358 from turicas/cache_dir · Updated 2024-12-07 02:25:29 +08:00

174 behind · 4 ahead

e90b68b603 · Rename inception_next_atto pretrained str · Updated 2024-12-07 02:08:03 +08:00

174 behind · 2 ahead

afdf11d9ae · Add caution to Adan. Add decouple decay option to LAMB. · Updated 2024-12-06 05:50:30 +08:00

173 behind · 0 ahead · Included

ceaff7668e · See if we can avoid some model / layer pickle issues with the aa attr in ConvNormAct · Updated 2024-12-03 08:55:29 +08:00

176 behind · 1 ahead

9fc8bac3d2 · Add cautious mars, improve test reliability by skipping grad diff for first step · Updated 2024-12-03 01:38:25 +08:00

178 behind · 1 ahead

9b27f84876 · To be technically correct, need to check the in-place _ ver of op · Updated 2024-11-29 05:46:17 +08:00

183 behind · 3 ahead

1a70036691 · Keep basic optim test LR range closer to before w/ updated code · Updated 2024-11-27 05:40:20 +08:00

195 behind · 10 ahead

54bdab7411 · add mnv4 conv_medium in12k -> in1k ft · Updated 2024-11-23 07:24:53 +08:00

198 behind · 2 ahead

77e3922f02 · Improve the parsable results dump at end of train, stop excessive output, only display top-10. · Updated 2024-11-21 08:43:16 +08:00

201 behind · 2 ahead

85255fdf57 · Add some 384x384 small model weights, 3 variants of mnv4 conv medium on in12k pretrain, and resnetv2-34d on in1k · Updated 2024-11-18 03:28:33 +08:00

206 behind · 1 ahead

df6171a843 · Minor changes, has_eps=False missing for bnb lion · Updated 2024-11-13 12:06:30 +08:00

233 behind · 21 ahead

98ff128d4a · Update log to describe scheduling behaviour diff w/ warmup_prefix · Updated 2024-11-09 02:57:29 +08:00

238 behind · 2 ahead

ea8b030b81 · Add resnet and resnet-v2 18/34 weights trained with mnv4 small based recipe · Updated 2024-11-01 06:40:36 +08:00

246 behind · 2 ahead

c9c973b3b5 · Experimenting with differential attention · Updated 2024-10-28 02:51:11 +08:00

246 behind · 1 ahead

0444a69745 · One more small c&p issue · Updated 2024-10-24 12:50:12 +08:00

249 behind · 3 ahead

c3992d5c4c · Remove extra space · Updated 2024-10-19 05:54:16 +08:00

252 behind · 0 ahead · Included