Ross Wightman
67332fce24
Add features_intermediate() support to coatnet, maxvit, swin* models. Refine feature interface. Start prep of new vit weights.
2024-04-30 16:56:33 -07:00
SmilingWolf
59cb0be595
SwinV2: add configurable act_layer argument
Defaults to "gelu", but makes it possible to pass "gelu_tanh".
Makes it easier to port weights from JAX/Flax, where the tanh
approximation is the default.
2024-03-05 22:04:17 +01:00
Ross Wightman
88889de923
Fix meshgrid deprecation warnings and backward compat with explicit 'ndgrid' and 'meshgrid' fn w/o indexing arg
2024-01-27 13:48:33 -08:00
Ross Wightman
d4386219c6
Improve type handling for arange & rel pos embeds, keep calculations in float32 until application (may change to apply in float32 in future). Prevent arange type hijacking by DeepSpeed Zero
2024-01-26 16:35:51 -08:00
Li zhuoqun
7da34a999a
add type annotations in the code of swin_transformer_v2
2023-12-15 09:31:25 -08:00
akiyuki ishikawa
4f2e1bf4cb
Add missing docs in SwinTransformerStage
2023-12-05 12:00:51 -08:00
Ross Wightman
6bae514656
Add pretrained patch embed resizing to swin
2023-09-27 10:27:28 -07:00
Ross Wightman
c153cd4a3e
Add more advanced interpolation method from BEiT and support non-square window & image size adaptation for
* beit/beit-v2
* maxxvit/coatnet
* swin transformer
And non-square windows for swin-v2
2023-08-08 16:41:16 -07:00
Ross Wightman
7790ea709b
Add support for resizing swin transformer img_size and window_size on init and load from pretrained weights. Add support for non-square window_size to both swin v1/v2
2023-08-04 22:10:46 -07:00
Ross Wightman
e4e43190ce
Add typing to all model entrypoint fns, add old cache check env var to builder
2023-05-08 08:52:38 -07:00
Ross Wightman
80b247d843
Update swin_v2 attn_mask buffer change in #1790 to apply to updated checkpoints in hub
2023-04-11 14:40:32 -07:00
Ross Wightman
1a1aca0cee
Merge pull request #1761 from huggingface/patch_drop_refactor
Implement patch dropout for eva / vision_transformer, refactor dropout args
2023-04-11 14:37:36 -07:00
Ross Wightman
4d135421a3
Implement patch dropout for eva / vision_transformer, refactor / improve consistency of dropout args across all vit based models
2023-04-07 20:27:23 -07:00
Marco Forte
c76818a592
skip attention mask buffers
Allows more flexibility in the resolutions accepted by SwinV2.
2023-04-07 18:50:02 +02:00
Ross Wightman
1bb3989b61
Improve kwarg passthrough for swin, vit, deit, beit, eva
2023-04-05 21:37:16 -07:00
Ross Wightman
572f05096a
Swin and FocalNet weights on HF hub. Add model deprecation functionality w/ some registry tweaks.
2023-03-18 14:55:09 -07:00
Ross Wightman
acfd85ad68
All swin models support spatial output, add output_fmt to v1/v2 and use ClassifierHead.
* update ClassifierHead to allow different input format
* add output format support to patch embed
* fix some flatten issues for a few conv head models
* add Format enum and helpers for tensor format (layout) choices
2023-03-15 23:21:51 -07:00
Ross Wightman
7d9e321b76
Improve tracing of window attn models with simpler reshape logic
2023-02-17 07:59:06 -08:00
Ross Wightman
927f031293
Major module / path restructure, timm.models.layers -> timm.layers, add _ prefix to all non model modules in timm.models
2022-12-06 15:00:06 -08:00
Ross Wightman
4b30bae67b
Add updated vit_relpos weights, and impl w/ support for official swin-v2 differences for relpos. Add bias control support for MLP layers
2022-05-13 13:53:57 -07:00
Ross Wightman
d4c0588012
Remove persistent buffers from Swin-V2. Change SwinV2Cr cos attn + tau/logit_scale to match official, add ckpt convert, init_value zeros resid LN weight by default
2022-05-13 10:50:59 -07:00
Ross Wightman
27c42f0830
Fix torchscript use for official Swin-V2, add support for non-square window/shift to WindowAttn/Block
2022-05-13 09:29:33 -07:00
Ross Wightman
c0211b0bf7
Swin-V2 test fixes, typo
2022-05-12 22:31:55 -07:00
Ross Wightman
9a86b900fa
Official SwinV2 models
2022-05-12 15:05:10 -07:00