Ross Wightman
962958723c
More Hiera updates. Add forward_intermediates to hieradet/sam2 impl. Make both use same classifier module. Add coarse bool to intermediates.
2024-08-16 11:10:04 -07:00
Ross Wightman
f2cfb4c677
Add WIP HieraDet impl (SAM2 backbone support)
2024-08-15 17:58:15 -07:00
Ross Wightman
a50e53d41f
Rename global pos embed for Hiera abswin, factor out commonly used vit weight init fns to layers. Add a channels-last ver of normmlp head.
2024-08-15 17:46:36 -07:00
Ross Wightman
2f3fed43b8
Fix hiera init with num_classes=0, fix weight tag names for sbb2 hiera/vit weights, add LayerScale/LayerScale2d to layers
2024-08-15 11:14:38 -07:00
Ross Wightman
ab8cb070fc
Add xavier_uniform init of MNVC hybrid attention modules. Small improvement in training stability.
2024-07-26 17:03:40 -07:00
Ross Wightman
cec70b6779
Merge pull request #2225 from huggingface/small_things
...
Small things
2024-07-25 20:29:13 -07:00
Ross Wightman
7e0caa1ba3
Padding helpers work if tuples/lists passed
2024-07-19 14:28:03 -07:00
Ross Wightman
2180800646
MQA query_strides bug fix #2237. No padding for avg_pool2d if not 'same', use scale_factor for Upsample.
2024-07-19 14:26:54 -07:00
Ross Wightman
392b78aee7
set_input_size initial impl for vit & swin v1. Move HybridEmbed to own location in timm/layers
2024-07-17 15:25:48 -07:00
Ross Wightman
57adc1acc8
Fix rotary embed version of attn pool. Bit of cleanup/naming
2024-06-11 23:49:17 -07:00
Ross Wightman
cdc7bcea69
Make 2d attention pool modules compatible with head interface. Use attention pool in CLIP ResNets as head. Make separate set of GAP models w/ avg pool instead of attn pool.
2024-06-11 21:32:07 -07:00
Ross Wightman
30ffa152de
Fix load of larger ResNet CLIP models, experimenting with making AttentionPool *the* head, seems to fine-tune better, one less layer.
2024-06-10 12:07:14 -07:00
Ross Wightman
5e9ff5798f
Adding pos embed resize fns to FX autowrap exceptions
2024-06-10 12:06:47 -07:00
Ross Wightman
f0fb471b26
Remove separate ConvNormActAa class, merge with ConvNormAct
2024-06-10 12:05:35 -07:00
Ross Wightman
5efa15b2a2
Mapping OpenAI CLIP Modified ResNet weights -> ByobNet. Improve AttentionPool2d layers. Fix #1731
2024-06-09 16:54:48 -07:00
Ross Wightman
cc8a03daac
Add ConvStem and MobileCLIP hybrid model for B variant. Add full norm disable support to ConvNormAct layers
2024-06-06 09:15:27 -07:00
Ross Wightman
5fa6efa158
Add anti-aliasing support to mobilenetv3 and efficientnet family models. Update MobileNetV4 model defs, resolutions. Fix #599
...
* create_aa helper function centralized for all timm uses (resnet, convbnact helper)
* allow BlurPool w/ pre-defined channels (expand)
* mobilenetv4 UIB block using ConvNormAct layers for improved clarity, esp with AA added
* improve more mobilenetv3 and efficientnet related type annotations
2024-05-27 22:06:22 -07:00
Ross Wightman
7fe96e7a92
More MobileNet-v4 fixes
...
* missed final norm after post pooling 1x1 PW head conv
* improve repr of model by flipping a few modules to None when not used, nn.Sequential for MultiQueryAttention query/key/value/output
* allow layer scaling to be enabled/disabled at model variant level, conv variants don't use it
2024-05-24 15:09:29 -07:00
Ross Wightman
70176a2dae
torchscript typing fixes
2024-05-23 11:43:05 -07:00
Ross Wightman
2a1a6b1236
Adding missing attention2d.py
2024-05-23 11:06:32 -07:00
Ross Wightman
cee79dada0
Merge remote-tracking branch 'origin/main' into efficientnet_x
2024-05-23 11:01:39 -07:00
Ross Wightman
6a8bb03330
Initial MobileNetV4 pass
2024-05-23 10:49:18 -07:00
Fernando Cossio
9b11801cb4
Credit earlier work with the same idea.
...
Hi, this earlier work has the same name and idea behind this layer. It could be useful for readers to keep both links here if they want to see the effects of introducing this layer on a very different domain. 😄
2024-05-16 22:50:34 +02:00
Ross Wightman
211d18d8ac
Move norm & pool into Hiera ClassifierHead. Misc fixes, update features_intermediate() naming
2024-05-11 23:37:35 -07:00
Ross Wightman
2bfa5e5d74
Remove JIT activations, take jit out of ME activations. Remove other instances of torch.jit.script, which breaks torch.compile and is much less performant. Remove SpaceToDepthModule
2024-05-06 16:32:49 -07:00
Ross Wightman
301d0bb21f
Stricter check on pool_type for adaptive pooling module. Fix #2159
2024-05-03 16:16:51 -07:00
Ross Wightman
4b2565e4cb
More forward_intermediates() / FeatureGetterNet work
...
* include relpos vit
* refactor reduction / size calcs so hybrid vits work and dynamic_img_size works
* fix negative feature indices when pruning
* fix mvitv2 w/ class token
* refine naming
* add tests
2024-04-10 15:11:34 -07:00
Ross Wightman
d6c2cc91af
Make NormMlpClassifier head reset args consistent with ClassifierHead
2024-02-10 16:25:33 -08:00
Ross Wightman
7bc7798d0e
Type annotation correctness for create_act
2024-02-10 14:57:58 -08:00
Ross Wightman
88889de923
Fix meshgrid deprecation warnings and backward compat with explicit 'ndgrid' and 'meshgrid' fn w/o indexing arg
2024-01-27 13:48:33 -08:00
Ross Wightman
d4386219c6
Improve type handling for arange & rel pos embeds, keep calculations in float32 until application (may change to apply in float32 in future). Prevent arange type hijacking by DeepSpeed Zero
2024-01-26 16:35:51 -08:00
kalazus
7f19a4cce7
fix fast catavgmax selection
2024-01-16 10:30:08 -08:00
Ross Wightman
df7ae11eb2
Add device arg for patch embed resize, fix #2024
2023-12-04 11:42:13 -08:00
Ross Wightman
9fab8d8f58
Fix breakage of 2-year-old torchvision installs :/
2023-11-04 02:32:09 -07:00
Ross Wightman
f7762fee78
Consistent handling of None / empty string inputs to norm / act create fns
2023-11-03 11:01:41 -07:00
Ross Wightman
a2e4a4c148
Add quickgelu vit clip variants, simplify get_norm_layer and allow string args in vit norm/act. Add metaclip CLIP weights
2023-11-03 11:01:41 -07:00
a-r-r-o-w
d5f1525334
include suggestions from review
...
Co-Authored-By: Ross Wightman <rwightman@gmail.com>
2023-10-30 13:47:54 -07:00
a-r-r-o-w
5f14bdd564
include typing suggestions by @rwightman
2023-10-30 13:47:54 -07:00
Laureηt
fe92fd93e5
fix adaptive_avgmax_pool.py
...
remove extra whitespace in `SelectAdaptivePool2d`'s `__repr__`
2023-10-29 23:03:36 -07:00
Tush9905
89ba0da910
Fixed Typos
...
Fixed the typos in helpers.py and CONTRIBUTING.md
2023-10-21 21:46:31 -07:00
Ross Wightman
49a459e8f1
Merge remote-tracking branch 'upstream/main' into vit_siglip_and_reg
2023-10-17 09:36:48 -07:00
Ross Wightman
a58f9162d7
Missed __init__.py update for attention pooling layer add
2023-10-17 09:28:21 -07:00
Ross Wightman
71365165a2
Add SigLIP weights
2023-10-16 23:26:08 -07:00
lucapericlp
7ce65a83a2
Removing unused self.drop
2023-10-05 11:20:57 -07:00
Ross Wightman
9caf32b93f
Move levit style pos bias resize with other rel pos bias utils
2023-09-01 11:05:56 -07:00
方曦
170a5b6e27
add tinyvit
2023-09-01 11:05:56 -07:00
Ross Wightman
fc5d705b83
dynamic_size -> dynamic_img_size, add dynamic_img_pad for padding option
2023-08-27 15:58:35 -07:00
Ross Wightman
1f4512fca3
Support dynamic_resize in eva.py models
2023-08-27 15:58:35 -07:00
Ross Wightman
fdd8c7c2da
Initial impl of dynamic resize for existing vit models (incl vit-resnet hybrids)
2023-08-27 15:58:35 -07:00
Ross Wightman
c153cd4a3e
Add more advanced interpolation method from BEiT and support non-square window & image size adaptation for
...
* beit/beit-v2
* maxxvit/coatnet
* swin transformer
And non-square windows for swin-v2
2023-08-08 16:41:16 -07:00