79 Commits (a2f539f0552a9958a1960c8f5079a8b3782eb803)

Author SHA1 Message Date
Ross Wightman 962958723c More Hiera updates. Add forward_intermediates to hieradet/sam2 impl. Make both use same classifier module. Add coarse bool to intermediates. 2024-08-16 11:10:04 -07:00
Ross Wightman f2cfb4c677 Add WIP HieraDet impl (SAM2 backbone support) 2024-08-15 17:58:15 -07:00
Ross Wightman a50e53d41f Rename global pos embed for Hiera abswin, factor out commonly used vit weight init fns to layers. Add a channels-last ver of normmlp head. 2024-08-15 17:46:36 -07:00
Ross Wightman 2f3fed43b8 Fix hiera init with num_classes=0, fix weight tag names for sbb2 hiera/vit weights, add LayerScale/LayerScale2d to layers 2024-08-15 11:14:38 -07:00
Ross Wightman ab8cb070fc Add xavier_uniform init of MNVC hybrid attention modules. Small improvement in training stability. 2024-07-26 17:03:40 -07:00
Ross Wightman cec70b6779 Merge pull request #2225 from huggingface/small_things
Small things
2024-07-25 20:29:13 -07:00
Ross Wightman 7e0caa1ba3 Padding helpers work if tuples/lists passed 2024-07-19 14:28:03 -07:00
Ross Wightman 2180800646 MQA query_strides bugs fix #2237. No padding for avg_pool2d if not 'same', use scale_factor for Upsample. 2024-07-19 14:26:54 -07:00
Ross Wightman 392b78aee7 set_input_size initial impl for vit & swin v1. Move HybridEmbed to own location in timm/layers 2024-07-17 15:25:48 -07:00
Ross Wightman 57adc1acc8 Fix rotary embed version of attn pool. Bit of cleanup/naming 2024-06-11 23:49:17 -07:00
Ross Wightman cdc7bcea69 Make 2d attention pool modules compatible with head interface. Use attention pool in CLIP ResNets as head. Make separate set of GAP models w/ avg pool instead of attn pool. 2024-06-11 21:32:07 -07:00
Ross Wightman 30ffa152de Fix load of larger ResNet CLIP models, experimenting with making AttentionPool *the* head, seems to fine-tune better, one less layer. 2024-06-10 12:07:14 -07:00
Ross Wightman 5e9ff5798f Adding pos embed resize fns to FX autowrap exceptions 2024-06-10 12:06:47 -07:00
Ross Wightman f0fb471b26 Remove separate ConvNormActAa class, merge with ConvNormAct 2024-06-10 12:05:35 -07:00
Ross Wightman 5efa15b2a2 Mapping OpenAI CLIP Modified ResNet weights -> ByobNet. Improve AttentionPool2d layers. Fix #1731 2024-06-09 16:54:48 -07:00
Ross Wightman cc8a03daac Add ConvStem and MobileCLIP hybrid model for B variant. Add full norm disable support to ConvNormAct layers 2024-06-06 09:15:27 -07:00
Ross Wightman 5fa6efa158 Add anti-aliasing support to mobilenetv3 and efficientnet family models. Update MobileNetV4 model defs, resolutions. Fix #599
* create_aa helper function centralized for all timm uses (resnet, convbnact helper)
* allow BlurPool w/ pre-defined channels (expand)
* mobilenetv4 UIB block using ConvNormAct layers for improved clarity, esp with AA added
* improve more mobilenetv3 and efficientnet related type annotations
2024-05-27 22:06:22 -07:00
Ross Wightman 7fe96e7a92 More MobileNet-v4 fixes
* missed final norm after post pooling 1x1 PW head conv
* improve repr of model by flipping a few modules to None when not used, nn.Sequential for MultiQueryAttention query/key/value/output
* allow layer scaling to be enabled/disabled at model variant level, conv variants don't use it
2024-05-24 15:09:29 -07:00
Ross Wightman 70176a2dae torchscript typing fixes 2024-05-23 11:43:05 -07:00
Ross Wightman 2a1a6b1236 Adding missing attention2d.py 2024-05-23 11:06:32 -07:00
Ross Wightman cee79dada0 Merge remote-tracking branch 'origin/main' into efficientnet_x 2024-05-23 11:01:39 -07:00
Ross Wightman 6a8bb03330 Initial MobileNetV4 pass 2024-05-23 10:49:18 -07:00
Fernando Cossio 9b11801cb4 Credit earlier work with the same idea.
Hi, this earlier work has the same name and idea behind this layer. It could be useful for readers to keep both links here if they want to see the effects of introducing this layer on a very different domain. 😄
2024-05-16 22:50:34 +02:00
Ross Wightman 211d18d8ac Move norm & pool into Hiera ClassifierHead. Misc fixes, update features_intermediate() naming 2024-05-11 23:37:35 -07:00
Ross Wightman 2bfa5e5d74 Remove JIT activations, take jit out of ME activations. Remove other instances of torch.jit.script. Breaks torch.compile and is much less performant. Remove SpaceToDepthModule 2024-05-06 16:32:49 -07:00
Ross Wightman 301d0bb21f Stricter check on pool_type for adaptive pooling module. Fix #2159 2024-05-03 16:16:51 -07:00
Ross Wightman 4b2565e4cb More forward_intermediates() / FeatureGetterNet work
* include relpos vit
* refactor reduction / size calcs so hybrid vits work and dynamic_img_size works
* fix -ve feature indices when pruning
* fix mvitv2 w/ class token
* refine naming
* add tests
2024-04-10 15:11:34 -07:00
Ross Wightman d6c2cc91af Make NormMlpClassifier head reset args consistent with ClassifierHead 2024-02-10 16:25:33 -08:00
Ross Wightman 7bc7798d0e Type annotation correctness for create_act 2024-02-10 14:57:58 -08:00
Ross Wightman 88889de923 Fix meshgrid deprecation warnings and backward compat with explicit 'ndgrid' and 'meshgrid' fn w/o indexing arg 2024-01-27 13:48:33 -08:00
Ross Wightman d4386219c6 Improve type handling for arange & rel pos embeds, keep calculations in float32 until application (may change to apply in float32 in future). Prevent arange type hijacking by DeepSpeed Zero 2024-01-26 16:35:51 -08:00
kalazus 7f19a4cce7 fix fast catavgmax selection 2024-01-16 10:30:08 -08:00
Ross Wightman df7ae11eb2 Add device arg for patch embed resize, fix #2024 2023-12-04 11:42:13 -08:00
Ross Wightman 9fab8d8f58 Fix break of 2 years old torchvision installs :/ 2023-11-04 02:32:09 -07:00
Ross Wightman f7762fee78 Consistency handling None / empty string inputs to norm / act create fns 2023-11-03 11:01:41 -07:00
Ross Wightman a2e4a4c148 Add quickgelu vit clip variants, simplify get_norm_layer and allow string args in vit norm/act. Add metaclip CLIP weights 2023-11-03 11:01:41 -07:00
a-r-r-o-w d5f1525334 include suggestions from review
Co-Authored-By: Ross Wightman <rwightman@gmail.com>
2023-10-30 13:47:54 -07:00
a-r-r-o-w 5f14bdd564 include typing suggestions by @rwightman 2023-10-30 13:47:54 -07:00
Laureηt fe92fd93e5 fix adaptive_avgmax_pool.py
remove extra whitespace in `SelectAdaptivePool2d`'s `__repr__`
2023-10-29 23:03:36 -07:00
Tush9905 89ba0da910 Fixed Typos
Fixed the typos in helpers.py and CONTRIBUTING.md
2023-10-21 21:46:31 -07:00
Ross Wightman 49a459e8f1 Merge remote-tracking branch 'upstream/main' into vit_siglip_and_reg 2023-10-17 09:36:48 -07:00
Ross Wightman a58f9162d7 Missed __init__.py update for attention pooling layer add 2023-10-17 09:28:21 -07:00
Ross Wightman 71365165a2 Add SigLIP weights 2023-10-16 23:26:08 -07:00
lucapericlp 7ce65a83a2 Removing unused self.drop 2023-10-05 11:20:57 -07:00
Ross Wightman 9caf32b93f Move levit style pos bias resize with other rel pos bias utils 2023-09-01 11:05:56 -07:00
方曦 170a5b6e27 add tinyvit 2023-09-01 11:05:56 -07:00
Ross Wightman fc5d705b83 dynamic_size -> dynamic_img_size, add dynamic_img_pad for padding option 2023-08-27 15:58:35 -07:00
Ross Wightman 1f4512fca3 Support dynamic_resize in eva.py models 2023-08-27 15:58:35 -07:00
Ross Wightman fdd8c7c2da Initial impl of dynamic resize for existing vit models (incl vit-resnet hybrids) 2023-08-27 15:58:35 -07:00
Ross Wightman c153cd4a3e Add more advanced interpolation method from BEiT and support non-square window & image size adaptation for
* beit/beit-v2
* maxxvit/coatnet
* swin transformer
And non-square windows for swin-v2
2023-08-08 16:41:16 -07:00