Commit Graph

1225 Commits (10344625bea17750fb116acfde11100e96e70879)

Author SHA1 Message Date
Ross Wightman 30ffa152de Fix load of larger ResNet CLIP models, experimenting with making AttentionPool *the* head, seems to fine-tune better, one less layer. 2024-06-10 12:07:14 -07:00
Ross Wightman 5e9ff5798f Adding pos embed resize fns to FX autowrap exceptions 2024-06-10 12:06:47 -07:00
Ross Wightman f0fb471b26 Remove separate ConvNormActAa class, merge with ConvNormAct 2024-06-10 12:05:35 -07:00
Ross Wightman 5efa15b2a2 Mapping OpenAI CLIP Modified ResNet weights -> ByobNet. Improve AttentionPool2d layers. Fix #1731 2024-06-09 16:54:48 -07:00
Ross Wightman 7702d9afa1 ViTamin in_chans !=3 weight load fix 2024-06-07 20:39:23 -07:00
Ross Wightman 66a0eb4673 Experimenting with tiny test models, how small can they go and be useful for regression tests? 2024-06-07 16:09:25 -07:00
Ross Wightman 5ee06760dc Fix classifier input dim for mnv3 after last changes 2024-06-07 13:53:13 -07:00
Ross Wightman a5a2ad2e48 Fix consistency, testing for forward_head w/ pre_logits, reset_classifier, models with pre_logits size != unpooled feature size
* add test that model supports forward_head(x, pre_logits=True)
* add head_hidden_size attr to all models and set differently from num_features attr when head has hidden layers
* test forward_features() feat dim == model.num_features and pre_logits feat dim == self.head_hidden_size
* more consistency in reset_classifier signature, add typing
* asserts in some heads where pooling cannot be disabled
Fix #2194
2024-06-07 13:53:00 -07:00
Ross Wightman 4535a5412a Change default serialization for push_to_hf_hub to 'both' 2024-06-07 13:40:31 -07:00
Ross Wightman 7ccb10ebff Disable efficient_builder debug flag 2024-06-06 21:50:27 -07:00
Ross Wightman ad026e6e33 Fix in_chans switching on create 2024-06-06 17:56:14 -07:00
Ross Wightman fc1b66a51d Fix first conv name for mci vit-b 2024-06-06 13:42:26 -07:00
Ross Wightman 88a1006e02 checkpoint filter fns with consistent name, add mobileclip-b pretrained cfgs 2024-06-06 12:38:52 -07:00
Ross Wightman 7d4ada6d16 Update ViTamin model defs 2024-06-06 09:16:43 -07:00
Ross Wightman cc8a03daac Add ConvStem and MobileCLIP hybrid model for B variant. Add full norm disable support to ConvNormAct layers 2024-06-06 09:15:27 -07:00
Ross Wightman 3c9d8e5b33 Merge remote-tracking branch 'origin/efficientnet_x' into fastvit_mobileclip 2024-06-05 17:35:15 -07:00
Ross Wightman 5756a81c55 Merge remote-tracking branch 'origin/Beckschen-vitamin' into fastvit_mobileclip 2024-06-05 15:20:54 -07:00
Ross Wightman 58591a97f7 Enable features_only properly 2024-06-04 16:57:16 -07:00
Ross Wightman 1b66ec7cf3 Fixup ViTamin, add hub weight reference 2024-06-03 17:14:03 -07:00
Ross Wightman b2c0aeb0ec Merge branch 'main' of https://github.com/Beckschen/pytorch-image-models into Beckschen-vitamin 2024-06-02 14:16:30 -07:00
Ross Wightman 7f96538052 Add missing lkc act for mobileclip fastvits 2024-05-31 11:59:51 -07:00
Ross Wightman a503639bcc Add mobileclip fastvit model defs, support extra SE. Add forward_intermediates API to fastvit 2024-05-30 10:17:38 -07:00
Ross Wightman 5fa6efa158 Add anti-aliasing support to mobilenetv3 and efficientnet family models. Update MobileNetV4 model defs, resolutions. Fix #599
* create_aa helper function centralized for all timm uses (resnet, convbnact helper)
* allow BlurPool w/ pre-defined channels (expand)
* mobilenetv4 UIB block using ConvNormAct layers for improved clarity, esp with AA added
* improve more mobilenetv3 and efficientnet related type annotations
2024-05-27 22:06:22 -07:00
Ross Wightman 5dce710101 Add vit_little in12k + in12k-ft-in1k weights 2024-05-27 14:56:03 -07:00
Ross Wightman 3c0283f9ef Fix reparameterize for NextViT. Fix #2187 2024-05-27 14:48:58 -07:00
Ross Wightman 4ff7c25766 Pass layer_scale_init_value to Mnv3Features module 2024-05-24 16:44:50 -07:00
Ross Wightman a12b72b5c4 Fix missing head_norm arg pop for feature model 2024-05-24 15:50:34 -07:00
Ross Wightman 7fe96e7a92 More MobileNet-v4 fixes
* missed final norm after post pooling 1x1 PW head conv
* improve repr of model by flipping a few modules to None when not used, nn.Sequential for MultiQueryAttention query/key/value/output
* allow layer scaling to be enabled/disabled at model variant level, conv variants don't use it
2024-05-24 15:09:29 -07:00
Ross Wightman 28d76a97db Mixed up kernel size for last blocks in mnv4-conv-small 2024-05-24 11:50:42 -07:00
Ross Wightman 0c6a69e7ef Add comments to MNV4 model defs with block variants 2024-05-23 15:54:05 -07:00
Ross Wightman cb33956b20 Fix some mistakes in mnv4 model defs 2024-05-23 14:24:32 -07:00
Ross Wightman cee79dada0 Merge remote-tracking branch 'origin/main' into efficientnet_x 2024-05-23 11:01:39 -07:00
Ross Wightman 6a8bb03330 Initial MobileNetV4 pass 2024-05-23 10:49:18 -07:00
Ross Wightman e748805be3 Add regex matching support to AttentionExtract. Add return_dict support to graph extractors and use returned output in AttentionExtractor 2024-05-22 14:33:39 -07:00
Ross Wightman 84cb225ecb Add in12k + 12k_ft_in1k vit_medium weights 2024-05-20 15:52:46 -07:00
Beckschen 7a2ad6bce1 Add link to model weights on Hugging Face 2024-05-17 06:51:35 -04:00
Beckschen 530fb49e7e Add link to model weights on Hugging Face 2024-05-17 06:48:59 -04:00
Ross Wightman cd0e7b11ff
Merge pull request #2180 from yvonwin/main
Remove a duplicate function in mobilenetv3.py
2024-05-15 07:54:17 -07:00
Ross Wightman 83aee5c28c Add explicit GAP (avg pool) variants of other SigLIP models. 2024-05-15 07:53:19 -07:00
yvonwin 58f2f79b04 Remove a duplicate function in mobilenetv3.py: `_gen_lcnet` is repeated in mobilenetv3.py.Remove the duplicate code. 2024-05-15 17:59:34 +08:00
Ross Wightman 7b3b11b63f Support loading of paligemma weights into GAP variants of SigLIP ViT. Minor tweak to npz loading for packed transformer weights. 2024-05-14 15:44:37 -07:00
Beckschen df304ffbf2 the dataclass init needs to use the default factory pattern, according to Ross 2024-05-14 15:10:05 -04:00
Ross Wightman a69863ad61
Merge pull request #2156 from huggingface/hiera
WIP Hiera implementation.
2024-05-13 14:58:12 -07:00
Ross Wightman f7aa0a1a71 Add missing vit_wee weight 2024-05-13 12:05:47 -07:00
Ross Wightman 7a4e987b9f Hiera weights on hub 2024-05-13 11:43:22 -07:00
Ross Wightman 23f09af08e Merge branch 'main' into efficientnet_x 2024-05-12 21:31:08 -07:00
Ross Wightman c838c4233f Add typing to reset_classifier() on other models 2024-05-12 11:12:00 -07:00
Ross Wightman 3e03b2bf3f Fix a few more hiera API issues 2024-05-12 11:11:45 -07:00
Ross Wightman 211d18d8ac Move norm & pool into Hiera ClassifierHead. Misc fixes, update features_intermediate() naming 2024-05-11 23:37:35 -07:00
Ross Wightman 2ca45a4ff5 Merge remote-tracking branch 'upstream/main' into hiera 2024-05-11 15:43:05 -07:00
Ross Wightman 1d3ab176bc Remove debug / staging code 2024-05-10 22:16:34 -07:00
Ross Wightman aa4d06a11c sbb vit weights on hub, testing 2024-05-10 17:15:01 -07:00
Ross Wightman 3582ca499e Prepping weight push, benchmarking. 2024-05-10 14:14:06 -07:00
Ross Wightman 2bfa5e5d74 Remove JIT activations, take jit out of ME activations. Remove other instances of torch.jit.script. Breaks torch.compile and is much less performant. Remove SpaceToDepthModule 2024-05-06 16:32:49 -07:00
Beckschen 99d4c7d202 add ViTamin models 2024-05-05 02:50:14 -04:00
Ross Wightman 07535f408a Add AttentionExtract helper module 2024-05-04 14:10:00 -07:00
Ross Wightman 45b7ae8029 forward_intermediates() support for byob/byoanet models 2024-05-04 14:06:52 -07:00
Ross Wightman c4b8897e9e attention -> attn in davit for model consistency 2024-05-04 14:06:11 -07:00
Ross Wightman cb57a96862 Fix early stop for efficientnet/mobilenetv3 fwd inter. Fix indices typing for all fwd inter. 2024-05-04 10:21:58 -07:00
Ross Wightman 01dd01b70e forward_intermediates() for MlpMixer models and RegNet. 2024-05-04 10:21:03 -07:00
Ross Wightman f8979d4f50 Comment out time local files while testing new vit weights 2024-05-03 20:26:56 -07:00
Ross Wightman c719f7eb86 More forward_intermediates() updates
* add convnext, resnet, efficientformer, levit support
* remove kwargs only for fn so that torchscript isn't broken for all :(
* use reset_classifier() consistently in prune
2024-05-03 16:22:32 -07:00
Ross Wightman d6da4fb01e Add forward_intermediates() to efficientnet / mobilenetv3 based models as an exercise. 2024-05-02 14:19:16 -07:00
Ross Wightman c22efb9765 Add wee & little vits for some experiments 2024-05-02 10:51:35 -07:00
Ross Wightman 67332fce24 Add features_intermediate() support to coatnet, maxvit, swin* models. Refine feature interface. Start prep of new vit weights. 2024-04-30 16:56:33 -07:00
user-miner1 740f4983b3 Assert messages added 2024-04-30 10:10:02 +03:00
Ross Wightman c6db4043cd Update forward_intermediates for hiera to have its own fwd impl w/ early stopping. Remove return_intermediates bool from forward(). Still an fx issue with None mask arg :( 2024-04-29 17:23:37 -07:00
Ross Wightman 9b9a356a04 Add forward_intermediates support for xcit, cait, and volo. 2024-04-29 16:30:45 -07:00
Ross Wightman ef147fd2fb Add forward_intermediates API to Hiera for features_only=True support 2024-04-21 11:30:41 -07:00
Ross Wightman d88bed6535 Bit more Hiera fiddling 2024-04-21 09:36:57 -07:00
Ross Wightman 8a54d2a930 WIP Hiera implementation. Fix #2083. Trying to get image size adaptation to work. 2024-04-20 09:47:17 -07:00
Ross Wightman d6b95520f1
Merge pull request #2136 from huggingface/vit_features_only
Exploring vit features_only via new forward_intermediates() API, inspired by #2131
2024-04-11 08:38:20 -07:00
Ross Wightman 4b2565e4cb More forward_intermediates() / FeatureGetterNet work
* include relpos vit
* refactor reduction / size calcs so hybrid vits work and dynamic_img_size works
* fix -ve feature indices when pruning
* fix mvitv2 w/ class token
* refine naming
* add tests
2024-04-10 15:11:34 -07:00
Ross Wightman ef9c6fb846 forward_head(), consistent pre_logits handling to reduce likelihood of people manually replacing .head module having issues 2024-04-09 21:54:59 -07:00
Ross Wightman 679daef76a More forward_intermediates() & features_only work
* forward_intermediates() added to beit, deit, eva, mvitv2, twins, vit, vit_sam
* add features_only to forward intermediates to allow just intermediate features
* fix #2060
* fix #1374
* fix #657
2024-04-09 21:29:16 -07:00
Ross Wightman 17b892f703 Fix #2139, disable strict weight loading when head changes from classification 2024-04-09 08:41:37 -07:00
Ross Wightman 5fdc0b4e93 Exploring vit features_only using get_intermediate_layers() as per #2131 2024-04-07 11:24:45 -07:00
Ross Wightman 34b41b143c Fiddling with efficientnet x/h defs, is it worth adding & training any? 2024-03-22 17:55:02 -07:00
Ross Wightman c559c3911f Improve vit conversions. OpenAI convert pass through main convert for patch & pos resize. Fix #2120 2024-03-21 10:00:43 -07:00
Ross Wightman 256cf19148 Rename tinyclip models to fit existing 'clip' variants, use consistently mapped OpenCLIP compatible checkpoint on hf hub 2024-03-20 15:21:46 -07:00
Thien Tran 1a1d07d479 add other tinyclip 2024-03-19 07:27:09 +08:00
Thien Tran dfffffac55 add tinyclip 8m 2024-03-19 07:02:17 +08:00
Ross Wightman 6ccb7d6a7c
Merge pull request #2111 from jamesljlster/enhance_vit_get_intermediate_layers
Vision Transformer (ViT) get_intermediate_layers: enhanced to support dynamic image size and saved computational costs from unused blocks
2024-03-18 13:41:18 -07:00
Cheng-Ling Lai db06b56d34
Saved computational costs of get_intermediate_layers() from unused blocks 2024-03-17 21:34:06 +08:00
Cheng-Ling Lai 4731e4efc4
Modified ViT get_intermediate_layers() to support dynamic image size 2024-03-16 23:07:21 +08:00
SmilingWolf 59cb0be595 SwinV2: add configurable act_layer argument
Defaults to "gelu", but makes it possible to pass "gelu_tanh".
Makes it easier to port weights from JAX/Flax, where the tanh
approximation is the default.
2024-03-05 22:04:17 +01:00
Ross Wightman 31e0dc0a5d Tweak hgnet before merge 2024-02-12 15:00:32 -08:00
Ross Wightman 3e03491e49 Merge branch 'master' of https://github.com/seefun/pytorch-image-models into seefun-master 2024-02-12 14:59:54 -08:00
Ross Wightman 59239d9df5 Cleanup imports for vit relpos 2024-02-10 21:40:57 -08:00
Ross Wightman ac1b08deb6 fix_init on vit & relpos vit 2024-02-10 20:15:37 -08:00
Ross Wightman 935950cc11 Fix F.sdpa attn drop prob 2024-02-10 20:14:47 -08:00
Ross Wightman 0737cf231d Add Next-ViT 2024-02-10 17:05:16 -08:00
Ross Wightman d6c2cc91af Make NormMlpClassifier head reset args consistent with ClassifierHead 2024-02-10 16:25:33 -08:00
Ross Wightman 87fec3dc14 Update experimental vit model configs 2024-02-10 16:05:58 -08:00
Ross Wightman 7d3c2dc993 Add group_matcher for DaViT 2024-02-10 14:58:45 -08:00
Ross Wightman 88889de923 Fix meshgrid deprecation warnings and backward compat with explicit 'ndgrid' and 'meshgrid' fn w/o indexing arg 2024-01-27 13:48:33 -08:00
Ross Wightman d4386219c6 Improve type handling for arange & rel pos embeds, keep calculations in float32 until application (may change to apply in float32 in future). Prevent arange type hijacking by DeepSpeed Zero 2024-01-26 16:35:51 -08:00
Ross Wightman 3234daf783 Add missing deprecation mapping for a densenet and xcit model. Fix #2086. Tweak xcit pos embed use of arange for better low prec safety. 2024-01-24 22:04:04 -08:00
Li zhuoqun 53a4888328 Add droppath and type hint to Xception. 2024-01-19 11:15:47 -08:00
方曦 9dbea3bef6 fix cls head in hgnet 2023-12-27 21:26:26 +08:00