Commit Graph

1278 Commits (72f0edb7e88556dafd9d2c4bc16bd5c19f84834b)

Author SHA1 Message Date
Ross Wightman cdc7bcea69 Make 2d attention pool modules compatible with head interface. Use attention pool in CLIP ResNets as head. Make separate set of GAP models w/ avg pool instead of attn pool. 2024-06-11 21:32:07 -07:00
Ross Wightman c63da1405c Pretrained cfg name mismatch 2024-06-11 21:16:54 -07:00
Ross Wightman 88efca1be2 First set of MobileNetV4 weights trained in timm 2024-06-11 18:53:01 -07:00
Ross Wightman 30ffa152de Fix load of larger ResNet CLIP models, experimenting with making AttentionPool *the* head, seems to fine-tune better, one less layer. 2024-06-10 12:07:14 -07:00
Ross Wightman 5e9ff5798f Adding pos embed resize fns to FX autowrap exceptions 2024-06-10 12:06:47 -07:00
Ross Wightman f0fb471b26 Remove separate ConvNormActAa class, merge with ConvNormAct 2024-06-10 12:05:35 -07:00
Ross Wightman 5efa15b2a2 Mapping OpenAI CLIP Modified ResNet weights -> ByobNet. Improve AttentionPool2d layers. Fix #1731 2024-06-09 16:54:48 -07:00
Ross Wightman 7702d9afa1 ViTamin in_chans !=3 weight load fix 2024-06-07 20:39:23 -07:00
Ross Wightman 66a0eb4673 Experimenting with tiny test models, how small can they go and be useful for regression tests? 2024-06-07 16:09:25 -07:00
Ross Wightman 5ee06760dc Fix classifier input dim for mnv3 after last changes 2024-06-07 13:53:13 -07:00
Ross Wightman a5a2ad2e48 Fix consistency, testing for forward_head w/ pre_logits, reset_classifier, models with pre_logits size != unpooled feature size
* add test that model supports forward_head(x, pre_logits=True)
* add head_hidden_size attr to all models and set differently from num_features attr when head has hidden layers
* test forward_features() feat dim == model.num_features and pre_logits feat dim == self.head_hidden_size
* more consistency in reset_classifier signature, add typing
* asserts in some heads where pooling cannot be disabled
Fix #2194
2024-06-07 13:53:00 -07:00
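The head-consistency contract described in the commit above (forward_features() returns `num_features`-dim features, forward_head(pre_logits=True) returns `head_hidden_size`-dim features) can be illustrated with a minimal sketch. This is not timm code — the class and shapes are hypothetical stand-ins using plain lists instead of tensors, just to show the interface being tested:

```python
# Illustrative sketch of the head interface contract (NOT timm's implementation).
# A model with a hidden head layer has head_hidden_size != num_features.
class ToyModel:
    def __init__(self, num_features=8, head_hidden_size=4, num_classes=2):
        self.num_features = num_features          # unpooled backbone feature dim
        self.head_hidden_size = head_hidden_size  # pre-logits dim when head has hidden layers
        self.num_classes = num_classes

    def forward_features(self, x):
        # pretend backbone output: a vector of length num_features
        return [0.0] * self.num_features

    def forward_head(self, feats, pre_logits=False):
        hidden = [0.0] * self.head_hidden_size    # hidden head layer output
        if pre_logits:
            return hidden                          # dim == head_hidden_size
        return [0.0] * self.num_classes            # dim == num_classes

    def reset_classifier(self, num_classes: int, global_pool: str = 'avg'):
        self.num_classes = num_classes

m = ToyModel()
assert len(m.forward_features(None)) == m.num_features
assert len(m.forward_head(None, pre_logits=True)) == m.head_hidden_size
```

The regression tests referenced in the commit check exactly these invariants across all models.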
Ross Wightman 4535a5412a Change default serialization for push_to_hf_hub to 'both' 2024-06-07 13:40:31 -07:00
Ross Wightman 7ccb10ebff Disable efficient_builder debug flag 2024-06-06 21:50:27 -07:00
Ross Wightman ad026e6e33 Fix in_chans switching on create 2024-06-06 17:56:14 -07:00
Ross Wightman fc1b66a51d Fix first conv name for mci vit-b 2024-06-06 13:42:26 -07:00
Ross Wightman 88a1006e02 checkpoint filter fns with consistent name, add mobileclip-b pretrained cfgs 2024-06-06 12:38:52 -07:00
Ross Wightman 7d4ada6d16 Update ViTamin model defs 2024-06-06 09:16:43 -07:00
Ross Wightman cc8a03daac Add ConvStem and MobileCLIP hybrid model for B variant. Add full norm disable support to ConvNormAct layers 2024-06-06 09:15:27 -07:00
Ross Wightman 3c9d8e5b33 Merge remote-tracking branch 'origin/efficientnet_x' into fastvit_mobileclip 2024-06-05 17:35:15 -07:00
Ross Wightman 5756a81c55 Merge remote-tracking branch 'origin/Beckschen-vitamin' into fastvit_mobileclip 2024-06-05 15:20:54 -07:00
Ross Wightman 58591a97f7 Enable features_only properly 2024-06-04 16:57:16 -07:00
Ross Wightman 1b66ec7cf3 Fixup ViTamin, add hub weight reference 2024-06-03 17:14:03 -07:00
Ross Wightman b2c0aeb0ec Merge branch 'main' of https://github.com/Beckschen/pytorch-image-models into Beckschen-vitamin 2024-06-02 14:16:30 -07:00
Ross Wightman 7f96538052 Add missing lkc act for mobileclip fastvits 2024-05-31 11:59:51 -07:00
Ross Wightman a503639bcc Add mobileclip fastvit model defs, support extra SE. Add forward_intermediates API to fastvit 2024-05-30 10:17:38 -07:00
Ross Wightman 5fa6efa158 Add anti-aliasing support to mobilenetv3 and efficientnet family models. Update MobileNetV4 model defs, resolutions. Fix #599
* create_aa helper function centralized for all timm uses (resnet, convbnact helper)
* allow BlurPool w/ pre-defined channels (expand)
* mobilenetv4 UIB block using ConvNormAct layers for improved clarity, esp with AA added
* improve more mobilenetv3 and efficientnet related type annotations
2024-05-27 22:06:22 -07:00
Ross Wightman 5dce710101 Add vit_little in12k + in12k-ft-in1k weights 2024-05-27 14:56:03 -07:00
Ross Wightman 3c0283f9ef Fix reparameterize for NextViT. Fix #2187 2024-05-27 14:48:58 -07:00
Ross Wightman 4ff7c25766 Pass layer_scale_init_value to Mnv3Features module 2024-05-24 16:44:50 -07:00
Ross Wightman a12b72b5c4 Fix missing head_norm arg pop for feature model 2024-05-24 15:50:34 -07:00
Ross Wightman 7fe96e7a92 More MobileNet-v4 fixes
* missed final norm after post pooling 1x1 PW head conv
* improve repr of model by flipping a few modules to None when not used, nn.Sequential for MultiQueryAttention query/key/value/output
* allow layer scaling to be enabled/disabled at model variant level, conv variants don't use it
2024-05-24 15:09:29 -07:00
Ross Wightman 28d76a97db Mixed up kernel size for last blocks in mnv4-conv-small 2024-05-24 11:50:42 -07:00
Ross Wightman 0c6a69e7ef Add comments to MNV4 model defs with block variants 2024-05-23 15:54:05 -07:00
Ross Wightman cb33956b20 Fix some mistakes in mnv4 model defs 2024-05-23 14:24:32 -07:00
Ross Wightman cee79dada0 Merge remote-tracking branch 'origin/main' into efficientnet_x 2024-05-23 11:01:39 -07:00
Ross Wightman 6a8bb03330 Initial MobileNetV4 pass 2024-05-23 10:49:18 -07:00
Ross Wightman e748805be3 Add regex matching support to AttentionExtract. Add return_dict support to graph extractors and use returned output in AttentionExtractor 2024-05-22 14:33:39 -07:00
Ross Wightman 84cb225ecb Add in12k + 12k_ft_in1k vit_medium weights 2024-05-20 15:52:46 -07:00
Beckschen 7a2ad6bce1 Add link to model weights on Hugging Face 2024-05-17 06:51:35 -04:00
Beckschen 530fb49e7e Add link to model weights on Hugging Face 2024-05-17 06:48:59 -04:00
Ross Wightman cd0e7b11ff
Merge pull request #2180 from yvonwin/main
Remove a duplicate function in mobilenetv3.py
2024-05-15 07:54:17 -07:00
Ross Wightman 83aee5c28c Add explicit GAP (avg pool) variants of other SigLIP models. 2024-05-15 07:53:19 -07:00
yvonwin 58f2f79b04 Remove a duplicate function in mobilenetv3.py: `_gen_lcnet` is repeated in mobilenetv3.py. Remove the duplicate code. 2024-05-15 17:59:34 +08:00
Ross Wightman 7b3b11b63f Support loading of paligemma weights into GAP variants of SigLIP ViT. Minor tweak to npz loading for packed transformer weights. 2024-05-14 15:44:37 -07:00
Beckschen df304ffbf2 the dataclass init needs to use the default factory pattern, according to Ross 2024-05-14 15:10:05 -04:00
Ross Wightman a69863ad61
Merge pull request #2156 from huggingface/hiera
WIP Hiera implementation.
2024-05-13 14:58:12 -07:00
Ross Wightman f7aa0a1a71 Add missing vit_wee weight 2024-05-13 12:05:47 -07:00
Ross Wightman 7a4e987b9f Hiera weights on hub 2024-05-13 11:43:22 -07:00
Ross Wightman 23f09af08e Merge branch 'main' into efficientnet_x 2024-05-12 21:31:08 -07:00
Ross Wightman c838c4233f Add typing to reset_classifier() on other models 2024-05-12 11:12:00 -07:00
Ross Wightman 3e03b2bf3f Fix a few more hiera API issues 2024-05-12 11:11:45 -07:00
Ross Wightman 211d18d8ac Move norm & pool into Hiera ClassifierHead. Misc fixes, update features_intermediate() naming 2024-05-11 23:37:35 -07:00
Ross Wightman 2ca45a4ff5 Merge remote-tracking branch 'upstream/main' into hiera 2024-05-11 15:43:05 -07:00
Ross Wightman 1d3ab176bc Remove debug / staging code 2024-05-10 22:16:34 -07:00
Ross Wightman aa4d06a11c sbb vit weights on hub, testing 2024-05-10 17:15:01 -07:00
Ross Wightman 3582ca499e Prepping weight push, benchmarking. 2024-05-10 14:14:06 -07:00
Ross Wightman 2bfa5e5d74 Remove JIT activations, take jit out of ME (memory-efficient) activations. Remove other instances of torch.jit.script, which breaks torch.compile and is much less performant. Remove SpaceToDepthModule 2024-05-06 16:32:49 -07:00
Beckschen 99d4c7d202 add ViTamin models 2024-05-05 02:50:14 -04:00
Ross Wightman 07535f408a Add AttentionExtract helper module 2024-05-04 14:10:00 -07:00
Ross Wightman 45b7ae8029 forward_intermediates() support for byob/byoanet models 2024-05-04 14:06:52 -07:00
Ross Wightman c4b8897e9e attention -> attn in davit for model consistency 2024-05-04 14:06:11 -07:00
Ross Wightman cb57a96862 Fix early stop for efficientnet/mobilenetv3 fwd inter. Fix indices typing for all fwd inter. 2024-05-04 10:21:58 -07:00
Ross Wightman 01dd01b70e forward_intermediates() for MlpMixer models and RegNet. 2024-05-04 10:21:03 -07:00
Ross Wightman f8979d4f50 Comment out time local files while testing new vit weights 2024-05-03 20:26:56 -07:00
Ross Wightman c719f7eb86 More forward_intermediates() updates
* add convnext, resnet, efficientformer, levit support
* remove kwargs only for fn so that torchscript isn't broken for all :(
* use reset_classifier() consistently in prune
2024-05-03 16:22:32 -07:00
Ross Wightman d6da4fb01e Add forward_intermediates() to efficientnet / mobilenetv3 based models as an exercise. 2024-05-02 14:19:16 -07:00
Ross Wightman c22efb9765 Add wee & little vits for some experiments 2024-05-02 10:51:35 -07:00
Ross Wightman 67332fce24 Add features_intermediate() support to coatnet, maxvit, swin* models. Refine feature interface. Start prep of new vit weights. 2024-04-30 16:56:33 -07:00
user-miner1 740f4983b3 Assert messages added 2024-04-30 10:10:02 +03:00
Ross Wightman c6db4043cd Update forward_intermediates for hiera to have its own fwd impl w/ early stopping. Remove return_intermediates bool from forward(). Still an fx issue with None mask arg :( 2024-04-29 17:23:37 -07:00
Ross Wightman 9b9a356a04 Add forward_intermediates support for xcit, cait, and volo. 2024-04-29 16:30:45 -07:00
Ross Wightman ef147fd2fb Add forward_intermediates API to Hiera for features_only=True support 2024-04-21 11:30:41 -07:00
Ross Wightman d88bed6535 Bit more Hiera fiddling 2024-04-21 09:36:57 -07:00
Ross Wightman 8a54d2a930 WIP Hiera implementation. Fix #2083. Trying to get image size adaptation to work. 2024-04-20 09:47:17 -07:00
Ross Wightman d6b95520f1
Merge pull request #2136 from huggingface/vit_features_only
Exploring vit features_only via new forward_intermediates() API, inspired by #2131
2024-04-11 08:38:20 -07:00
Ross Wightman 4b2565e4cb More forward_intermediates() / FeatureGetterNet work
* include relpos vit
* refactor reduction / size calcs so hybrid vits work and dynamic_img_size works
* fix -ve feature indices when pruning
* fix mvitv2 w/ class token
* refine naming
* add tests
2024-04-10 15:11:34 -07:00
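The "-ve feature indices when pruning" fix above concerns how forward_intermediates() resolves which stages to return (and how deep the network must run before later blocks can be pruned). A minimal sketch of that index normalization, assuming semantics like timm's (the actual helper may differ):

```python
# Sketch of feature-index normalization for forward_intermediates()-style APIs.
# Not timm's actual helper; semantics assumed: None -> all stages,
# int n -> last n stages, negative indices count from the end.
def take_indices(indices, num_stages):
    if indices is None:
        indices = list(range(num_stages))
    elif isinstance(indices, int):
        indices = list(range(num_stages - indices, num_stages))
    out = []
    for i in indices:
        idx = num_stages + i if i < 0 else i   # wrap negatives
        assert 0 <= idx < num_stages, f'index {i} out of range for {num_stages} stages'
        out.append(idx)
    # deepest stage needed; blocks after it can be pruned for early stopping
    max_index = max(out)
    return out, max_index

assert take_indices([-1, -2], 12) == ([11, 10], 11)
assert take_indices(3, 12) == ([9, 10, 11], 11)
```

Getting the negative-index wrap right is what allows pruned models to keep only the blocks up to `max_index`.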
Ross Wightman ef9c6fb846 forward_head(), consistent pre_logits handling to reduce likelihood of people manually replacing .head module having issues 2024-04-09 21:54:59 -07:00
Ross Wightman 679daef76a More forward_intermediates() & features_only work
* forward_intermediates() added to beit, deit, eva, mvitv2, twins, vit, vit_sam
* add features_only to forward intermediates to allow just intermediate features
* fix #2060
* fix #1374
* fix #657
2024-04-09 21:29:16 -07:00
Ross Wightman 17b892f703 Fix #2139, disable strict weight loading when head changes from classification 2024-04-09 08:41:37 -07:00
Ross Wightman 5fdc0b4e93 Exploring vit features_only using get_intermediate_layers() as per #2131 2024-04-07 11:24:45 -07:00
Ross Wightman 34b41b143c Fiddling with efficientnet x/h defs, is it worth adding & training any? 2024-03-22 17:55:02 -07:00
Ross Wightman c559c3911f Improve vit conversions. OpenAI convert pass through main convert for patch & pos resize. Fix #2120 2024-03-21 10:00:43 -07:00
Ross Wightman 256cf19148 Rename tinyclip models to fit existing 'clip' variants, use consistently mapped OpenCLIP compatible checkpoint on hf hub 2024-03-20 15:21:46 -07:00
Thien Tran 1a1d07d479 add other tinyclip 2024-03-19 07:27:09 +08:00
Thien Tran dfffffac55 add tinyclip 8m 2024-03-19 07:02:17 +08:00
Ross Wightman 6ccb7d6a7c
Merge pull request #2111 from jamesljlster/enhance_vit_get_intermediate_layers
Vision Transformer (ViT) get_intermediate_layers: enhanced to support dynamic image size and saved computational costs from unused blocks
2024-03-18 13:41:18 -07:00
Cheng-Ling Lai db06b56d34
Saved computational costs of get_intermediate_layers() from unused blocks 2024-03-17 21:34:06 +08:00
Cheng-Ling Lai 4731e4efc4
Modified ViT get_intermediate_layers() to support dynamic image size 2024-03-16 23:07:21 +08:00
SmilingWolf 59cb0be595 SwinV2: add configurable act_layer argument
Defaults to "gelu", but makes it possible to pass "gelu_tanh".
Makes it easier to port weights from JAX/Flax, where the tanh
approximation is the default.
2024-03-05 22:04:17 +01:00
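The "gelu" vs "gelu_tanh" distinction in the commit above is numerically small but matters for bit-exact weight ports. A self-contained sketch of the two formulations (standard definitions, not timm code) shows how close the JAX/Flax-default tanh approximation is to the exact erf form:

```python
import math

def gelu_exact(x):
    # exact GELU: x * Phi(x), via the error function
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # tanh approximation, the default in JAX/Flax
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

# the two agree to within ~1e-3 everywhere, but are not bit-identical
for v in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(gelu_exact(v) - gelu_tanh(v)) < 1e-3
```

Making act_layer configurable lets a ported SwinV2 checkpoint use whichever variant it was trained with, instead of absorbing the small mismatch.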
Ross Wightman 31e0dc0a5d Tweak hgnet before merge 2024-02-12 15:00:32 -08:00
Ross Wightman 3e03491e49 Merge branch 'master' of https://github.com/seefun/pytorch-image-models into seefun-master 2024-02-12 14:59:54 -08:00
Ross Wightman 59239d9df5 Cleanup imports for vit relpos 2024-02-10 21:40:57 -08:00
Ross Wightman ac1b08deb6 fix_init on vit & relpos vit 2024-02-10 20:15:37 -08:00
Ross Wightman 935950cc11 Fix F.sdpa attn drop prob 2024-02-10 20:14:47 -08:00
Ross Wightman 0737cf231d Add Next-ViT 2024-02-10 17:05:16 -08:00
Ross Wightman d6c2cc91af Make NormMlpClassifier head reset args consistent with ClassifierHead 2024-02-10 16:25:33 -08:00
Ross Wightman 87fec3dc14 Update experimental vit model configs 2024-02-10 16:05:58 -08:00
Ross Wightman 7d3c2dc993 Add group_matcher for DaViT 2024-02-10 14:58:45 -08:00
Ross Wightman 88889de923 Fix meshgrid deprecation warnings and backward compat with explicit 'ndgrid' and 'meshgrid' fn w/o indexing arg 2024-01-27 13:48:33 -08:00
Ross Wightman d4386219c6 Improve type handling for arange & rel pos embeds, keep calculations in float32 until application (may change to apply in float32 in future). Prevent arange type hijacking by DeepSpeed Zero 2024-01-26 16:35:51 -08:00
Ross Wightman 3234daf783 Add missing deprecation mapping for a densenet and xcit model. Fix #2086. Tweak xcit pos embed use of arange for better low prec safety. 2024-01-24 22:04:04 -08:00
Li zhuoqun 53a4888328 Add droppath and type hint to Xception. 2024-01-19 11:15:47 -08:00
方曦 9dbea3bef6 fix cls head in hgnet 2023-12-27 21:26:26 +08:00
SeeFun 56ae8b906d
fix reset head in hgnet 2023-12-27 20:11:29 +08:00
SeeFun 6862c9850a
fix backward in hgnet 2023-12-27 16:49:37 +08:00
SeeFun 6cd28bc5c2
Merge branch 'huggingface:main' into master 2023-12-27 16:43:37 +08:00
Ross Wightman f2fdd97e9f Add parsable json results output for train.py, tweak --pretrained-path to force head adaptation 2023-12-22 11:18:25 -08:00
LR e0079c92da
Update eva.py (#2058)
* Update eva.py

When class_token=False, self.cls_token is None.

Prevents an error from calling trunc_normal_ on None:
AttributeError: 'NoneType' object has no attribute 'uniform_'

* Update eva.py

fix
2023-12-16 15:10:45 -08:00
Li zhuoqun 7da34a999a add type annotations in the code of swin_transformer_v2 2023-12-15 09:31:25 -08:00
Fredo Guan bbe798317f
Update EdgeNeXt to use ClassifierHead as per ConvNeXt (#2051)
* Update edgenext.py
2023-12-11 12:17:19 -08:00
Ross Wightman 60b170b200 Add --pretrained-path arg to train script to allow passing local checkpoint as pretrained. Add missing/unexpected keys log. 2023-12-11 12:10:29 -08:00
Fredo Guan 2597ce2860 Update davit.py 2023-12-11 11:13:04 -08:00
akiyuki ishikawa 2bd043ce5d fix doc position 2023-12-05 12:00:51 -08:00
akiyuki ishikawa 4f2e1bf4cb Add missing docs in SwinTransformerStage 2023-12-05 12:00:51 -08:00
Ross Wightman cd8d9d9ff3 Add missing hf hub entries for mvitv2 2023-11-26 21:06:39 -08:00
Ross Wightman b996c1a0f5 A few more missed hf hub entries 2023-11-23 21:48:14 -08:00
Ross Wightman 89ec91aece Add missing hf_hub entry for mobilenetv3_rw 2023-11-23 12:44:59 -08:00
Dillon Laird 63ee54853c fixed intermediate output indices 2023-11-22 16:32:41 -08:00
Ross Wightman fa06f6c481 Merge branch 'seefun-efficientvit' 2023-11-21 14:06:27 -08:00
Ross Wightman c6b0c98963 Upload weights to hub, tweak crop_pct, comment out SAM EfficientViTs for now (v2 weights coming) 2023-11-21 14:05:04 -08:00
Ross Wightman ada145b016 Literal use w/ python < 3.8 requires typing_extensions, catch instead of check sys ver 2023-11-21 09:48:03 -08:00
Ross Wightman dfaab97d20 More consistency in model arg/kwarg merge handling 2023-11-21 09:48:03 -08:00
Ross Wightman 3775e4984f Merge branch 'efficientvit' of github.com:seefun/pytorch-image-models into seefun-efficientvit 2023-11-20 16:21:38 -08:00
Ross Wightman dfb8658100 Fix a few missed model deprecations and one missed pretrained cfg 2023-11-20 12:41:49 -08:00
Ross Wightman a604011935 Add support for passing model args via hf hub config 2023-11-19 15:16:01 -08:00
方曦 c9d093a58e update norm eps for efficientvit large 2023-11-18 17:46:47 +08:00
Laureηt 21647c0a0c Add types to vision_transformers.py 2023-11-17 16:06:06 -08:00
方曦 87ba43a9bc add efficientvit large series 2023-11-17 13:58:46 +08:00
Ross Wightman 7c685a4ef3 Fix openai quickgelu loading and add missing orig_in21k vit weights and remove zero'd classifier w/ matching hub update 2023-11-16 19:16:28 -08:00
LittleNyima ef72c3cd47 Add warnings for duplicate registry names 2023-11-08 10:18:59 -08:00
Ross Wightman d3e83a190f Add in12k fine-tuned convnext_xxlarge 2023-11-03 14:35:01 -07:00
Ross Wightman dcfdba1f5f Make quickgelu models appear in listing 2023-11-03 11:01:41 -07:00
Ross Wightman 96bd162ddb Add cc-by-nc-4.0 license for metaclip, make note in quickgelu model def about pretrained_cfg mapping 2023-11-03 11:01:41 -07:00
Ross Wightman 6894ec7edc Forgot about datcomp b32 models 2023-11-03 11:01:41 -07:00
Ross Wightman a2e4a4c148 Add quickgelu vit clip variants, simplify get_norm_layer and allow string args in vit norm/act. Add metaclip CLIP weights 2023-11-03 11:01:41 -07:00
Ross Wightman c55bc41a42 DFN CLIP ViT support 2023-10-31 12:16:21 -07:00
a-r-r-o-w d5f1525334 include suggestions from review
Co-Authored-By: Ross Wightman <rwightman@gmail.com>
2023-10-30 13:47:54 -07:00
a-r-r-o-w 5f14bdd564 include typing suggestions by @rwightman 2023-10-30 13:47:54 -07:00
a-r-r-o-w 05b0aaca51 improvement: add typehints and docs to timm/models/resnet.py 2023-10-30 13:47:54 -07:00
a-r-r-o-w c2fe0a2268 improvement: add typehints and docs to timm/models/mobilenetv3.py 2023-10-30 13:47:54 -07:00
Laureηt d023154bb5 Update swin_transformer.py
make `SwinTransformer`'s `patch_embed` customizable through the constructor
2023-10-30 13:47:14 -07:00
Ross Wightman 68a121402f Added hub weights for dinov2 register models 2023-10-29 23:03:48 -07:00
Ross Wightman 3f02392488 Add DINOv2 models with register tokens. Convert pos embed to non-overlapping for consistency. 2023-10-29 23:03:48 -07:00
Patrick Labatut 97450d618a Update DINOv2 license to Apache 2.0 2023-10-27 09:12:51 -07:00
mjamroz 7a6369156f avoid getting undefined 2023-10-22 21:36:23 -07:00
pUmpKin-Co 8556462a18 fix doc typo in resnetv2 2023-10-20 11:56:50 -07:00
Ross Wightman 462fb3ec9f Push new repvit weights to hub, tweak tag names 2023-10-20 11:49:29 -07:00
Ross Wightman 5309424d5e Merge branch 'main' of https://github.com/jameslahm/pytorch-image-models into jameslahm-main 2023-10-20 11:08:12 -07:00
Ross Wightman d3ebdcfd93 Disable strict load when siglip vit pooling removed 2023-10-19 12:03:40 -07:00
Ross Wightman e728f3efdb Cleanup ijepa models, they're just gap (global-avg-pool) models w/o heads. fc-norm conversion was wrong, gigantic should have been giant 2023-10-17 15:44:46 -07:00