Feraidoon Mehri
4cca568bd8
eva.py: fixed bug in applying attention mask
...
The mask should be applied before the softmax.
2024-07-19 15:12:04 +03:30
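The fix above hinges on ordering: an additive attention mask must be folded into the scores before the softmax, so masked positions receive zero probability, rather than zeroing outputs after normalization (which would leave the remaining probabilities mis-scaled). A minimal pure-Python sketch of the correct order (hypothetical helper, not the eva.py code):

```python
import math

def masked_softmax(scores, mask):
    """Apply an additive attention mask *before* the softmax so that
    masked positions receive exactly zero probability."""
    # Additive mask convention: 0 for visible positions, -inf for masked ones.
    masked = [s + (0.0 if m else -math.inf) for s, m in zip(scores, mask)]
    mx = max(masked)                       # subtract max for numerical stability
    exps = [math.exp(s - mx) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 3.0]
mask = [True, True, False]   # last position masked out
probs = masked_softmax(scores, mask)
```

Masking after the softmax instead would drop the third term but leave the first two summing to less than one.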
Ross Wightman
3a8a965891
Implement absolute+window pos embed for hiera, resizable but needs new weights
2024-07-18 21:43:37 -07:00
Ross Wightman
392b78aee7
set_input_size initial impl for vit & swin v1. Move HybridEmbed to own location in timm/layers
2024-07-17 15:25:48 -07:00
Promisery
417cf7f871
Initialize weights of reg_token for ViT
2024-07-13 11:11:42 +08:00
Ross Wightman
f920119f3b
Fixing tests
2024-07-09 14:53:20 -07:00
Ross Wightman
644abf9588
Fix default_cfg test for mobilenet_100
2024-07-09 12:52:24 -07:00
Ross Wightman
d5afe106dc
Merge remote-tracking branch 'origin/tiny_test_models' into small_things
2024-07-09 12:49:57 -07:00
Ross Wightman
55101028bb
Rename test_tiny* -> test*. Fix ByobNet BasicBlock attn location and add test_byobnet model.
2024-07-09 11:53:11 -07:00
Ross Wightman
1334598462
Add support back to EfficientNet to disable head_conv / bn2 so mobilenetv1 can be implemented properly
2024-07-08 13:51:26 -07:00
Ross Wightman
800405d941
Add conv_large mobilenetv3 aa/blur model defs
2024-07-08 13:50:05 -07:00
Ross Wightman
f81b094aaa
Add 'qkv_bias_separate' flag for EVA/beit/swinv2 attn modules to allow an override for easy quantization wrappers. Fix #2098
2024-07-08 13:48:38 -07:00
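For context on the `qkv_bias_separate` idea: fused attention layers store one concatenated bias of length 3*dim, while quantization wrappers typically need the q/k/v projections, and hence their biases, applied as three separate ops. A sketch of the equivalence, with plain lists standing in for tensors (hypothetical helper names, not the timm module code):

```python
# The fused path stores one qkv bias of length 3*dim; the separate path
# applies q_bias, k_bias, v_bias individually. The two are related by
# simple concatenation / slicing.
dim = 4
q_bias = [0.1] * dim
k_bias = [0.0] * dim          # k bias is often kept at zero in these models
v_bias = [0.2] * dim

fused_qkv_bias = q_bias + k_bias + v_bias   # what the fused projection stores

def split_qkv_bias(qkv_bias, dim):
    """Recover the separate q/k/v biases from the fused concatenated bias."""
    return qkv_bias[:dim], qkv_bias[dim:2 * dim], qkv_bias[2 * dim:]

q2, k2, v2 = split_qkv_bias(fused_qkv_bias, dim)
```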
Steffen Schneider
c01a47c9e7
Fix typo in type annotations in timm.models.hrnet
2024-07-08 00:53:16 +02:00
Daniel Suess
197c10463b
Fix jit.script breaking with features_fx
2024-06-28 03:58:51 +00:00
Ross Wightman
b751da692d
Add latest ix (xavier init for mqa) hybrid medium & large weights for MobileNetV4
2024-06-24 13:49:55 -07:00
Ross Wightman
f8342a045a
Merge pull request #2213 from huggingface/florence2
...
Fix #2212 map florence2 image tower to davit with a few changes
2024-06-24 11:01:08 -07:00
Sejik
c33a001397
Fix typo
2024-06-24 21:54:38 +09:00
Ross Wightman
02d0f27721
cleanup davit padding
2024-06-22 12:06:46 -07:00
Ross Wightman
c715c724e7
Fix tracing by removing float cast, should end up float anyways
2024-06-22 08:35:30 -07:00
Ross Wightman
fb58a73033
Fix #2212 map florence2 image tower to davit with a few changes
2024-06-21 15:31:29 -07:00
Ross Wightman
fb13e6385e
Merge pull request #2203 from huggingface/more_mobile
...
Add mobilenet edgetpu defs for exp, add ol mobilenet v1 back for comp…
2024-06-18 15:20:01 -07:00
Ross Wightman
16e082e1c2
Add mobilenetv4 hybrid-large weights
2024-06-17 11:08:31 -07:00
Ross Wightman
e41125cc83
Merge pull request #2209 from huggingface/fcossio-vit-maxpool
...
ViT pooling refactor
2024-06-17 07:51:12 -07:00
Ross Wightman
a22466852d
Add 2400 epoch mobilenetv4 small weights, almost at paper, rounds to 73.8
2024-06-16 10:51:00 -07:00
Ross Wightman
b1a6f4a946
Some missed reset_classifier() type annotations
2024-06-16 10:39:27 -07:00
Ross Wightman
71101ebba0
Refactor vit pooling to add more reduction options, separately callable
2024-06-14 23:16:58 -07:00
Ross Wightman
a0bb5b4a44
Missing stem_kernel_size argument in EfficientNetFeatures
2024-06-14 13:39:31 -07:00
Fernando Cossio
9567cf6d84
Feature: add option global_pool='max' to VisionTransformer
...
Most of the CNNs have a max global pooling option. I would like to extend ViT to have this option.
2024-06-14 15:24:54 +02:00
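The `global_pool='max'` option above reduces the token sequence with a per-channel max instead of a mean. A minimal sketch of the two reductions, using nested lists in place of a (tokens, dim) tensor (illustrative only, not the timm implementation):

```python
def global_pool(tokens, mode="avg"):
    """Reduce a list of per-token feature vectors to a single vector."""
    dim = len(tokens[0])
    if mode == "avg":
        return [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]
    if mode == "max":
        return [max(t[d] for t in tokens) for d in range(dim)]
    raise ValueError(mode)

tokens = [[1.0, 4.0], [3.0, 2.0], [2.0, 6.0]]
avg = global_pool(tokens, "avg")   # per-channel mean over tokens
mx = global_pool(tokens, "max")    # per-channel max over tokens
```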
Ross Wightman
9613c76844
Add mobilenet edgetpu defs for exp, add ol mobilenet v1 back for completeness / comparison
2024-06-13 17:33:04 -07:00
Ross Wightman
22de845add
Prepping for final MobileCLIP weight locations ( #2199 )
...
* Prepping for final MobileCLIP weight locations
* Update weight locations to coreml-projects
* Update mobileclip weight locations with final apple org location
2024-06-13 16:55:49 -07:00
Ross Wightman
575978ba55
Add mnv4_conv_large 384x384 weight location
2024-06-13 12:58:04 -07:00
Ross Wightman
e42e453128
Fix mnv4 conv_large weight link, reorder mnv4 pretrained cfg for proper precedence
2024-06-12 11:16:49 -07:00
Ross Wightman
7b0a5321cb
Merge pull request #2198 from huggingface/openai_clip_resnet
...
Mapping OpenAI CLIP Modified ResNet weights -> ByobNet.
2024-06-12 09:33:30 -07:00
Ross Wightman
57adc1acc8
Fix rotary embed version of attn pool. Bit of cleanup/naming
2024-06-11 23:49:17 -07:00
Ross Wightman
cdc7bcea69
Make 2d attention pool modules compatible with head interface. Use attention pool in CLIP ResNets as head. Make separate set of GAP models w/ avg pool instead of attn pool.
2024-06-11 21:32:07 -07:00
Ross Wightman
c63da1405c
Pretrained cfg name mismatch
2024-06-11 21:16:54 -07:00
Ross Wightman
88efca1be2
First set of MobileNetV4 weights trained in timm
2024-06-11 18:53:01 -07:00
Ross Wightman
30ffa152de
Fix load of larger ResNet CLIP models, experimenting with making AttentionPool *the* head, seems to fine-tune better, one less layer.
2024-06-10 12:07:14 -07:00
Ross Wightman
5e9ff5798f
Adding pos embed resize fns to FX autowrap exceptions
2024-06-10 12:06:47 -07:00
Ross Wightman
f0fb471b26
Remove separate ConvNormActAa class, merge with ConvNormAct
2024-06-10 12:05:35 -07:00
Ross Wightman
5efa15b2a2
Mapping OpenAI CLIP Modified ResNet weights -> ByobNet. Improve AttentionPool2d layers. Fix #1731
2024-06-09 16:54:48 -07:00
Ross Wightman
7702d9afa1
ViTamin in_chans !=3 weight load fix
2024-06-07 20:39:23 -07:00
Ross Wightman
66a0eb4673
Experimenting with tiny test models, how small can they go and be useful for regression tests?
2024-06-07 16:09:25 -07:00
Ross Wightman
5ee06760dc
Fix classifier input dim for mnv3 after last changes
2024-06-07 13:53:13 -07:00
Ross Wightman
a5a2ad2e48
Fix consistency, testing for forward_head w/ pre_logits, reset_classifier, models with pre_logits size != unpooled feature size
...
* add test that model supports forward_head(x, pre_logits=True)
* add head_hidden_size attr to all models and set differently from num_features attr when head has hidden layers
* test forward_features() feat dim == model.num_features and pre_logits feat dim == self.head_hidden_size
* more consistency in reset_classifier signature, add typing
* asserts in some heads where pooling cannot be disabled
Fix #2194
2024-06-07 13:53:00 -07:00
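The consistency work above revolves around the forward_features() / forward_head() split, with `pre_logits=True` returning the features just before the classifier. A toy sketch of the calling pattern being tested (illustrative names and arithmetic, not a timm model):

```python
class ToyModel:
    """Toy stand-in for the forward_features()/forward_head() contract."""

    def forward_features(self, x):
        return [v * 2.0 for v in x]               # stand-in for the backbone

    def forward_head(self, feats, pre_logits=False):
        pooled = sum(feats) / len(feats)          # stand-in for global pooling
        if pre_logits:
            return pooled                         # features just before the classifier
        return [pooled * w for w in (1.0, -1.0)]  # stand-in classifier

m = ToyModel()
feats = m.forward_features([1.0, 2.0, 3.0])
pre = m.forward_head(feats, pre_logits=True)
logits = m.forward_head(feats)
```

The point of the consistent signature is that `pre_logits=True` works the same way across models even when the pre-logits width differs from the unpooled feature width.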
Ross Wightman
4535a5412a
Change default serialization for push_to_hf_hub to 'both'
2024-06-07 13:40:31 -07:00
Ross Wightman
7ccb10ebff
Disable efficient_builder debug flag
2024-06-06 21:50:27 -07:00
Ross Wightman
ad026e6e33
Fix in_chans switching on create
2024-06-06 17:56:14 -07:00
Ross Wightman
fc1b66a51d
Fix first conv name for mci vit-b
2024-06-06 13:42:26 -07:00
Ross Wightman
88a1006e02
checkpoint filter fns with consistent name, add mobileclip-b pretrained cfgs
2024-06-06 12:38:52 -07:00
Ross Wightman
7d4ada6d16
Update ViTamin model defs
2024-06-06 09:16:43 -07:00
Ross Wightman
cc8a03daac
Add ConvStem and MobileCLIP hybrid model for B variant. Add full norm disable support to ConvNormAct layers
2024-06-06 09:15:27 -07:00
Ross Wightman
3c9d8e5b33
Merge remote-tracking branch 'origin/efficientnet_x' into fastvit_mobileclip
2024-06-05 17:35:15 -07:00
Ross Wightman
5756a81c55
Merge remote-tracking branch 'origin/Beckschen-vitamin' into fastvit_mobileclip
2024-06-05 15:20:54 -07:00
Ross Wightman
58591a97f7
Enable features_only properly
2024-06-04 16:57:16 -07:00
Ross Wightman
1b66ec7cf3
Fixup ViTamin, add hub weight reference
2024-06-03 17:14:03 -07:00
Ross Wightman
b2c0aeb0ec
Merge branch 'main' of https://github.com/Beckschen/pytorch-image-models into Beckschen-vitamin
2024-06-02 14:16:30 -07:00
Ross Wightman
7f96538052
Add missing lkc act for mobileclip fastvits
2024-05-31 11:59:51 -07:00
Ross Wightman
a503639bcc
Add mobileclip fastvit model defs, support extra SE. Add forward_intermediates API to fastvit
2024-05-30 10:17:38 -07:00
Ross Wightman
5fa6efa158
Add anti-aliasing support to mobilenetv3 and efficientnet family models. Update MobileNetV4 model defs, resolutions. Fix #599
...
* create_aa helper function centralized for all timm uses (resnet, convbnact helper)
* allow BlurPool w/ pre-defined channels (expand)
* mobilenetv4 UIB block using ConvNormAct layers for improved clarity, esp with AA added
* improve more mobilenetv3 and efficientnet related type annotations
2024-05-27 22:06:22 -07:00
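The anti-aliasing above is blur pooling: low-pass filter before striding instead of plain strided subsampling, which reduces aliasing artifacts. A 1-D sketch with a [1, 2, 1]/4 binomial filter (illustrative only, not the timm BlurPool implementation):

```python
def blur_pool_1d(x, stride=2):
    """Blur with a [1, 2, 1]/4 filter, then subsample by `stride`."""
    # reflect-pad by 1 so the filter is defined at the borders
    padded = [x[1]] + x + [x[-2]]
    blurred = [(padded[i - 1] + 2 * padded[i] + padded[i + 1]) / 4.0
               for i in range(1, len(padded) - 1)]
    return blurred[::stride]

signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
out = blur_pool_1d(signal)   # half-length, smoothed output
```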
Ross Wightman
5dce710101
Add vit_little in12k + in12k-ft-in1k weights
2024-05-27 14:56:03 -07:00
Ross Wightman
3c0283f9ef
Fix reparameterize for NextViT. Fix #2187
2024-05-27 14:48:58 -07:00
Ross Wightman
4ff7c25766
Pass layer_scale_init_value to Mnv3Features module
2024-05-24 16:44:50 -07:00
Ross Wightman
a12b72b5c4
Fix missing head_norm arg pop for feature model
2024-05-24 15:50:34 -07:00
Ross Wightman
7fe96e7a92
More MobileNet-v4 fixes
...
* missed final norm after post pooling 1x1 PW head conv
* improve repr of model by flipping a few modules to None when not used, nn.Sequential for MultiQueryAttention query/key/value/output
* allow layer scaling to be enabled/disabled at model variant level, conv variants don't use it
2024-05-24 15:09:29 -07:00
Ross Wightman
28d76a97db
Mixed up kernel size for last blocks in mnv4-conv-small
2024-05-24 11:50:42 -07:00
Ross Wightman
0c6a69e7ef
Add comments to MNV4 model defs with block variants
2024-05-23 15:54:05 -07:00
Ross Wightman
cb33956b20
Fix some mistakes in mnv4 model defs
2024-05-23 14:24:32 -07:00
Ross Wightman
cee79dada0
Merge remote-tracking branch 'origin/main' into efficientnet_x
2024-05-23 11:01:39 -07:00
Ross Wightman
6a8bb03330
Initial MobileNetV4 pass
2024-05-23 10:49:18 -07:00
Ross Wightman
e748805be3
Add regex matching support to AttentionExtract. Add return_dict support to graph extractors and use returned output in AttentionExtractor
2024-05-22 14:33:39 -07:00
Ross Wightman
84cb225ecb
Add in12k + 12k_ft_in1k vit_medium weights
2024-05-20 15:52:46 -07:00
Beckschen
7a2ad6bce1
Add link to model weights on Hugging Face
2024-05-17 06:51:35 -04:00
Beckschen
530fb49e7e
Add link to model weights on Hugging Face
2024-05-17 06:48:59 -04:00
Ross Wightman
cd0e7b11ff
Merge pull request #2180 from yvonwin/main
...
Remove a duplicate function in mobilenetv3.py
2024-05-15 07:54:17 -07:00
Ross Wightman
83aee5c28c
Add explicit GAP (avg pool) variants of other SigLIP models.
2024-05-15 07:53:19 -07:00
yvonwin
58f2f79b04
Remove a duplicate function in mobilenetv3.py: `_gen_lcnet` was defined twice, so remove the duplicate code.
2024-05-15 17:59:34 +08:00
Ross Wightman
7b3b11b63f
Support loading of paligemma weights into GAP variants of SigLIP ViT. Minor tweak to npz loading for packed transformer weights.
2024-05-14 15:44:37 -07:00
Beckschen
df304ffbf2
the dataclass init needs to use the default factory pattern, according to Ross
2024-05-14 15:10:05 -04:00
Ross Wightman
a69863ad61
Merge pull request #2156 from huggingface/hiera
...
WIP Hiera implementation.
2024-05-13 14:58:12 -07:00
Ross Wightman
f7aa0a1a71
Add missing vit_wee weight
2024-05-13 12:05:47 -07:00
Ross Wightman
7a4e987b9f
Hiera weights on hub
2024-05-13 11:43:22 -07:00
Ross Wightman
23f09af08e
Merge branch 'main' into efficientnet_x
2024-05-12 21:31:08 -07:00
Ross Wightman
c838c4233f
Add typing to reset_classifier() on other models
2024-05-12 11:12:00 -07:00
Ross Wightman
3e03b2bf3f
Fix a few more hiera API issues
2024-05-12 11:11:45 -07:00
Ross Wightman
211d18d8ac
Move norm & pool into Hiera ClassifierHead. Misc fixes, update features_intermediate() naming
2024-05-11 23:37:35 -07:00
Ross Wightman
2ca45a4ff5
Merge remote-tracking branch 'upstream/main' into hiera
2024-05-11 15:43:05 -07:00
Ross Wightman
1d3ab176bc
Remove debug / staging code
2024-05-10 22:16:34 -07:00
Ross Wightman
aa4d06a11c
sbb vit weights on hub, testing
2024-05-10 17:15:01 -07:00
Ross Wightman
3582ca499e
Prepping weight push, benchmarking.
2024-05-10 14:14:06 -07:00
Ross Wightman
2bfa5e5d74
Remove JIT activations, take jit out of ME activations. Remove other instances of torch.jit.script, which breaks torch.compile and is much less performant. Remove SpaceToDepthModule
2024-05-06 16:32:49 -07:00
Beckschen
99d4c7d202
add ViTamin models
2024-05-05 02:50:14 -04:00
Ross Wightman
07535f408a
Add AttentionExtract helper module
2024-05-04 14:10:00 -07:00
Ross Wightman
45b7ae8029
forward_intermediates() support for byob/byoanet models
2024-05-04 14:06:52 -07:00
Ross Wightman
c4b8897e9e
attention -> attn in davit for model consistency
2024-05-04 14:06:11 -07:00
Ross Wightman
cb57a96862
Fix early stop for efficientnet/mobilenetv3 fwd inter. Fix indices typing for all fwd inter.
2024-05-04 10:21:58 -07:00
Ross Wightman
01dd01b70e
forward_intermediates() for MlpMixer models and RegNet.
2024-05-04 10:21:03 -07:00
Ross Wightman
f8979d4f50
Comment out time local files while testing new vit weights
2024-05-03 20:26:56 -07:00
Ross Wightman
c719f7eb86
More forward_intermediates() updates
...
* add convnext, resnet, efficientformer, levit support
* remove kwargs only for fn so that torchscript isn't broken for all :(
* use reset_classifier() consistently in prune
2024-05-03 16:22:32 -07:00
Ross Wightman
d6da4fb01e
Add forward_intermediates() to efficientnet / mobilenetv3 based models as an exercise.
2024-05-02 14:19:16 -07:00
Ross Wightman
c22efb9765
Add wee & little vits for some experiments
2024-05-02 10:51:35 -07:00
Ross Wightman
67332fce24
Add features_intermediate() support to coatnet, maxvit, swin* models. Refine feature interface. Start prep of new vit weights.
2024-04-30 16:56:33 -07:00
user-miner1
740f4983b3
Assert messages added
2024-04-30 10:10:02 +03:00
Ross Wightman
c6db4043cd
Update forward_intermediates for hiera to have its own fwd impl w/ early stopping. Remove return_intermediates bool from forward(). Still an fx issue with None mask arg :(
2024-04-29 17:23:37 -07:00
Ross Wightman
9b9a356a04
Add forward_intermediates support for xcit, cait, and volo.
2024-04-29 16:30:45 -07:00
Ross Wightman
ef147fd2fb
Add forward_intermediates API to Hiera for features_only=True support
2024-04-21 11:30:41 -07:00
Ross Wightman
d88bed6535
Bit more Hiera fiddling
2024-04-21 09:36:57 -07:00
Ross Wightman
8a54d2a930
WIP Hiera implementation. Fix #2083 . Trying to get image size adaptation to work.
2024-04-20 09:47:17 -07:00
Ross Wightman
d6b95520f1
Merge pull request #2136 from huggingface/vit_features_only
...
Exploring vit features_only via new forward_intermediates() API, inspired by #2131
2024-04-11 08:38:20 -07:00
Ross Wightman
4b2565e4cb
More forward_intermediates() / FeatureGetterNet work
...
* include relpos vit
* refactor reduction / size calcs so hybrid vits work and dynamic_img_size works
* fix -ve feature indices when pruning
* fix mvitv2 w/ class token
* refine naming
* add tests
2024-04-10 15:11:34 -07:00
Ross Wightman
ef9c6fb846
forward_head(): consistent pre_logits handling to reduce the likelihood of issues for people manually replacing the .head module
2024-04-09 21:54:59 -07:00
Ross Wightman
679daef76a
More forward_intermediates() & features_only work
...
* forward_intermediates() added to beit, deit, eva, mvitv2, twins, vit, vit_sam
* add features_only to forward intermediates to allow just intermediate features
* fix #2060
* fix #1374
* fix #657
2024-04-09 21:29:16 -07:00
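The forward_intermediates() entries above share one calling pattern: run the blocks once, collect hidden states at the requested indices, and return them alongside (or instead of) the final output. A toy sketch of that shape (stand-in blocks and names, not the timm API surface):

```python
class ToyModel:
    """Toy stand-in for the forward_intermediates() calling pattern."""

    def __init__(self):
        # four "layers"; each adds its index to the running value
        self.blocks = [lambda x, i=i: x + i for i in range(4)]

    def forward_intermediates(self, x, indices=None, intermediates_only=False):
        indices = set(indices if indices is not None else range(len(self.blocks)))
        intermediates = []
        for i, blk in enumerate(self.blocks):
            x = blk(x)
            if i in indices:
                intermediates.append(x)   # collect hidden state at this depth
        if intermediates_only:
            return intermediates          # features_only-style usage
        return x, intermediates

m = ToyModel()
final, feats = m.forward_intermediates(0, indices=[1, 3])
```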
Ross Wightman
17b892f703
Fix #2139 , disable strict weight loading when head changes from classification
2024-04-09 08:41:37 -07:00
Ross Wightman
5fdc0b4e93
Exploring vit features_only using get_intermediate_layers() as per #2131
2024-04-07 11:24:45 -07:00
Ross Wightman
34b41b143c
Fiddling with efficientnet x/h defs, is it worth adding & training any?
2024-03-22 17:55:02 -07:00
Ross Wightman
c559c3911f
Improve vit conversions. OpenAI convert pass through main convert for patch & pos resize. Fix #2120
2024-03-21 10:00:43 -07:00
Ross Wightman
256cf19148
Rename tinyclip models to fit existing 'clip' variants, use consistently mapped OpenCLIP compatible checkpoint on hf hub
2024-03-20 15:21:46 -07:00
Thien Tran
1a1d07d479
add other tinyclip
2024-03-19 07:27:09 +08:00
Thien Tran
dfffffac55
add tinyclip 8m
2024-03-19 07:02:17 +08:00
Ross Wightman
6ccb7d6a7c
Merge pull request #2111 from jamesljlster/enhance_vit_get_intermediate_layers
...
Vision Transformer (ViT) get_intermediate_layers: enhanced to support dynamic image size and saved computational costs from unused blocks
2024-03-18 13:41:18 -07:00
Cheng-Ling Lai
db06b56d34
Saved computational costs of get_intermediate_layers() from unused blocks
2024-03-17 21:34:06 +08:00
Cheng-Ling Lai
4731e4efc4
Modified ViT get_intermediate_layers() to support dynamic image size
2024-03-16 23:07:21 +08:00
SmilingWolf
59cb0be595
SwinV2: add configurable act_layer argument
...
Defaults to "gelu", but makes it possible to pass "gelu_tanh".
Makes it easier to port weights from JAX/Flax, where the tanh
approximation is the default.
2024-03-05 22:04:17 +01:00
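The "gelu" vs "gelu_tanh" distinction above: the exact GELU uses the error function, while the tanh variant (the JAX/Flax default) uses a tanh approximation. The two agree closely, which is why ported weights still work, but they are not bit-identical. A sketch of the two formulas:

```python
import math

def gelu_exact(x):
    # exact GELU via erf
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # tanh approximation used by "gelu_tanh"
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

# worst-case disagreement over a small grid in [-5, 5]
diff = max(abs(gelu_exact(x / 10) - gelu_tanh(x / 10)) for x in range(-50, 51))
```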
Ross Wightman
31e0dc0a5d
Tweak hgnet before merge
2024-02-12 15:00:32 -08:00
Ross Wightman
3e03491e49
Merge branch 'master' of https://github.com/seefun/pytorch-image-models into seefun-master
2024-02-12 14:59:54 -08:00
Ross Wightman
59239d9df5
Cleanup imports for vit relpos
2024-02-10 21:40:57 -08:00
Ross Wightman
ac1b08deb6
fix_init on vit & relpos vit
2024-02-10 20:15:37 -08:00
Ross Wightman
935950cc11
Fix F.sdpa attn drop prob
2024-02-10 20:14:47 -08:00
Ross Wightman
0737cf231d
Add Next-ViT
2024-02-10 17:05:16 -08:00
Ross Wightman
d6c2cc91af
Make NormMlpClassifier head reset args consistent with ClassifierHead
2024-02-10 16:25:33 -08:00
Ross Wightman
87fec3dc14
Update experimental vit model configs
2024-02-10 16:05:58 -08:00
Ross Wightman
7d3c2dc993
Add group_matcher for DaViT
2024-02-10 14:58:45 -08:00
Ross Wightman
88889de923
Fix meshgrid deprecation warnings and backward compat with explicit 'ndgrid' and 'meshgrid' fn w/o indexing arg
2024-01-27 13:48:33 -08:00
Ross Wightman
d4386219c6
Improve type handling for arange & rel pos embeds, keep calculations in float32 until application (may change to apply in float32 in future). Prevent arange type hijacking by DeepSpeed Zero
2024-01-26 16:35:51 -08:00
Ross Wightman
3234daf783
Add missing deprecation mapping for a densenet and xcit model. Fix #2086 . Tweak xcit pos embed use of arange for better low prec safety.
2024-01-24 22:04:04 -08:00
Li zhuoqun
53a4888328
Add droppath and type hint to Xception.
2024-01-19 11:15:47 -08:00
方曦
9dbea3bef6
fix cls head in hgnet
2023-12-27 21:26:26 +08:00
SeeFun
56ae8b906d
fix reset head in hgnet
2023-12-27 20:11:29 +08:00
SeeFun
6862c9850a
fix backward in hgnet
2023-12-27 16:49:37 +08:00
SeeFun
6cd28bc5c2
Merge branch 'huggingface:main' into master
2023-12-27 16:43:37 +08:00
Ross Wightman
f2fdd97e9f
Add parsable json results output for train.py, tweak --pretrained-path to force head adaptation
2023-12-22 11:18:25 -08:00
LR
e0079c92da
Update eva.py ( #2058 )
...
* Update eva.py
When argument class_token=False, self.cls_token = None.
Prevents error from attempting trunc_normal_ on None:
AttributeError: 'NoneType' object has no attribute 'uniform_'
* Update eva.py
fix
2023-12-16 15:10:45 -08:00
Li zhuoqun
7da34a999a
add type annotations in the code of swin_transformer_v2
2023-12-15 09:31:25 -08:00
Fredo Guan
bbe798317f
Update EdgeNeXt to use ClassifierHead as per ConvNeXt ( #2051 )
...
* Update edgenext.py
2023-12-11 12:17:19 -08:00
Ross Wightman
60b170b200
Add --pretrained-path arg to train script to allow passing local checkpoint as pretrained. Add missing/unexpected keys log.
2023-12-11 12:10:29 -08:00
Fredo Guan
2597ce2860
Update davit.py
2023-12-11 11:13:04 -08:00
akiyuki ishikawa
2bd043ce5d
fix doc position
2023-12-05 12:00:51 -08:00
akiyuki ishikawa
4f2e1bf4cb
Add missing docs in SwinTransformerStage
2023-12-05 12:00:51 -08:00
Ross Wightman
cd8d9d9ff3
Add missing hf hub entries for mvitv2
2023-11-26 21:06:39 -08:00
Ross Wightman
b996c1a0f5
A few more missed hf hub entries
2023-11-23 21:48:14 -08:00
Ross Wightman
89ec91aece
Add missing hf_hub entry for mobilenetv3_rw
2023-11-23 12:44:59 -08:00