Ross Wightman
3a8a965891
Implement absolute+window pos embed for hiera, resizable but needs new weights
2024-07-18 21:43:37 -07:00
Ross Wightman
7160af4a24
Merge pull request #2229 from Promisery/reg_token
...
Initialize weights of reg_token for ViT
2024-07-18 09:25:29 -07:00
Ross Wightman
392b78aee7
set_input_size initial impl for vit & swin v1. Move HybridEmbed to its own location in timm/layers
2024-07-17 15:25:48 -07:00
Ross Wightman
34c9fee554
Fix pass-through of input / target keys to ImageDataset readers so args work with hfds instead of just hfids (iterable)
2024-07-17 10:11:46 -07:00
Promisery
417cf7f871
Initialize weights of reg_token for ViT
2024-07-13 11:11:42 +08:00
Ross Wightman
f920119f3b
Fixing tests
2024-07-09 14:53:20 -07:00
Ross Wightman
644abf9588
Fix default_cfg test for mobilenet_100
2024-07-09 12:52:24 -07:00
Ross Wightman
d5afe106dc
Merge remote-tracking branch 'origin/tiny_test_models' into small_things
2024-07-09 12:49:57 -07:00
Ross Wightman
55101028bb
Rename test_tiny* -> test*. Fix ByobNet BasicBlock attn location and add test_byobnet model.
2024-07-09 11:53:11 -07:00
Ross Wightman
1334598462
Add support back to EfficientNet to disable head_conv / bn2 so mobilenetv1 can be implemented properly
2024-07-08 13:51:26 -07:00
Ross Wightman
800405d941
Add conv_large mobilenetv3 aa/blur model defs
2024-07-08 13:50:05 -07:00
Ross Wightman
f81b094aaa
Add 'qkv_bias_separate' flag for EVA/beit/swinv2 attn modules to allow an override for easy quantization wrappers. Fix #2098
2024-07-08 13:48:38 -07:00
Ross Wightman
83c2c2f0c5
Add 'Maybe' PIL / image tensor conversions in case image is already in tensor format
2024-07-08 13:43:51 -07:00
Steffen Schneider
c01a47c9e7
Fix typo in type annotations in timm.models.hrnet
2024-07-08 00:53:16 +02:00
Daniel Suess
197c10463b
Fix jit.script breaking with features_fx
2024-06-28 03:58:51 +00:00
Ross Wightman
b751da692d
Add latest ix (xavier init for mqa) hybrid medium & large weights for MobileNetV4
2024-06-24 13:49:55 -07:00
Ross Wightman
d4d4d84fda
Dev version 1.0.8.dev0
2024-06-24 11:34:13 -07:00
Ross Wightman
f8342a045a
Merge pull request #2213 from huggingface/florence2
...
Fix #2212 map florence2 image tower to davit with a few changes
2024-06-24 11:01:08 -07:00
Sejik
c33a001397
Fix typo
2024-06-24 21:54:38 +09:00
Ross Wightman
02d0f27721
cleanup davit padding
2024-06-22 12:06:46 -07:00
Ross Wightman
c715c724e7
Fix tracing by removing float cast, should end up float anyways
2024-06-22 08:35:30 -07:00
Ross Wightman
fb58a73033
Fix #2212 map florence2 image tower to davit with a few changes
2024-06-21 15:31:29 -07:00
Ross Wightman
b28945ff05
Version 1.0.7, prep for release
2024-06-18 16:19:43 -07:00
Ross Wightman
fb13e6385e
Merge pull request #2203 from huggingface/more_mobile
...
Add mobilenet edgetpu defs for exp, add ol mobilenet v1 back for completeness / comparison
2024-06-18 15:20:01 -07:00
Ross Wightman
16e082e1c2
Add mobilenetv4 hybrid-large weights
2024-06-17 11:08:31 -07:00
Ross Wightman
e41125cc83
Merge pull request #2209 from huggingface/fcossio-vit-maxpool
...
ViT pooling refactor
2024-06-17 07:51:12 -07:00
Ross Wightman
a22466852d
Add 2400 epoch mobilenetv4 small weights, almost at paper, rounds to 73.8
2024-06-16 10:51:00 -07:00
Ross Wightman
b1a6f4a946
Some missed reset_classifier() type annotations
2024-06-16 10:39:27 -07:00
Ross Wightman
71101ebba0
Refactor vit pooling to add more reduction options, separately callable
2024-06-14 23:16:58 -07:00
Ross Wightman
a0bb5b4a44
Missing stem_kernel_size argument in EfficientNetFeatures
2024-06-14 13:39:31 -07:00
Fernando Cossio
9567cf6d84
Feature: add option global_pool='max' to VisionTransformer
...
Most of the CNNs have a max global pooling option. I would like to extend ViT to have this option.
2024-06-14 15:24:54 +02:00
Ross Wightman
9613c76844
Add mobilenet edgetpu defs for exp, add ol mobilenet v1 back for completeness / comparison
2024-06-13 17:33:04 -07:00
Ross Wightman
22de845add
Prepping for final MobileCLIP weight locations (#2199)
...
* Prepping for final MobileCLIP weight locations
* Update weight locations to coreml-projects
* Update mobileclip weight locations with final apple org location
2024-06-13 16:55:49 -07:00
Ross Wightman
575978ba55
Add mnv4_conv_large 384x384 weight location
2024-06-13 12:58:04 -07:00
Ross Wightman
7b5f17d1bd
Update README.md, bump dev version 1.0.6
2024-06-12 12:35:44 -07:00
Ross Wightman
e42e453128
Fix mnv4 conv_large weight link, reorder mnv4 pretrained cfg for proper precedence
2024-06-12 11:16:49 -07:00
Ross Wightman
7b0a5321cb
Merge pull request #2198 from huggingface/openai_clip_resnet
...
Mapping OpenAI CLIP Modified ResNet weights -> ByobNet.
2024-06-12 09:33:30 -07:00
Ross Wightman
57adc1acc8
Fix rotary embed version of attn pool. Bit of cleanup/naming
2024-06-11 23:49:17 -07:00
Ross Wightman
cdc7bcea69
Make 2d attention pool modules compatible with head interface. Use attention pool in CLIP ResNets as head. Make separate set of GAP models w/ avg pool instead of attn pool.
2024-06-11 21:32:07 -07:00
Ross Wightman
c63da1405c
Pretrained cfg name mismatch
2024-06-11 21:16:54 -07:00
Ross Wightman
88efca1be2
First set of MobileNetV4 weights trained in timm
2024-06-11 18:53:01 -07:00
Ross Wightman
30ffa152de
Fix load of larger ResNet CLIP models, experimenting with making AttentionPool *the* head, seems to fine-tune better, one less layer.
2024-06-10 12:07:14 -07:00
Ross Wightman
5e9ff5798f
Adding pos embed resize fns to FX autowrap exceptions
2024-06-10 12:06:47 -07:00
Ross Wightman
f0fb471b26
Remove separate ConvNormActAa class, merge with ConvNormAct
2024-06-10 12:05:35 -07:00
Ross Wightman
5efa15b2a2
Mapping OpenAI CLIP Modified ResNet weights -> ByobNet. Improve AttentionPool2d layers. Fix #1731
2024-06-09 16:54:48 -07:00
Ross Wightman
7702d9afa1
ViTamin in_chans !=3 weight load fix
2024-06-07 20:39:23 -07:00
Ross Wightman
66a0eb4673
Experimenting with tiny test models, how small can they go and be useful for regression tests?
2024-06-07 16:09:25 -07:00
Ross Wightman
5ee06760dc
Fix classifier input dim for mnv3 after last changes
2024-06-07 13:53:13 -07:00
Ross Wightman
a5a2ad2e48
Fix consistency, testing for forward_head w/ pre_logits, reset_classifier, models with pre_logits size != unpooled feature size
...
* add test that model supports forward_head(x, pre_logits=True)
* add head_hidden_size attr to all models and set differently from num_features attr when head has hidden layers
* test forward_features() feat dim == model.num_features and pre_logits feat dim == self.head_hidden_size
* more consistency in reset_classifier signature, add typing
* asserts in some heads where pooling cannot be disabled
Fix #2194
2024-06-07 13:53:00 -07:00
Ross Wightman
4535a5412a
Change default serialization for push_to_hf_hub to 'both'
2024-06-07 13:40:31 -07:00
Ross Wightman
5cce2185e1
Update version.py
2024-06-07 13:13:23 -07:00
Ross Wightman
7ccb10ebff
Disable efficient_builder debug flag
2024-06-06 21:50:27 -07:00
Ross Wightman
ad026e6e33
Fix in_chans switching on create
2024-06-06 17:56:14 -07:00
Ross Wightman
fc1b66a51d
Fix first conv name for mci vit-b
2024-06-06 13:42:26 -07:00
Ross Wightman
88a1006e02
checkpoint filter fns with consistent name, add mobileclip-b pretrained cfgs
2024-06-06 12:38:52 -07:00
Ross Wightman
7d4ada6d16
Update ViTamin model defs
2024-06-06 09:16:43 -07:00
Ross Wightman
cc8a03daac
Add ConvStem and MobileCLIP hybrid model for B variant. Add full norm disable support to ConvNormAct layers
2024-06-06 09:15:27 -07:00
Ross Wightman
3c9d8e5b33
Merge remote-tracking branch 'origin/efficientnet_x' into fastvit_mobileclip
2024-06-05 17:35:15 -07:00
Ross Wightman
5756a81c55
Merge remote-tracking branch 'origin/Beckschen-vitamin' into fastvit_mobileclip
2024-06-05 15:20:54 -07:00
Ross Wightman
58591a97f7
Enable features_only properly
2024-06-04 16:57:16 -07:00
Ross Wightman
1b66ec7cf3
Fixup ViTamin, add hub weight reference
2024-06-03 17:14:03 -07:00
Ross Wightman
b2c0aeb0ec
Merge branch 'main' of https://github.com/Beckschen/pytorch-image-models into Beckschen-vitamin
2024-06-02 14:16:30 -07:00
Ross Wightman
7f96538052
Add missing lkc act for mobileclip fastvits
2024-05-31 11:59:51 -07:00
Ross Wightman
a503639bcc
Add mobileclip fastvit model defs, support extra SE. Add forward_intermediates API to fastvit
2024-05-30 10:17:38 -07:00
Ross Wightman
5fa6efa158
Add anti-aliasing support to mobilenetv3 and efficientnet family models. Update MobileNetV4 model defs, resolutions. Fix #599
...
* create_aa helper function centralized for all timm uses (resnet, convbnact helper)
* allow BlurPool w/ pre-defined channels (expand)
* mobilenetv4 UIB block using ConvNormAct layers for improved clarity, esp with AA added
* improve more mobilenetv3 and efficientnet related type annotations
2024-05-27 22:06:22 -07:00
Ross Wightman
5dce710101
Add vit_little in12k + in12k-ft-in1k weights
2024-05-27 14:56:03 -07:00
Ross Wightman
3c0283f9ef
Fix reparameterize for NextViT. Fix #2187
2024-05-27 14:48:58 -07:00
Ross Wightman
4ff7c25766
Pass layer_scale_init_value to Mnv3Features module
2024-05-24 16:44:50 -07:00
Ross Wightman
a12b72b5c4
Fix missing head_norm arg pop for feature model
2024-05-24 15:50:34 -07:00
Ross Wightman
7fe96e7a92
More MobileNet-v4 fixes
...
* missed final norm after post pooling 1x1 PW head conv
* improve repr of model by flipping a few modules to None when not used, nn.Sequential for MultiQueryAttention query/key/value/output
* allow layer scaling to be enabled/disabled at model variant level, conv variants don't use it
2024-05-24 15:09:29 -07:00
Ross Wightman
28d76a97db
Mixed up kernel size for last blocks in mnv4-conv-small
2024-05-24 11:50:42 -07:00
Ross Wightman
0c6a69e7ef
Add comments to MNV4 model defs with block variants
2024-05-23 15:54:05 -07:00
Ross Wightman
cb33956b20
Fix some mistakes in mnv4 model defs
2024-05-23 14:24:32 -07:00
Ross Wightman
70176a2dae
torchscript typing fixes
2024-05-23 11:43:05 -07:00
Ross Wightman
2a1a6b1236
Adding missing attention2d.py
2024-05-23 11:06:32 -07:00
Ross Wightman
cee79dada0
Merge remote-tracking branch 'origin/main' into efficientnet_x
2024-05-23 11:01:39 -07:00
Ross Wightman
6a8bb03330
Initial MobileNetV4 pass
2024-05-23 10:49:18 -07:00
Ross Wightman
e748805be3
Add regex matching support to AttentionExtract. Add return_dict support to graph extractors and use returned output in AttentionExtractor
2024-05-22 14:33:39 -07:00
Ross Wightman
44f72c04b3
Change node/module name matching for AttentionExtract so it keeps outputs in order. #1232
2024-05-22 13:45:25 -07:00
Ross Wightman
84cb225ecb
Add in12k + 12k_ft_in1k vit_medium weights
2024-05-20 15:52:46 -07:00
Ross Wightman
4634c3e134
Version 1.0.4.dev0
2024-05-20 15:52:27 -07:00
Beckschen
7a2ad6bce1
Add link to model weights on Hugging Face
2024-05-17 06:51:35 -04:00
Beckschen
530fb49e7e
Add link to model weights on Hugging Face
2024-05-17 06:48:59 -04:00
Fernando Cossio
9b11801cb4
Credit earlier work with the same idea.
...
Hi, this earlier work has the same name and the same idea as this layer. It could be useful for readers to keep both links here if they want to see the effects of introducing this layer in a very different domain. 😄
2024-05-16 22:50:34 +02:00
Ross Wightman
cb0e4391be
Release 1.0.3
2024-05-15 11:06:22 -07:00
Ross Wightman
27fd2f35d3
Merge pull request #2181 from huggingface/Delaunay-dist-backend
...
Delaunay dist backend flag
2024-05-15 10:00:59 -07:00
Ross Wightman
e57625e814
Tweak dist_backend to use device_type (before possible :)
2024-05-15 08:49:25 -07:00
Ross Wightman
6ca92570f7
Merge branch 'patch-1' of https://github.com/Delaunay/pytorch-image-models into Delaunay-dist-backend
2024-05-15 08:40:58 -07:00
Ross Wightman
cd0e7b11ff
Merge pull request #2180 from yvonwin/main
...
Remove a duplicate function in mobilenetv3.py
2024-05-15 07:54:17 -07:00
Ross Wightman
83aee5c28c
Add explicit GAP (avg pool) variants of other SigLIP models.
2024-05-15 07:53:19 -07:00
yvonwin
58f2f79b04
Remove a duplicate function in mobilenetv3.py: `_gen_lcnet` is repeated in mobilenetv3.py. Remove the duplicate code.
2024-05-15 17:59:34 +08:00
Ross Wightman
7b3b11b63f
Support loading of paligemma weights into GAP variants of SigLIP ViT. Minor tweak to npz loading for packed transformer weights.
2024-05-14 15:44:37 -07:00
Beckschen
df304ffbf2
the dataclass init needs to use the default factory pattern, according to Ross
2024-05-14 15:10:05 -04:00
Ross Wightman
cc5f2f6f70
version 1.0.2dev0
2024-05-13 15:25:15 -07:00
Ross Wightman
3bfd036b58
Add normalize flag to transforms factory, allow return of non-normalized native dtype torch.Tensors
2024-05-13 15:23:25 -07:00
Ross Wightman
a69863ad61
Merge pull request #2156 from huggingface/hiera
...
WIP Hiera implementation.
2024-05-13 14:58:12 -07:00
Setepenre
8848dad362
Update distributed.py
2024-05-13 16:55:42 -04:00
Ross Wightman
f7aa0a1a71
Add missing vit_wee weight
2024-05-13 12:05:47 -07:00
Ross Wightman
7a4e987b9f
Hiera weights on hub
2024-05-13 11:43:22 -07:00
Ross Wightman
23f09af08e
Merge branch 'main' into efficientnet_x
2024-05-12 21:31:08 -07:00
Ross Wightman
c838c4233f
Add typing to reset_classifier() on other models
2024-05-12 11:12:00 -07:00
Ross Wightman
3e03b2bf3f
Fix a few more hiera API issues
2024-05-12 11:11:45 -07:00
Ross Wightman
211d18d8ac
Move norm & pool into Hiera ClassifierHead. Misc fixes, update features_intermediate() naming
2024-05-11 23:37:35 -07:00
Ross Wightman
2ca45a4ff5
Merge remote-tracking branch 'upstream/main' into hiera
2024-05-11 15:43:05 -07:00
Ross Wightman
1d3ab176bc
Remove debug / staging code
2024-05-10 22:16:34 -07:00
Ross Wightman
aa4d06a11c
sbb vit weights on hub, testing
2024-05-10 17:15:01 -07:00
Ross Wightman
3582ca499e
Prepping weight push, benchmarking.
2024-05-10 14:14:06 -07:00
Ross Wightman
2bfa5e5d74
Remove JIT activations, take jit out of ME activations. Remove other instances of torch.jit.script. Breaks torch.compile and is much less performant. Remove SpaceToDepthModule
2024-05-06 16:32:49 -07:00
Beckschen
99d4c7d202
add ViTamin models
2024-05-05 02:50:14 -04:00
Ross Wightman
07535f408a
Add AttentionExtract helper module
2024-05-04 14:10:00 -07:00
Ross Wightman
45b7ae8029
forward_intermediates() support for byob/byoanet models
2024-05-04 14:06:52 -07:00
Ross Wightman
c4b8897e9e
attention -> attn in davit for model consistency
2024-05-04 14:06:11 -07:00
Ross Wightman
cb57a96862
Fix early stop for efficientnet/mobilenetv3 fwd inter. Fix indices typing for all fwd inter.
2024-05-04 10:21:58 -07:00
Ross Wightman
01dd01b70e
forward_intermediates() for MlpMixer models and RegNet.
2024-05-04 10:21:03 -07:00
Ross Wightman
f8979d4f50
Comment out timm local files while testing new vit weights
2024-05-03 20:26:56 -07:00
Ross Wightman
c719f7eb86
More forward_intermediates() updates
...
* add convnext, resnet, efficientformer, levit support
* remove kwargs only for fn so that torchscript isn't broken for all :(
* use reset_classifier() consistently in prune
2024-05-03 16:22:32 -07:00
Ross Wightman
301d0bb21f
Stricter check on pool_type for adaptive pooling module. Fix #2159
2024-05-03 16:16:51 -07:00
Ross Wightman
d6da4fb01e
Add forward_intermediates() to efficientnet / mobilenetv3 based models as an exercise.
2024-05-02 14:19:16 -07:00
Ross Wightman
c22efb9765
Add wee & little vits for some experiments
2024-05-02 10:51:35 -07:00
Ross Wightman
67332fce24
Add features_intermediate() support to coatnet, maxvit, swin* models. Refine feature interface. Start prep of new vit weights.
2024-04-30 16:56:33 -07:00
user-miner1
740f4983b3
Assert messages added
2024-04-30 10:10:02 +03:00
Ross Wightman
c6db4043cd
Update forward_intermediates for hiera to have its own fwd impl w/ early stopping. Remove return_intermediates bool from forward(). Still an fx issue with None mask arg :(
2024-04-29 17:23:37 -07:00
Ross Wightman
9b9a356a04
Add forward_intermediates support for xcit, cait, and volo.
2024-04-29 16:30:45 -07:00
Ross Wightman
ef147fd2fb
Add forward_intermediates API to Hiera for features_only=True support
2024-04-21 11:30:41 -07:00
Ross Wightman
d88bed6535
Bit more Hiera fiddling
2024-04-21 09:36:57 -07:00
Ross Wightman
8a54d2a930
WIP Hiera implementation. Fix #2083. Trying to get image size adaptation to work.
2024-04-20 09:47:17 -07:00
Ross Wightman
de15b8b828
Next release will be 1.0 :o
2024-04-11 08:55:27 -07:00
Ross Wightman
c8da47a773
Update version.py
2024-04-11 08:45:50 -07:00
Ross Wightman
d6b95520f1
Merge pull request #2136 from huggingface/vit_features_only
...
Exploring vit features_only via new forward_intermediates() API, inspired by #2131
2024-04-11 08:38:20 -07:00
Ross Wightman
24f6d4f7f8
Fix #2127 move to ema device
2024-04-10 21:29:09 -07:00
Ross Wightman
4b2565e4cb
More forward_intermediates() / FeatureGetterNet work
...
* include relpos vit
* refactor reduction / size calcs so hybrid vits work and dynamic_img_size works
* fix -ve feature indices when pruning
* fix mvitv2 w/ class token
* refine naming
* add tests
2024-04-10 15:11:34 -07:00
Ross Wightman
ef9c6fb846
forward_head(), consistent pre_logits handling to reduce likelihood of people manually replacing .head module having issues
2024-04-09 21:54:59 -07:00
Ross Wightman
679daef76a
More forward_intermediates() & features_only work
...
* forward_intermediates() added to beit, deit, eva, mvitv2, twins, vit, vit_sam
* add features_only to forward intermediates to allow just intermediate features
* fix #2060
* fix #1374
* fix #657
2024-04-09 21:29:16 -07:00
Ross Wightman
c28ee2e904
Merge pull request #2145 from huggingface/fix_imagenet22k_ms_mapping
...
Add teddy-bear class back to first 1000 classes of imagenet22k_ms_synsets (line 851, index 850)
2024-04-09 14:56:31 -07:00
Ross Wightman
f5ea076a46
Merge pull request #2143 from huggingface/fix_asymm_set_grad_enable
...
Fix #2132, remove use of _C.set_grad_enable. Line endings were messed up too
2024-04-09 10:14:13 -07:00
Ross Wightman
286d941923
Add teddy-bear class back to first 1000 classes of imagenet22k_ms_synsets (index 851)
2024-04-09 09:33:08 -07:00
Ross Wightman
5c5ae8d401
Fix #2132, remove use of _C.set_grad_enable. Line endings were messed up too
2024-04-09 09:00:23 -07:00
Ross Wightman
17b892f703
Fix #2139 , disable strict weight loading when head changes from classification
2024-04-09 08:41:37 -07:00
Ross Wightman
5fdc0b4e93
Exploring vit features_only using get_intermediate_layers() as per #2131
2024-04-07 11:24:45 -07:00
fzyzcjy
b44e4e45a2
more
2024-04-02 10:25:30 +08:00
fzyzcjy
8880a5cd5c
Update scheduler.py
2024-03-23 11:27:33 +08:00
Ross Wightman
34b41b143c
Fiddling with efficientnet x/h defs, is it worth adding & training any?
2024-03-22 17:55:02 -07:00
Ross Wightman
c559c3911f
Improve vit conversions. OpenAI convert pass through main convert for patch & pos resize. Fix #2120
2024-03-21 10:00:43 -07:00
Ross Wightman
256cf19148
Rename tinyclip models to fit existing 'clip' variants, use consistently mapped OpenCLIP compatible checkpoint on hf hub
2024-03-20 15:21:46 -07:00
Thien Tran
1a1d07d479
add other tinyclip
2024-03-19 07:27:09 +08:00
Thien Tran
dfffffac55
add tinyclip 8m
2024-03-19 07:02:17 +08:00
Ross Wightman
6ccb7d6a7c
Merge pull request #2111 from jamesljlster/enhance_vit_get_intermediate_layers
...
Vision Transformer (ViT) get_intermediate_layers: enhanced to support dynamic image size and saved computational costs from unused blocks
2024-03-18 13:41:18 -07:00
Cheng-Ling Lai
db06b56d34
Saved computational costs of get_intermediate_layers() from unused blocks
2024-03-17 21:34:06 +08:00
Cheng-Ling Lai
4731e4efc4
Modified ViT get_intermediate_layers() to support dynamic image size
2024-03-16 23:07:21 +08:00
Ross Wightman
ba641e07ae
Add support for dynamo based onnx export
2024-03-13 12:05:26 -07:00