Commit Graph

1278 Commits (72f0edb7e88556dafd9d2c4bc16bd5c19f84834b)

Author SHA1 Message Date
Ross Wightman cdc7bcea69 Make 2d attention pool modules compatible with head interface. Use attention pool in CLIP ResNets as head. Make separate set of GAP models w/ avg pool instead of attn pool. 2024-06-11 21:32:07 -07:00
Ross Wightman c63da1405c Pretrained cfg name mismatch 2024-06-11 21:16:54 -07:00
Ross Wightman 88efca1be2 First set of MobileNetV4 weights trained in timm 2024-06-11 18:53:01 -07:00
Ross Wightman 30ffa152de Fix load of larger ResNet CLIP models, experimenting with making AttentionPool *the* head, seems to fine-tune better, one less layer. 2024-06-10 12:07:14 -07:00
Ross Wightman 5e9ff5798f Adding pos embed resize fns to FX autowrap exceptions 2024-06-10 12:06:47 -07:00
Ross Wightman f0fb471b26 Remove separate ConvNormActAa class, merge with ConvNormAct 2024-06-10 12:05:35 -07:00
Ross Wightman 5efa15b2a2 Mapping OpenAI CLIP Modified ResNet weights -> ByobNet. Improve AttentionPool2d layers. Fix #1731 2024-06-09 16:54:48 -07:00
Ross Wightman 7702d9afa1 ViTamin in_chans !=3 weight load fix 2024-06-07 20:39:23 -07:00
Ross Wightman 66a0eb4673 Experimenting with tiny test models, how small can they go and be useful for regression tests? 2024-06-07 16:09:25 -07:00
Ross Wightman 5ee06760dc Fix classifier input dim for mnv3 after last changes 2024-06-07 13:53:13 -07:00
Ross Wightman a5a2ad2e48 Fix consistency, testing for forward_head w/ pre_logits, reset_classifier, models with pre_logits size != unpooled feature size
* add test that model supports forward_head(x, pre_logits=True)
* add head_hidden_size attr to all models and set differently from num_features attr when head has hidden layers
* test forward_features() feat dim == model.num_features and pre_logits feat dim == self.head_hidden_size
* more consistency in reset_classifier signature, add typing
* asserts in some heads where pooling cannot be disabled
Fix #2194
2024-06-07 13:53:00 -07:00
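The head-consistency contract described in the commit above (forward_features() returns `num_features`-dim features, forward_head(pre_logits=True) returns `head_hidden_size`-dim features) can be illustrated with a minimal sketch. This is not timm code — the class and shapes are hypothetical stand-ins using plain lists instead of tensors, just to show the interface being tested:

```python
# Illustrative sketch of the head interface contract (NOT timm's implementation).
# A model with a hidden head layer has head_hidden_size != num_features.
class ToyModel:
    def __init__(self, num_features=8, head_hidden_size=4, num_classes=2):
        self.num_features = num_features          # unpooled backbone feature dim
        self.head_hidden_size = head_hidden_size  # pre-logits dim when head has hidden layers
        self.num_classes = num_classes

    def forward_features(self, x):
        # pretend backbone output: a vector of length num_features
        return [0.0] * self.num_features

    def forward_head(self, feats, pre_logits=False):
        hidden = [0.0] * self.head_hidden_size    # hidden head layer output
        if pre_logits:
            return hidden                          # dim == head_hidden_size
        return [0.0] * self.num_classes            # dim == num_classes

    def reset_classifier(self, num_classes: int, global_pool: str = 'avg'):
        self.num_classes = num_classes

m = ToyModel()
assert len(m.forward_features(None)) == m.num_features
assert len(m.forward_head(None, pre_logits=True)) == m.head_hidden_size
```

The regression tests referenced in the commit check exactly these invariants across all models.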
Ross Wightman 4535a5412a Change default serialization for push_to_hf_hub to 'both' 2024-06-07 13:40:31 -07:00
Ross Wightman 7ccb10ebff Disable efficient_builder debug flag 2024-06-06 21:50:27 -07:00
Ross Wightman ad026e6e33 Fix in_chans switching on create 2024-06-06 17:56:14 -07:00
Ross Wightman fc1b66a51d Fix first conv name for mci vit-b 2024-06-06 13:42:26 -07:00
Ross Wightman 88a1006e02 checkpoint filter fns with consistent name, add mobileclip-b pretrained cfgs 2024-06-06 12:38:52 -07:00
Ross Wightman 7d4ada6d16 Update ViTamin model defs 2024-06-06 09:16:43 -07:00
Ross Wightman cc8a03daac Add ConvStem and MobileCLIP hybrid model for B variant. Add full norm disable support to ConvNormAct layers 2024-06-06 09:15:27 -07:00
Ross Wightman 3c9d8e5b33 Merge remote-tracking branch 'origin/efficientnet_x' into fastvit_mobileclip 2024-06-05 17:35:15 -07:00
Ross Wightman 5756a81c55 Merge remote-tracking branch 'origin/Beckschen-vitamin' into fastvit_mobileclip 2024-06-05 15:20:54 -07:00
Ross Wightman 58591a97f7 Enable features_only properly 2024-06-04 16:57:16 -07:00
Ross Wightman 1b66ec7cf3 Fixup ViTamin, add hub weight reference 2024-06-03 17:14:03 -07:00
Ross Wightman b2c0aeb0ec Merge branch 'main' of https://github.com/Beckschen/pytorch-image-models into Beckschen-vitamin 2024-06-02 14:16:30 -07:00
Ross Wightman 7f96538052 Add missing lkc act for mobileclip fastvits 2024-05-31 11:59:51 -07:00
Ross Wightman a503639bcc Add mobileclip fastvit model defs, support extra SE. Add forward_intermediates API to fastvit 2024-05-30 10:17:38 -07:00
Ross Wightman 5fa6efa158 Add anti-aliasing support to mobilenetv3 and efficientnet family models. Update MobileNetV4 model defs, resolutions. Fix #599
* create_aa helper function centralized for all timm uses (resnet, convbnact helper)
* allow BlurPool w/ pre-defined channels (expand)
* mobilenetv4 UIB block using ConvNormAct layers for improved clarity, esp with AA added
* improve more mobilenetv3 and efficientnet related type annotations
2024-05-27 22:06:22 -07:00
Ross Wightman 5dce710101 Add vit_little in12k + in12k-ft-in1k weights 2024-05-27 14:56:03 -07:00
Ross Wightman 3c0283f9ef Fix reparameterize for NextViT. Fix #2187 2024-05-27 14:48:58 -07:00
Ross Wightman 4ff7c25766 Pass layer_scale_init_value to Mnv3Features module 2024-05-24 16:44:50 -07:00
Ross Wightman a12b72b5c4 Fix missing head_norm arg pop for feature model 2024-05-24 15:50:34 -07:00
Ross Wightman 7fe96e7a92 More MobileNet-v4 fixes
* missed final norm after post pooling 1x1 PW head conv
* improve repr of model by flipping a few modules to None when not used, nn.Sequential for MultiQueryAttention query/key/value/output
* allow layer scaling to be enabled/disabled at model variant level, conv variants don't use it
2024-05-24 15:09:29 -07:00
Ross Wightman 28d76a97db Mixed up kernel size for last blocks in mnv4-conv-small 2024-05-24 11:50:42 -07:00
Ross Wightman 0c6a69e7ef Add comments to MNV4 model defs with block variants 2024-05-23 15:54:05 -07:00
Ross Wightman cb33956b20 Fix some mistakes in mnv4 model defs 2024-05-23 14:24:32 -07:00
Ross Wightman cee79dada0 Merge remote-tracking branch 'origin/main' into efficientnet_x 2024-05-23 11:01:39 -07:00
Ross Wightman 6a8bb03330 Initial MobileNetV4 pass 2024-05-23 10:49:18 -07:00
Ross Wightman e748805be3 Add regex matching support to AttentionExtract. Add return_dict support to graph extractors and use returned output in AttentionExtractor 2024-05-22 14:33:39 -07:00
Ross Wightman 84cb225ecb Add in12k + 12k_ft_in1k vit_medium weights 2024-05-20 15:52:46 -07:00
Beckschen 7a2ad6bce1 Add link to model weights on Hugging Face 2024-05-17 06:51:35 -04:00
Beckschen 530fb49e7e Add link to model weights on Hugging Face 2024-05-17 06:48:59 -04:00
Ross Wightman cd0e7b11ff
Merge pull request #2180 from yvonwin/main
Remove a duplicate function in mobilenetv3.py
2024-05-15 07:54:17 -07:00
Ross Wightman 83aee5c28c Add explicit GAP (avg pool) variants of other SigLIP models. 2024-05-15 07:53:19 -07:00
yvonwin 58f2f79b04 Remove a duplicate function in mobilenetv3.py: `_gen_lcnet` is repeated in mobilenetv3.py. Remove the duplicate code. 2024-05-15 17:59:34 +08:00
Ross Wightman 7b3b11b63f Support loading of paligemma weights into GAP variants of SigLIP ViT. Minor tweak to npz loading for packed transformer weights. 2024-05-14 15:44:37 -07:00
Beckschen df304ffbf2 the dataclass init needs to use the default factory pattern, according to Ross 2024-05-14 15:10:05 -04:00
Ross Wightman a69863ad61
Merge pull request #2156 from huggingface/hiera
WIP Hiera implementation.
2024-05-13 14:58:12 -07:00
Ross Wightman f7aa0a1a71 Add missing vit_wee weight 2024-05-13 12:05:47 -07:00
Ross Wightman 7a4e987b9f Hiera weights on hub 2024-05-13 11:43:22 -07:00
Ross Wightman 23f09af08e Merge branch 'main' into efficientnet_x 2024-05-12 21:31:08 -07:00
Ross Wightman c838c4233f Add typing to reset_classifier() on other models 2024-05-12 11:12:00 -07:00
Ross Wightman 3e03b2bf3f Fix a few more hiera API issues 2024-05-12 11:11:45 -07:00
Ross Wightman 211d18d8ac Move norm & pool into Hiera ClassifierHead. Misc fixes, update features_intermediate() naming 2024-05-11 23:37:35 -07:00
Ross Wightman 2ca45a4ff5 Merge remote-tracking branch 'upstream/main' into hiera 2024-05-11 15:43:05 -07:00
Ross Wightman 1d3ab176bc Remove debug / staging code 2024-05-10 22:16:34 -07:00
Ross Wightman aa4d06a11c sbb vit weights on hub, testing 2024-05-10 17:15:01 -07:00
Ross Wightman 3582ca499e Prepping weight push, benchmarking. 2024-05-10 14:14:06 -07:00
Ross Wightman 2bfa5e5d74 Remove JIT activations, take jit out of ME (memory-efficient) activations. Remove other instances of torch.jit.script, which breaks torch.compile and is much less performant. Remove SpaceToDepthModule 2024-05-06 16:32:49 -07:00
Beckschen 99d4c7d202 add ViTamin models 2024-05-05 02:50:14 -04:00
Ross Wightman 07535f408a Add AttentionExtract helper module 2024-05-04 14:10:00 -07:00
Ross Wightman 45b7ae8029 forward_intermediates() support for byob/byoanet models 2024-05-04 14:06:52 -07:00
Ross Wightman c4b8897e9e attention -> attn in davit for model consistency 2024-05-04 14:06:11 -07:00
Ross Wightman cb57a96862 Fix early stop for efficientnet/mobilenetv3 fwd inter. Fix indices typing for all fwd inter. 2024-05-04 10:21:58 -07:00
Ross Wightman 01dd01b70e forward_intermediates() for MlpMixer models and RegNet. 2024-05-04 10:21:03 -07:00
Ross Wightman f8979d4f50 Comment out time local files while testing new vit weights 2024-05-03 20:26:56 -07:00
Ross Wightman c719f7eb86 More forward_intermediates() updates
* add convnext, resnet, efficientformer, levit support
* remove kwargs only for fn so that torchscript isn't broken for all :(
* use reset_classifier() consistently in prune
2024-05-03 16:22:32 -07:00
Ross Wightman d6da4fb01e Add forward_intermediates() to efficientnet / mobilenetv3 based models as an exercise. 2024-05-02 14:19:16 -07:00
Ross Wightman c22efb9765 Add wee & little vits for some experiments 2024-05-02 10:51:35 -07:00
Ross Wightman 67332fce24 Add features_intermediate() support to coatnet, maxvit, swin* models. Refine feature interface. Start prep of new vit weights. 2024-04-30 16:56:33 -07:00
user-miner1 740f4983b3 Assert messages added 2024-04-30 10:10:02 +03:00
Ross Wightman c6db4043cd Update forward_intermediates for hiera to have its own fwd impl w/ early stopping. Remove return_intermediates bool from forward(). Still an fx issue with None mask arg :( 2024-04-29 17:23:37 -07:00
Ross Wightman 9b9a356a04 Add forward_intermediates support for xcit, cait, and volo. 2024-04-29 16:30:45 -07:00
Ross Wightman ef147fd2fb Add forward_intermediates API to Hiera for features_only=True support 2024-04-21 11:30:41 -07:00
Ross Wightman d88bed6535 Bit more Hiera fiddling 2024-04-21 09:36:57 -07:00
Ross Wightman 8a54d2a930 WIP Hiera implementation. Fix #2083. Trying to get image size adaptation to work. 2024-04-20 09:47:17 -07:00
Ross Wightman d6b95520f1
Merge pull request #2136 from huggingface/vit_features_only
Exploring vit features_only via new forward_intermediates() API, inspired by #2131
2024-04-11 08:38:20 -07:00
Ross Wightman 4b2565e4cb More forward_intermediates() / FeatureGetterNet work
* include relpos vit
* refactor reduction / size calcs so hybrid vits work and dynamic_img_size works
* fix -ve feature indices when pruning
* fix mvitv2 w/ class token
* refine naming
* add tests
2024-04-10 15:11:34 -07:00
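The "-ve feature indices when pruning" fix above concerns how forward_intermediates() resolves which stages to return (and how deep the network must run before later blocks can be pruned). A minimal sketch of that index normalization, assuming semantics like timm's (the actual helper may differ):

```python
# Sketch of feature-index normalization for forward_intermediates()-style APIs.
# Not timm's actual helper; semantics assumed: None -> all stages,
# int n -> last n stages, negative indices count from the end.
def take_indices(indices, num_stages):
    if indices is None:
        indices = list(range(num_stages))
    elif isinstance(indices, int):
        indices = list(range(num_stages - indices, num_stages))
    out = []
    for i in indices:
        idx = num_stages + i if i < 0 else i   # wrap negatives
        assert 0 <= idx < num_stages, f'index {i} out of range for {num_stages} stages'
        out.append(idx)
    # deepest stage needed; blocks after it can be pruned for early stopping
    max_index = max(out)
    return out, max_index

assert take_indices([-1, -2], 12) == ([11, 10], 11)
assert take_indices(3, 12) == ([9, 10, 11], 11)
```

Getting the negative-index wrap right is what allows pruned models to keep only the blocks up to `max_index`.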
Ross Wightman ef9c6fb846 forward_head(), consistent pre_logits handling to reduce likelihood of people manually replacing .head module having issues 2024-04-09 21:54:59 -07:00
Ross Wightman 679daef76a More forward_intermediates() & features_only work
* forward_intermediates() added to beit, deit, eva, mvitv2, twins, vit, vit_sam
* add features_only to forward intermediates to allow just intermediate features
* fix #2060
* fix #1374
* fix #657
2024-04-09 21:29:16 -07:00
Ross Wightman 17b892f703 Fix #2139, disable strict weight loading when head changes from classification 2024-04-09 08:41:37 -07:00
Ross Wightman 5fdc0b4e93 Exploring vit features_only using get_intermediate_layers() as per #2131 2024-04-07 11:24:45 -07:00
Ross Wightman 34b41b143c Fiddling with efficientnet x/h defs, is it worth adding & training any? 2024-03-22 17:55:02 -07:00
Ross Wightman c559c3911f Improve vit conversions. OpenAI convert pass through main convert for patch & pos resize. Fix #2120 2024-03-21 10:00:43 -07:00
Ross Wightman 256cf19148 Rename tinyclip models to fit existing 'clip' variants, use consistently mapped OpenCLIP compatible checkpoint on hf hub 2024-03-20 15:21:46 -07:00
Thien Tran 1a1d07d479 add other tinyclip 2024-03-19 07:27:09 +08:00
Thien Tran dfffffac55 add tinyclip 8m 2024-03-19 07:02:17 +08:00
Ross Wightman 6ccb7d6a7c
Merge pull request #2111 from jamesljlster/enhance_vit_get_intermediate_layers
Vision Transformer (ViT) get_intermediate_layers: enhanced to support dynamic image size and saved computational costs from unused blocks
2024-03-18 13:41:18 -07:00
Cheng-Ling Lai db06b56d34
Saved computational costs of get_intermediate_layers() from unused blocks 2024-03-17 21:34:06 +08:00
Cheng-Ling Lai 4731e4efc4
Modified ViT get_intermediate_layers() to support dynamic image size 2024-03-16 23:07:21 +08:00
SmilingWolf 59cb0be595 SwinV2: add configurable act_layer argument
Defaults to "gelu", but makes it possible to pass "gelu_tanh".
Makes it easier to port weights from JAX/Flax, where the tanh
approximation is the default.
2024-03-05 22:04:17 +01:00
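The "gelu" vs "gelu_tanh" distinction in the commit above is numerically small but matters for bit-exact weight ports. A self-contained sketch of the two formulations (standard definitions, not timm code) shows how close the JAX/Flax-default tanh approximation is to the exact erf form:

```python
import math

def gelu_exact(x):
    # exact GELU: x * Phi(x), via the error function
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # tanh approximation, the default in JAX/Flax
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

# the two agree to within ~1e-3 everywhere, but are not bit-identical
for v in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(gelu_exact(v) - gelu_tanh(v)) < 1e-3
```

Making act_layer configurable lets a ported SwinV2 checkpoint use whichever variant it was trained with, instead of absorbing the small mismatch.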
Ross Wightman 31e0dc0a5d Tweak hgnet before merge 2024-02-12 15:00:32 -08:00
Ross Wightman 3e03491e49 Merge branch 'master' of https://github.com/seefun/pytorch-image-models into seefun-master 2024-02-12 14:59:54 -08:00
Ross Wightman 59239d9df5 Cleanup imports for vit relpos 2024-02-10 21:40:57 -08:00
Ross Wightman ac1b08deb6 fix_init on vit & relpos vit 2024-02-10 20:15:37 -08:00
Ross Wightman 935950cc11 Fix F.sdpa attn drop prob 2024-02-10 20:14:47 -08:00
Ross Wightman 0737cf231d Add Next-ViT 2024-02-10 17:05:16 -08:00
Ross Wightman d6c2cc91af Make NormMlpClassifier head reset args consistent with ClassifierHead 2024-02-10 16:25:33 -08:00
Ross Wightman 87fec3dc14 Update experimental vit model configs 2024-02-10 16:05:58 -08:00
Ross Wightman 7d3c2dc993 Add group_matcher for DaViT 2024-02-10 14:58:45 -08:00
Ross Wightman 88889de923 Fix meshgrid deprecation warnings and backward compat with explicit 'ndgrid' and 'meshgrid' fn w/o indexing arg 2024-01-27 13:48:33 -08:00
Ross Wightman d4386219c6 Improve type handling for arange & rel pos embeds, keep calculations in float32 until application (may change to apply in float32 in future). Prevent arange type hijacking by DeepSpeed Zero 2024-01-26 16:35:51 -08:00
Ross Wightman 3234daf783 Add missing deprecation mapping for a densenet and xcit model. Fix #2086. Tweak xcit pos embed use of arange for better low prec safety. 2024-01-24 22:04:04 -08:00
Li zhuoqun 53a4888328 Add droppath and type hint to Xception. 2024-01-19 11:15:47 -08:00
方曦 9dbea3bef6 fix cls head in hgnet 2023-12-27 21:26:26 +08:00
SeeFun 56ae8b906d
fix reset head in hgnet 2023-12-27 20:11:29 +08:00
SeeFun 6862c9850a
fix backward in hgnet 2023-12-27 16:49:37 +08:00
SeeFun 6cd28bc5c2
Merge branch 'huggingface:main' into master 2023-12-27 16:43:37 +08:00
Ross Wightman f2fdd97e9f Add parsable json results output for train.py, tweak --pretrained-path to force head adaptation 2023-12-22 11:18:25 -08:00
LR e0079c92da
Update eva.py (#2058)
* Update eva.py

When class_token=False, self.cls_token is None.

Prevents an error from calling trunc_normal_ on None:
AttributeError: 'NoneType' object has no attribute 'uniform_'

* Update eva.py

fix
2023-12-16 15:10:45 -08:00
Li zhuoqun 7da34a999a add type annotations in the code of swin_transformer_v2 2023-12-15 09:31:25 -08:00
Fredo Guan bbe798317f
Update EdgeNeXt to use ClassifierHead as per ConvNeXt (#2051)
* Update edgenext.py
2023-12-11 12:17:19 -08:00
Ross Wightman 60b170b200 Add --pretrained-path arg to train script to allow passing local checkpoint as pretrained. Add missing/unexpected keys log. 2023-12-11 12:10:29 -08:00
Fredo Guan 2597ce2860 Update davit.py 2023-12-11 11:13:04 -08:00
akiyuki ishikawa 2bd043ce5d fix doc position 2023-12-05 12:00:51 -08:00
akiyuki ishikawa 4f2e1bf4cb Add missing docs in SwinTransformerStage 2023-12-05 12:00:51 -08:00
Ross Wightman cd8d9d9ff3 Add missing hf hub entries for mvitv2 2023-11-26 21:06:39 -08:00
Ross Wightman b996c1a0f5 A few more missed hf hub entries 2023-11-23 21:48:14 -08:00
Ross Wightman 89ec91aece Add missing hf_hub entry for mobilenetv3_rw 2023-11-23 12:44:59 -08:00
Dillon Laird 63ee54853c fixed intermediate output indices 2023-11-22 16:32:41 -08:00
Ross Wightman fa06f6c481 Merge branch 'seefun-efficientvit' 2023-11-21 14:06:27 -08:00
Ross Wightman c6b0c98963 Upload weights to hub, tweak crop_pct, comment out SAM EfficientViTs for now (v2 weights coming) 2023-11-21 14:05:04 -08:00
Ross Wightman ada145b016 Literal use w/ python < 3.8 requires typing_extensions, catch instead of check sys ver 2023-11-21 09:48:03 -08:00
Ross Wightman dfaab97d20 More consistency in model arg/kwarg merge handling 2023-11-21 09:48:03 -08:00
Ross Wightman 3775e4984f Merge branch 'efficientvit' of github.com:seefun/pytorch-image-models into seefun-efficientvit 2023-11-20 16:21:38 -08:00
Ross Wightman dfb8658100 Fix a few missed model deprecations and one missed pretrained cfg 2023-11-20 12:41:49 -08:00
Ross Wightman a604011935 Add support for passing model args via hf hub config 2023-11-19 15:16:01 -08:00
方曦 c9d093a58e update norm eps for efficientvit large 2023-11-18 17:46:47 +08:00
Laureηt 21647c0a0c Add types to vision_transformers.py 2023-11-17 16:06:06 -08:00
方曦 87ba43a9bc add efficientvit large series 2023-11-17 13:58:46 +08:00
Ross Wightman 7c685a4ef3 Fix openai quickgelu loading and add missing orig_in21k vit weights and remove zero'd classifier w/ matching hub update 2023-11-16 19:16:28 -08:00
LittleNyima ef72c3cd47 Add warnings for duplicate registry names 2023-11-08 10:18:59 -08:00
Ross Wightman d3e83a190f Add in12k fine-tuned convnext_xxlarge 2023-11-03 14:35:01 -07:00
Ross Wightman dcfdba1f5f Make quickgelu models appear in listing 2023-11-03 11:01:41 -07:00
Ross Wightman 96bd162ddb Add cc-by-nc-4.0 license for metaclip, make note in quickgelu model def about pretrained_cfg mapping 2023-11-03 11:01:41 -07:00
Ross Wightman 6894ec7edc Forgot about datcomp b32 models 2023-11-03 11:01:41 -07:00
Ross Wightman a2e4a4c148 Add quickgelu vit clip variants, simplify get_norm_layer and allow string args in vit norm/act. Add metaclip CLIP weights 2023-11-03 11:01:41 -07:00
Ross Wightman c55bc41a42 DFN CLIP ViT support 2023-10-31 12:16:21 -07:00
a-r-r-o-w d5f1525334 include suggestions from review
Co-Authored-By: Ross Wightman <rwightman@gmail.com>
2023-10-30 13:47:54 -07:00
a-r-r-o-w 5f14bdd564 include typing suggestions by @rwightman 2023-10-30 13:47:54 -07:00
a-r-r-o-w 05b0aaca51 improvement: add typehints and docs to timm/models/resnet.py 2023-10-30 13:47:54 -07:00
a-r-r-o-w c2fe0a2268 improvement: add typehints and docs to timm/models/mobilenetv3.py 2023-10-30 13:47:54 -07:00
Laureηt d023154bb5 Update swin_transformer.py
make `SwinTransformer`'s `patch_embed` customizable through the constructor
2023-10-30 13:47:14 -07:00
Ross Wightman 68a121402f Added hub weights for dinov2 register models 2023-10-29 23:03:48 -07:00
Ross Wightman 3f02392488 Add DINOv2 models with register tokens. Convert pos embed to non-overlapping for consistency. 2023-10-29 23:03:48 -07:00
Patrick Labatut 97450d618a Update DINOv2 license to Apache 2.0 2023-10-27 09:12:51 -07:00
mjamroz 7a6369156f avoid getting undefined 2023-10-22 21:36:23 -07:00
pUmpKin-Co 8556462a18 fix doc typo in resnetv2 2023-10-20 11:56:50 -07:00
Ross Wightman 462fb3ec9f Push new repvit weights to hub, tweak tag names 2023-10-20 11:49:29 -07:00
Ross Wightman 5309424d5e Merge branch 'main' of https://github.com/jameslahm/pytorch-image-models into jameslahm-main 2023-10-20 11:08:12 -07:00
Ross Wightman d3ebdcfd93 Disable strict load when siglip vit pooling removed 2023-10-19 12:03:40 -07:00
Ross Wightman e728f3efdb Cleanup ijepa models, they're just gap (global-avg-pool) models w/o heads. fc-norm conversion was wrong, gigantic should have been giant 2023-10-17 15:44:46 -07:00