pytorch-image-models

mirror of https://github.com/huggingface/pytorch-image-models.git synced 2025-06-03 15:01:08 +08:00

Author	SHA1	Message	Date
Ross Wightman	962958723c	More Hiera updates. Add forward_intermediates to hieradet/sam2 impl. Make both use same classifier module. Add coarse bool to intermediates.	2024-08-16 11:10:04 -07:00
Ross Wightman	f2cfb4c677	Add WIP HieraDet impl (SAM2 backbone support)	2024-08-15 17:58:15 -07:00
Ross Wightman	a50e53d41f	Rename global pos embed for Hiera abswin, factor out commonly used vit weight init fns to layers. Add a channels-last ver of normmlp head.	2024-08-15 17:46:36 -07:00
Ross Wightman	2f3fed43b8	Fix hiera init with num_classes=0, fix weight tag names for sbb2 hiera/vit weights, add LayerScale/LayerScale2d to layers	2024-08-15 11:14:38 -07:00
Ross Wightman	ab8cb070fc	Add xavier_uniform init of MNVC hybrid attention modules. Small improvement in training stability.	2024-07-26 17:03:40 -07:00
Ross Wightman	cec70b6779	Merge pull request #2225 from huggingface/small_things Small things	2024-07-25 20:29:13 -07:00
Ross Wightman	7e0caa1ba3	Padding helpers work if tuples/lists passed	2024-07-19 14:28:03 -07:00
Ross Wightman	2180800646	MQA query_strides bugs, fix #2237. No padding for avg_pool2d if not 'same', use scale_factor for Upsample.	2024-07-19 14:26:54 -07:00
Ross Wightman	392b78aee7	set_input_size initial impl for vit & swin v1. Move HybridEmbed to own location in timm/layers	2024-07-17 15:25:48 -07:00
Ross Wightman	57adc1acc8	Fix rotary embed version of attn pool. Bit of cleanup/naming	2024-06-11 23:49:17 -07:00
Ross Wightman	cdc7bcea69	Make 2d attention pool modules compatible with head interface. Use attention pool in CLIP ResNets as head. Make separate set of GAP models w/ avg pool instead of attn pool.	2024-06-11 21:32:07 -07:00
Ross Wightman	30ffa152de	Fix load of larger ResNet CLIP models, experimenting with making AttentionPool the head, seems to fine-tune better, one less layer.	2024-06-10 12:07:14 -07:00
Ross Wightman	5e9ff5798f	Adding pos embed resize fns to FX autowrap exceptions	2024-06-10 12:06:47 -07:00
Ross Wightman	f0fb471b26	Remove separate ConvNormActAa class, merge with ConvNormAct	2024-06-10 12:05:35 -07:00
Ross Wightman	5efa15b2a2	Mapping OpenAI CLIP Modified ResNet weights -> ByobNet. Improve AttentionPool2d layers. Fix #1731	2024-06-09 16:54:48 -07:00
Ross Wightman	cc8a03daac	Add ConvStem and MobileCLIP hybrid model for B variant. Add full norm disable support to ConvNormAct layers	2024-06-06 09:15:27 -07:00
Ross Wightman	5fa6efa158	Add anti-aliasing support to mobilenetv3 and efficientnet family models. Update MobileNetV4 model defs, resolutions. Fix #599 * create_aa helper function centralized for all timm uses (resnet, convbnact helper) * allow BlurPool w/ pre-defined channels (expand) * mobilenetv4 UIB block using ConvNormAct layers for improved clarity, esp with AA added * improve more mobilenetv3 and efficientnet related type annotations	2024-05-27 22:06:22 -07:00
Ross Wightman	7fe96e7a92	More MobileNet-v4 fixes * missed final norm after post pooling 1x1 PW head conv * improve repr of model by flipping a few modules to None when not used, nn.Sequential for MultiQueryAttention query/key/value/output * allow layer scaling to be enabled/disabled at model variant level, conv variants don't use it	2024-05-24 15:09:29 -07:00
Ross Wightman	70176a2dae	torchscript typing fixes	2024-05-23 11:43:05 -07:00
Ross Wightman	2a1a6b1236	Adding missing attention2d.py	2024-05-23 11:06:32 -07:00
Ross Wightman	cee79dada0	Merge remote-tracking branch 'origin/main' into efficientnet_x	2024-05-23 11:01:39 -07:00
Ross Wightman	6a8bb03330	Initial MobileNetV4 pass	2024-05-23 10:49:18 -07:00
Fernando Cossio	9b11801cb4	Credit earlier work with the same idea. Hi, this earlier work has the same name and idea behind this layer. It could be useful for readers to keep both links here if they want to see the effects of introducing this layer on a very different domain. 😄	2024-05-16 22:50:34 +02:00
Ross Wightman	211d18d8ac	Move norm & pool into Hiera ClassifierHead. Misc fixes, update features_intermediate() naming	2024-05-11 23:37:35 -07:00
Ross Wightman	2bfa5e5d74	Remove JIT activations, take jit out of ME activations. Remove other instances of torch.jit.script. Breaks torch.compile and is much less performant. Remove SpaceToDepthModule	2024-05-06 16:32:49 -07:00
Ross Wightman	301d0bb21f	Stricter check on pool_type for adaptive pooling module. Fix #2159	2024-05-03 16:16:51 -07:00
Ross Wightman	4b2565e4cb	More forward_intermediates() / FeatureGetterNet work * include relpos vit * refactor reduction / size calcs so hybrid vits work and dynamic_img_size works * fix -ve feature indices when pruning * fix mvitv2 w/ class token * refine naming * add tests	2024-04-10 15:11:34 -07:00
Ross Wightman	d6c2cc91af	Make NormMlpClassifier head reset args consistent with ClassifierHead	2024-02-10 16:25:33 -08:00
Ross Wightman	7bc7798d0e	Type annotation correctness for create_act	2024-02-10 14:57:58 -08:00
Ross Wightman	88889de923	Fix meshgrid deprecation warnings and backward compat with explicit 'ndgrid' and 'meshgrid' fn w/o indexing arg	2024-01-27 13:48:33 -08:00
Ross Wightman	d4386219c6	Improve type handling for arange & rel pos embeds, keep calculations in float32 until application (may change to apply in float32 in future). Prevent arange type hijacking by DeepSpeed Zero	2024-01-26 16:35:51 -08:00
kalazus	7f19a4cce7	fix fast catavgmax selection	2024-01-16 10:30:08 -08:00
Ross Wightman	df7ae11eb2	Add device arg for patch embed resize, fix #2024	2023-12-04 11:42:13 -08:00
Ross Wightman	9fab8d8f58	Fix break of 2 years old torchvision installs :/	2023-11-04 02:32:09 -07:00
Ross Wightman	f7762fee78	Consistency handling None / empty string inputs to norm / act create fns	2023-11-03 11:01:41 -07:00
Ross Wightman	a2e4a4c148	Add quickgelu vit clip variants, simplify get_norm_layer and allow string args in vit norm/act. Add metaclip CLIP weights	2023-11-03 11:01:41 -07:00
a-r-r-o-w	d5f1525334	include suggestions from review Co-Authored-By: Ross Wightman <rwightman@gmail.com>	2023-10-30 13:47:54 -07:00
a-r-r-o-w	5f14bdd564	include typing suggestions by @rwightman	2023-10-30 13:47:54 -07:00
Laureηt	fe92fd93e5	fix adaptive_avgmax_pool.py remove extra whitespace in `SelectAdaptivePool2d`'s `__repr__`	2023-10-29 23:03:36 -07:00
Tush9905	89ba0da910	Fixed Typos Fixed the typos in helpers.py and CONTRIBUTING.md	2023-10-21 21:46:31 -07:00
Ross Wightman	49a459e8f1	Merge remote-tracking branch 'upstream/main' into vit_siglip_and_reg	2023-10-17 09:36:48 -07:00
Ross Wightman	a58f9162d7	Missed __init__.py update for attention pooling layer add	2023-10-17 09:28:21 -07:00
Ross Wightman	71365165a2	Add SigLIP weights	2023-10-16 23:26:08 -07:00
lucapericlp	7ce65a83a2	Removing unused self.drop	2023-10-05 11:20:57 -07:00
Ross Wightman	9caf32b93f	Move levit style pos bias resize with other rel pos bias utils	2023-09-01 11:05:56 -07:00
方曦	170a5b6e27	add tinyvit	2023-09-01 11:05:56 -07:00
Ross Wightman	fc5d705b83	dynamic_size -> dynamic_img_size, add dynamic_img_pad for padding option	2023-08-27 15:58:35 -07:00
Ross Wightman	1f4512fca3	Support dynamic_resize in eva.py models	2023-08-27 15:58:35 -07:00
Ross Wightman	fdd8c7c2da	Initial impl of dynamic resize for existing vit models (incl vit-resnet hybrids)	2023-08-27 15:58:35 -07:00
Ross Wightman	c153cd4a3e	Add more advanced interpolation method from BEiT and support non-square window & image size adaptation for * beit/beit-v2 * maxxvit/coatnet * swin transformer And non-square windows for swin-v2	2023-08-08 16:41:16 -07:00

79 Commits