140 Commits

Author SHA1 Message Date
Ross Wightman
93cc08fdc5 Make evonorm variables 1d to match other PyTorch norm layers; this will break weight compat for any existing use (likely minimal, easy to fix). 2021-11-20 15:50:51 -08:00
Ross Wightman
af607b75cc Prep a set of ResNetV2 models with GroupNorm, EvoNormB0, EvoNormS0 for BN-free model experiments on TPU and IPU 2021-11-19 17:37:00 -08:00
Ross Wightman
c976a410d9 Add ResNet-50 w/ GN (resnet50_gn) and SEBotNet-33-TS (sebotnet33ts_256) model defs and weights. Update halonet50ts weights w/ a variant that's slightly better on in1k val and more robust on test sets. 2021-11-19 14:24:43 -08:00
Alexander Soare
b25ff96768 wip - pre-rebase 2021-11-12 20:45:05 +00:00
Alexander Soare
e051dce354 Make all models FX traceable 2021-11-12 20:45:05 +00:00
Alexander Soare
0149ec30d7 wip - attempting to rebase 2021-11-12 20:45:05 +00:00
Alexander Soare
bc3d4eb403 wip - rebase 2021-11-12 20:45:05 +00:00
Ross Wightman
2ddef942b9 Better fix for #954 that doesn't break torchscript, pull torch._assert into timm namespace when it exists 2021-11-02 11:22:33 -07:00
Ross Wightman
4f0f9cb348 Fix #954 by bringing traceable _assert into timm to allow compat w/ PyTorch < 1.8 2021-11-02 09:21:40 -07:00
Ross Wightman
b745d30a3e Fix formatting of last commit 2021-10-25 15:15:14 -07:00
Ross Wightman
3478f1d7f1 Traceability fix for vit models for some experiments 2021-10-25 15:13:08 -07:00
Ross Wightman
f658a72e72 Clean up re-use of Dropout modules in Mlp modules after some Twitter feedback :p 2021-10-25 00:40:59 -07:00
Ross Wightman
c02334d9fa Add weights for regnetz_d and haloregnetz_c, update regnetz_c weights. Add commented PyTorch XLA code for halo attention 2021-10-19 12:32:09 -07:00
Ross Wightman
02daf2ab94 Add option to include relative pos embedding in the attention scaling as per references. See discussion #912 2021-10-12 15:37:01 -07:00
Ross Wightman
e2b8d44ff0 Halo, bottleneck attn, lambda layer additions and cleanup along w/ experimental model defs
* align interfaces of halo, bottleneck attn and lambda layer
* add qk_ratio to all of the above to control q/k dim relative to output dim
* add experimental haloregnetz and trionet (lambda + halo + bottleneck) models
2021-10-06 16:32:48 -07:00
Ross Wightman
007bc39323 Some halo and bottleneck attn code cleanup, add halonet50ts weights, use optimal crop ratios 2021-10-02 15:51:42 -07:00
Ross Wightman
b1c2e3eb92 Match rel_pos_indices attr rename in conv branch 2021-09-30 23:19:05 -07:00
Ross Wightman
b49630a138 Add relative pos embed option to LambdaLayer, fix last transpose/reshape. 2021-09-30 22:45:09 -07:00
Ross Wightman
b81e79aae9 Fix bottleneck attn transpose typo, hopefully these train better now.. 2021-09-28 16:38:41 -07:00
Ross Wightman
515121cca1 Use reshape instead of view in std_conv; view was causing issues with channels_last in recent PyTorch 2021-09-23 15:43:48 -07:00
Ross Wightman
5bd04714e4 Clean up weight init for byob/byoanet and related 2021-09-05 15:34:05 -07:00
Ross Wightman
8642401e88 Swap botnet 26/50 weights/models after realizing a mistake in arch def, now figuring out why they were so low... 2021-09-05 15:17:19 -07:00
Ross Wightman
5f12de4875 Add initial AttentionPool2d that's being trialed. Fix a comment; still trying to improve reliability of the sgd test. 2021-09-05 12:41:14 -07:00
Ross Wightman
492c0a4e20 Update HaloAttn comment 2021-09-01 17:14:31 -07:00
Ross Wightman
3b9032ea48 Use Tensor.unfold().unfold() for HaloAttn; fast like as_strided but with more clarity 2021-08-27 12:45:53 -07:00
Ross Wightman
8449ba210c Improve performance of HaloAttn, change default dim calc. Some cleanup / fixes for byoanet. Rename resnet26ts to tfs to distinguish (extra fc). 2021-08-26 21:56:44 -07:00
Ross Wightman
925e102982 Update attention / self-attn based models from a series of experiments:
* remove dud attention; involution + my swin attention adaptation don't seem worth keeping
* add or update several new 26/50 layer ResNe(X)t variants that were used in experiments
* remove models associated with dead-end or uninteresting experiment results
* weights coming soon...
2021-08-20 16:13:11 -07:00
Ross Wightman
01cb46a9a5 Add gc_efficientnetv2_rw_t weights (global context instead of SE attn). Add TF XL weights even though the fine-tuned ones don't validate that well. Change default arg for GlobalContext to use scale (mul) mode. 2021-08-07 16:45:29 -07:00
Ross Wightman
8165cacd82 Realized LayerNorm2d won't work in all cases as is, fixed. 2021-07-05 18:21:34 -07:00
Ross Wightman
b9cfb64412 Support npz custom load for vision transformer hybrid models. Add posembed rescale for npz load. 2021-06-14 12:31:44 -07:00
Ross Wightman
8319e0c373 Add file docstring to std_conv.py 2021-06-13 12:31:06 -07:00
Ross Wightman
4d96165989 Merge branch 'master' into cleanup_xla_model_fixes 2021-06-12 23:19:25 -07:00
Ross Wightman
8880f696b6 Refactoring, cleanup, improved test coverage.
* Add eca_nfnet_l2 weights, 84.7 @ 384x384
* All 'non-std' (i.e. transformer / mlp) models have classifier / default_cfg tests added
* Fix #694 reset_classifier / num_features / forward_features / num_classes=0 consistency for transformer / mlp models
* Add direct loading of npz to vision transformer (pure transformer so far, hybrid to come)
* Rename vit_deit* to deit_*
* Remove some deprecated vit hybrid model defs
* Clean up classifier flatten for conv classifiers and unusual cases (mobilenetv3/ghostnet)
* Remove explicit model fns for levit conv, just pass in arg
2021-06-12 16:40:02 -07:00
Ross Wightman
ba2ca4b464 One codepath for stdconv, switch layernorm to batchnorm so gain is included. Tweak epsilon values for nfnet, resnetv2, vit hybrid. 2021-06-12 12:27:43 -07:00
Ross Wightman
b7a568f065 Fix torchscript issue in bat 2021-06-08 23:19:51 -07:00
Ross Wightman
8e4ac3549f All ScaledStdConv and StdConv uses now default to F.layer_norm so that they work with PyTorch XLA. eps value tweaking is a WIP. 2021-06-07 17:14:19 -07:00
Ross Wightman
bda8ab015a Remove min channels for SelectiveKernel, divisor should cover cases well enough. 2021-05-31 15:38:56 -07:00
Ross Wightman
a27f4aec4a Fix args missed for skresnext w/ refactoring. 2021-05-31 14:06:34 -07:00
Ross Wightman
307a935b79 Add non-local and BAT attention. Merge attn and self-attn factories into one. Add attention references to README. Add mlp 'mode' to ECA. 2021-05-31 13:18:11 -07:00
Ross Wightman
8bf63b6c6c Able to use other attn layers in EfficientNet now. Create test ECA + GC B0 configs. Make ECA more configurable. 2021-05-30 12:47:02 -07:00
Ross Wightman
9611458e19 Throw in some FBNetV3 code I had lying around, some refactoring of SE reduction channel calcs for all EffNet archs. 2021-05-28 20:47:24 -07:00
Ross Wightman
f615474be3 Fix broken test, repvgg block doesn't have attn_last attr. 2021-05-27 18:12:22 -07:00
Ross Wightman
742c2d5247 Add Gather-Excite and Global Context attn modules. Refactor existing SE-like attn for consistency and refactor byob/byoanet for less redundancy. 2021-05-27 18:03:29 -07:00
Ross Wightman
9c78de8c02 Fix #661, move hardswish out of default args for LeViT. Enable native torch support for hardswish, hardsigmoid, mish if present. 2021-05-26 15:28:42 -07:00
Ross Wightman
f45de37690 Merge branch 'master' into levit_visformer_rednet 2021-05-22 16:34:31 -07:00
Ross Wightman
d5af752117 Add preliminary gMLP and ResMLP impl to Mlp-Mixer 2021-05-19 09:55:05 -07:00
Ross Wightman
3bffc701f1 Merge branch 'master' into levit_visformer_rednet 2021-05-14 23:02:12 -07:00
Ross Wightman
ecc7552c5c Add levit, levit_c, and visformer model defs. Largely untested; cleanup not finished. 2021-05-14 17:16:34 -07:00
Ross Wightman
165fb354b2 Add initial RedNet model / Involution layer impl for testing 2021-05-14 17:16:34 -07:00
Ross Wightman
c4f482a08b EfficientNetV2 official impl w/ weights ported from TF. Cleanup/refactor of related EfficientNet classes and models. 2021-05-14 15:50:00 -07:00