Commit Graph

140 Commits (15ef108eb4f92bc6a11597e7059d27f0fcd7a762)

Author SHA1 Message Date
Ross Wightman 93cc08fdc5 Make evonorm variables 1d to match other PyTorch norm layers, will break weight compat for any existing use (likely minimal, easy to fix). 2021-11-20 15:50:51 -08:00
Ross Wightman af607b75cc Prep a set of ResNetV2 models with GroupNorm, EvoNormB0, EvoNormS0 for BN free model experiments on TPU and IPU 2021-11-19 17:37:00 -08:00
Ross Wightman c976a410d9 Add ResNet-50 w/ GN (resnet50_gn) and SEBotNet-33-TS (sebotnet33ts_256) model defs and weights. Update halonet50ts weights w/ slightly better variant in1k val, more robust to test sets. 2021-11-19 14:24:43 -08:00
Alexander Soare b25ff96768 wip - pre-rebase 2021-11-12 20:45:05 +00:00
Alexander Soare e051dce354 Make all models FX traceable 2021-11-12 20:45:05 +00:00
Alexander Soare 0149ec30d7 wip - attempting to rebase 2021-11-12 20:45:05 +00:00
Alexander Soare bc3d4eb403 wip -rebase 2021-11-12 20:45:05 +00:00
Ross Wightman 2ddef942b9 Better fix for #954 that doesn't break torchscript, pull torch._assert into timm namespace when it exists 2021-11-02 11:22:33 -07:00
Ross Wightman 4f0f9cb348 Fix #954 by bringing traceable _assert into timm to allow compat w/ PyTorch < 1.8 2021-11-02 09:21:40 -07:00
Ross Wightman b745d30a3e Fix formatting of last commit 2021-10-25 15:15:14 -07:00
Ross Wightman 3478f1d7f1 Traceability fix for vit models for some experiments 2021-10-25 15:13:08 -07:00
Ross Wightman f658a72e72 Cleanup re-use of Dropout modules in Mlp modules after some twitter feedback :p 2021-10-25 00:40:59 -07:00
Ross Wightman c02334d9fa Add weights for regnetz_d and haloregnetz_c, update regnetz_c weights. Add commented PyTorch XLA code for halo attention 2021-10-19 12:32:09 -07:00
Ross Wightman 02daf2ab94 Add option to include relative pos embedding in the attention scaling as per references. See discussion #912 2021-10-12 15:37:01 -07:00
Ross Wightman e2b8d44ff0 Halo, bottleneck attn, lambda layer additions and cleanup along w/ experimental model defs
* align interfaces of halo, bottleneck attn and lambda layer
* add qk_ratio to all of above, control q/k dim relative to output dim
* add experimental haloregnetz, and trionet (lambda + halo + bottle) models
2021-10-06 16:32:48 -07:00
Ross Wightman 007bc39323 Some halo and bottleneck attn code cleanup, add halonet50ts weights, use optimal crop ratios 2021-10-02 15:51:42 -07:00
Ross Wightman b1c2e3eb92 Match rel_pos_indices attr rename in conv branch 2021-09-30 23:19:05 -07:00
Ross Wightman b49630a138 Add relative pos embed option to LambdaLayer, fix last transpose/reshape. 2021-09-30 22:45:09 -07:00
Ross Wightman b81e79aae9 Fix bottleneck attn transpose typo, hopefully these train better now.. 2021-09-28 16:38:41 -07:00
Ross Wightman 515121cca1 Use reshape instead of view in std_conv, causing issues in recent PyTorch in channels_last 2021-09-23 15:43:48 -07:00
Ross Wightman 5bd04714e4 Cleanup weight init for byob/byoanet and related 2021-09-05 15:34:05 -07:00
Ross Wightman 8642401e88 Swap botnet 26/50 weights/models after realizing a mistake in arch def, now figuring out why they were so low... 2021-09-05 15:17:19 -07:00
Ross Wightman 5f12de4875 Add initial AttentionPool2d that's being trialed. Fix comment and still trying to improve reliability of sgd test. 2021-09-05 12:41:14 -07:00
Ross Wightman 492c0a4e20 Update HaloAttn comment 2021-09-01 17:14:31 -07:00
Ross Wightman 3b9032ea48 Use Tensor.unfold().unfold() for HaloAttn, fast like as_strided but more clarity 2021-08-27 12:45:53 -07:00
Ross Wightman 8449ba210c Improve performance of HaloAttn, change default dim calc. Some cleanup / fixes for byoanet. Rename resnet26ts to tfs to distinguish (extra fc). 2021-08-26 21:56:44 -07:00
Ross Wightman 925e102982 Update attention / self-attn based models from a series of experiments:
* remove dud attention, involution + my swin attention adaptation don't seem worth keeping
* add or update several new 26/50 layer ResNe(X)t variants that were used in experiments
* remove models associated with dead-end or uninteresting experiment results
* weights coming soon...
2021-08-20 16:13:11 -07:00
Ross Wightman 01cb46a9a5 Add gc_efficientnetv2_rw_t weights (global context instead of SE attn). Add TF XL weights even though the fine-tuned ones don't validate that well. Change default arg for GlobalContext to use scal (mul) mode. 2021-08-07 16:45:29 -07:00
Ross Wightman 8165cacd82 Realized LayerNorm2d won't work in all cases as is, fixed. 2021-07-05 18:21:34 -07:00
Ross Wightman b9cfb64412 Support npz custom load for vision transformer hybrid models. Add posembed rescale for npz load. 2021-06-14 12:31:44 -07:00
Ross Wightman 8319e0c373 Add file docstring to std_conv.py 2021-06-13 12:31:06 -07:00
Ross Wightman 4d96165989 Merge branch 'master' into cleanup_xla_model_fixes 2021-06-12 23:19:25 -07:00
Ross Wightman 8880f696b6 Refactoring, cleanup, improved test coverage.
* Add eca_nfnet_l2 weights, 84.7 @ 384x384
* All 'non-std' (ie transformer / mlp) models have classifier / default_cfg test added
* Fix #694 reset_classifer / num_features / forward_features / num_classes=0 consistency for transformer / mlp models
* Add direct loading of npz to vision transformer (pure transformer so far, hybrid to come)
* Rename vit_deit* to deit_*
* Remove some deprecated vit hybrid model defs
* Clean up classifier flatten for conv classifiers and unusual cases (mobilenetv3/ghostnet)
* Remove explicit model fns for levit conv, just pass in arg
2021-06-12 16:40:02 -07:00
Ross Wightman ba2ca4b464 One codepath for stdconv, switch layernorm to batchnorm so gain included. Tweak epsilon values for nfnet, resnetv2, vit hybrid. 2021-06-12 12:27:43 -07:00
Ross Wightman b7a568f065 Fix torchscript issue in bat 2021-06-08 23:19:51 -07:00
Ross Wightman 8e4ac3549f All ScaledStdConv and StdConv uses default to using F.layernorm so that they work with PyTorch XLA. eps value tweaking is a WIP. 2021-06-07 17:14:19 -07:00
Ross Wightman bda8ab015a Remove min channels for SelectiveKernel, divisor should cover cases well enough. 2021-05-31 15:38:56 -07:00
Ross Wightman a27f4aec4a Missed args for skresnext w/ refactoring. 2021-05-31 14:06:34 -07:00
Ross Wightman 307a935b79 Add non-local and BAT attention. Merge attn and self-attn factories into one. Add attention references to README. Add mlp 'mode' to ECA. 2021-05-31 13:18:11 -07:00
Ross Wightman 8bf63b6c6c Able to use other attn layer in EfficientNet now. Create test ECA + GC B0 configs. Make ECA more configurable. 2021-05-30 12:47:02 -07:00
Ross Wightman 9611458e19 Throw in some FBNetV3 code I had lying around, some refactoring of SE reduction channel calcs for all EffNet archs. 2021-05-28 20:47:24 -07:00
Ross Wightman f615474be3 Fix broken test, repvgg block doesn't have attn_last attr. 2021-05-27 18:12:22 -07:00
Ross Wightman 742c2d5247 Add Gather-Excite and Global Context attn modules. Refactor existing SE-like attn for consistency and refactor byob/byoanet for less redundancy. 2021-05-27 18:03:29 -07:00
Ross Wightman 9c78de8c02 Fix #661, move hardswish out of default args for LeViT. Enable native torch support for hardswish, hardsigmoid, mish if present. 2021-05-26 15:28:42 -07:00
Ross Wightman f45de37690 Merge branch 'master' into levit_visformer_rednet 2021-05-22 16:34:31 -07:00
Ross Wightman d5af752117 Add preliminary gMLP and ResMLP impl to Mlp-Mixer 2021-05-19 09:55:05 -07:00
Ross Wightman 3bffc701f1 Merge branch 'master' into levit_visformer_rednet 2021-05-14 23:02:12 -07:00
Ross Wightman ecc7552c5c Add levit, levit_c, and visformer model defs. Largely untested and not finished cleanup. 2021-05-14 17:16:34 -07:00
Ross Wightman 165fb354b2 Add initial RedNet model / Involution layer impl for testing 2021-05-14 17:16:34 -07:00
Ross Wightman c4f482a08b EfficientNetV2 official impl w/ weights ported from TF. Cleanup/refactor of related EfficientNet classes and models. 2021-05-14 15:50:00 -07:00