653 Commits (cc9bedf373209664854dc400cbe5801e3fc1e6e9)

Author SHA1 Message Date
Ross Wightman a426511c95 More optimizer cleanup. Change all to no longer use .data. Improve (b)float16 use with adabelief. Add XLA compatible Lars. 2021-08-18 17:21:56 -07:00
Ross Wightman 9541f4963b One more scalar -> tensor fix for lamb optimizer 2021-08-18 11:20:25 -07:00
Ross Wightman 8f68193c91 Update lamb.py comment 2021-08-18 09:27:40 -07:00
Ross Wightman 4d284017b8 Merge pull request #813 from rwightman/opt_cleanup
Optimizer cleanup and additions
2021-08-18 09:12:00 -07:00
Ross Wightman a6af48be64 add madgradw optimizer 2021-08-17 22:19:27 -07:00
Ross Wightman 55fb5eedf6 Remove experiment from lamb impl 2021-08-17 21:48:26 -07:00
Ross Wightman 8a9eca5157 A few optimizer comments, dead import, missing import 2021-08-17 18:01:33 -07:00
Ross Wightman ac469b50da Optimizer improvements, additions, cleanup
* Add MADGRAD code
* Fix Lamb (non-fused variant) to work w/ PyTorch XLA
* Tweak optimizer factory args (lr/learning_rate and opt/optimizer_name), may break compat
* Use newer fn signatures for all add, addcdiv, addcmul in optimizers
* Use upcoming PyTorch native Nadam if it's available
* Cleanup lookahead opt
* Add optimizer tests
* Remove novograd.py impl as it was messy, keep nvnovograd
* Make AdamP/SGDP work in channels_last layout
* Add rectified adabelief mode (radabelief)
* Support a few more PyTorch optim, adamax, adagrad
2021-08-17 17:51:20 -07:00
Sepehr Sameni abf3e044bb Update scheduler_factory.py
remove duplicate code from create_scheduler()
2021-08-14 22:53:17 +02:00
Ross Wightman 3cdaf5ed56 Add `mmax` config key to auto_augment for increasing upper bound of RandAugment magnitude beyond 10. Make AugMix uniform sampling default not override config setting. 2021-08-12 15:39:05 -07:00
Ross Wightman 1042b8a146 Add non fused LAMB optimizer option 2021-08-09 13:13:43 -07:00
Ross Wightman 01cb46a9a5 Add gc_efficientnetv2_rw_t weights (global context instead of SE attn). Add TF XL weights even though the fine-tuned ones don't validate that well. Change default arg for GlobalContext to use scale (mul) mode. 2021-08-07 16:45:29 -07:00
Ross Wightman d3f7440650 Add EfficientNetV2 XL model defs 2021-07-22 13:15:24 -07:00
Ross Wightman 72b227dcf5 Merge pull request #750 from drjinying/master
Specify "interpolation" mode in vision_transformer's resize_pos_embed
2021-07-13 11:01:20 -07:00
Ross Wightman 2907c1f967 Merge pull request #746 from samarth4149/master
Adding a Multi Step LR Scheduler
2021-07-13 10:55:54 -07:00
Ross Wightman 748ab852ca Allow act_layer switch for xcit, fix in_chans for some variants 2021-07-12 13:27:29 -07:00
Ying Jin 20b2d4b69d Use bicubic interpolation in resize_pos_embed() 2021-07-12 10:38:31 -07:00
Ross Wightman d3255adf8e Merge branch 'xcit' of https://github.com/alexander-soare/pytorch-image-models into alexander-soare-xcit 2021-07-12 08:30:30 -07:00
Ross Wightman f8039c7492 Fix gc effv2 model cfg name 2021-07-11 12:14:31 -07:00
Alexander Soare 3a55a30ed1 add notes from author 2021-07-11 14:25:58 +01:00
Alexander Soare 899cf84ccc bug fix - missing _dist postfix for many of the 224_dist models 2021-07-11 12:41:51 +01:00
Alexander Soare 623e8b8eb8 wip xcit 2021-07-11 09:39:38 +01:00
Ross Wightman 392368e210 Add efficientnetv2_rw_t defs w/ weights, and gc variant, as well as gcresnet26ts for experiments. Version 0.4.13 2021-07-09 16:46:52 -07:00
samarth daab57a6d9 1. Added a simple multi step LR scheduler 2021-07-09 16:18:27 -04:00
Ross Wightman 6d8272e92c Add SAM pretrained model defs/weights for ViT B16 and B32 models. 2021-07-08 11:51:12 -07:00
Ross Wightman ee4d8fc69a Remove unnecessary line from nest post refactor 2021-07-05 21:22:46 -07:00
Ross Wightman 8165cacd82 Realized LayerNorm2d won't work in all cases as is, fixed. 2021-07-05 18:21:34 -07:00
Ross Wightman 81cd6863c8 Move aggregation (convpool) for nest into NestLevel, cleanup and enable features_only use. Finalize weight url. 2021-07-05 18:20:49 -07:00
Ross Wightman 6ae0ac6420 Merge branch 'nested_transformer' of https://github.com/alexander-soare/pytorch-image-models into alexander-soare-nested_transformer 2021-07-03 12:45:26 -07:00
Alexander Soare 7b8a0017f1 wip to review 2021-07-03 12:10:12 +01:00
Alexander Soare b11d949a06 wip checkpoint with some feature extraction work 2021-07-03 11:45:19 +01:00
Alexander Soare 23bb72ce5e nested_transformer wip 2021-07-02 20:12:29 +01:00
Ross Wightman 766b4d3262 Fix features for resnetv2_50t 2021-06-28 15:56:24 -07:00
Ross Wightman e8045e712f Fix BatchNorm for ResNetV2 non GN models, add more ResNetV2 model defs for future experimentation, fix zero_init of last residual for pre-act. 2021-06-28 10:52:45 -07:00
Ross Wightman 20a2be14c3 Add gMLP-S weights, 79.6 top-1 2021-06-23 10:40:30 -07:00
Ross Wightman 85f894e03d Fix ViT in21k representation (pre_logits) layer handling across old and new npz checkpoints 2021-06-23 10:38:34 -07:00
Ross Wightman b41cffaa93 Fix a few issues loading pretrained vit/bit npz weights w/ num_classes=0 __init__ arg. Missed a few other small classifier handling details on Mlp, GhostNet, Levit. Should fix #713 2021-06-22 23:16:05 -07:00
Ross Wightman 9c9755a808 AugReg release 2021-06-20 17:46:06 -07:00
Ross Wightman 381b279785 Add hybrid model fwds back 2021-06-19 22:28:44 -07:00
Ross Wightman 26f04a8e3e Fix a weight link 2021-06-19 16:39:36 -07:00
Ross Wightman 8f4a0222ed Add GMixer-24 MLP model weights, trained w/ TPU + PyTorch XLA 2021-06-18 16:49:28 -07:00
Ross Wightman 4c09a2f169 Bump version 0.4.12 2021-06-18 16:17:34 -07:00
Ross Wightman b319eb5b5d Update ViT weights, more details to be added before merge. 2021-06-18 16:16:49 -07:00
Ross Wightman 8257b86550 Fix up resnetv2 bit/bitm model default res 2021-06-18 16:16:06 -07:00
Ross Wightman 1228f5a3d8 Add BiT distilled 50x1 and teacher 152x2 models from 'A good teacher is patient and consistent' paper. 2021-06-18 11:40:33 -07:00
Ross Wightman 511a8e8c96 Add official ResMLP weights. 2021-06-14 17:03:16 -07:00
Ross Wightman b9cfb64412 Support npz custom load for vision transformer hybrid models. Add posembed rescale for npz load. 2021-06-14 12:31:44 -07:00
Ross Wightman 8319e0c373 Add file docstring to std_conv.py 2021-06-13 12:31:06 -07:00
Ross Wightman 4d96165989 Merge branch 'master' into cleanup_xla_model_fixes 2021-06-12 23:19:25 -07:00
Ross Wightman 8880f696b6 Refactoring, cleanup, improved test coverage.
* Add eca_nfnet_l2 weights, 84.7 @ 384x384
* All 'non-std' (ie transformer / mlp) models have classifier / default_cfg test added
* Fix #694 reset_classifier / num_features / forward_features / num_classes=0 consistency for transformer / mlp models
* Add direct loading of npz to vision transformer (pure transformer so far, hybrid to come)
* Rename vit_deit* to deit_*
* Remove some deprecated vit hybrid model defs
* Clean up classifier flatten for conv classifiers and unusual cases (mobilenetv3/ghostnet)
* Remove explicit model fns for levit conv, just pass in arg
2021-06-12 16:40:02 -07:00
Ross Wightman ba2ca4b464 One codepath for stdconv, switch layernorm to batchnorm so gain included. Tweak epsilon values for nfnet, resnetv2, vit hybrid. 2021-06-12 12:27:43 -07:00
Ross Wightman b7a568f065 Fix torchscript issue in bat 2021-06-08 23:19:51 -07:00
Ross Wightman d17b374f0f Minimum input_size needed to be higher 2021-06-08 21:31:39 -07:00
Ross Wightman b3b90d944d Add min_input_size to bat_resnext to prevent test breakage. 2021-06-08 17:32:08 -07:00
Ross Wightman d413eef1bf Add ResMLP-24 model weights that I trained in PyTorch XLA on TPU-VM. 79.2 top-1. 2021-06-08 14:22:05 -07:00
Ross Wightman 10d8fa4620 Add gc and bat attention resnext26ts variants to byob for test. 2021-06-08 14:21:07 -07:00
Ross Wightman 2f5ed2dec1 Update `init_values` const for 24 and 36 layer ResMLP models 2021-06-07 17:15:04 -07:00
Ross Wightman 8e4ac3549f All ScaledStdConv and StdConv uses default to using F.layernorm so that they work with PyTorch XLA. eps value tweaking is a WIP. 2021-06-07 17:14:19 -07:00
Ross Wightman 2a63d0246b Post merge cleanup 2021-06-07 14:38:30 -07:00
Ross Wightman 45dec179e5 Merge pull request #681 from lmk123568/master
Update convit.py
2021-06-07 14:10:53 -07:00
Dongyoon Han ded1671483 Fix stochastic depth working only with a shortcut 2021-06-07 23:08:55 +09:00
Mike b87d98b238 Update convit.py
Cut out the duplicates
2021-06-06 17:58:31 +08:00
Ross Wightman 02320c3e3d Bump version to 0.4.11 2021-05-31 15:41:51 -07:00
Ross Wightman bda8ab015a Remove min channels for SelectiveKernel, divisor should cover cases well enough. 2021-05-31 15:38:56 -07:00
Ross Wightman a27f4aec4a Missed args for skresnext w/ refactoring. 2021-05-31 14:06:34 -07:00
Ross Wightman 307a935b79 Add non-local and BAT attention. Merge attn and self-attn factories into one. Add attention references to README. Add mlp 'mode' to ECA. 2021-05-31 13:18:11 -07:00
Ross Wightman 8bf63b6c6c Able to use other attn layer in EfficientNet now. Create test ECA + GC B0 configs. Make ECA more configurable. 2021-05-30 12:47:02 -07:00
Ross Wightman bcec14d3b5 Bring EfficientNet SE layer in line with others, pull se_ratio outside of blocks. Allows swapping w/ other attn layers. 2021-05-29 23:41:38 -07:00
Ross Wightman 9611458e19 Throw in some FBNetV3 code I had lying around, some refactoring of SE reduction channel calcs for all EffNet archs. 2021-05-28 20:47:24 -07:00
Ross Wightman 01b9108619 Merge branch 'master' into more_attn 2021-05-28 11:09:37 -07:00
Ross Wightman d7bab8a6c5 Fix strict flag change for checkpoint load. 2021-05-28 09:54:50 -07:00
Ross Wightman 02f9d4bc34 Add weights for resnet51q model, add 61q def. 2021-05-28 09:53:16 -07:00
Ross Wightman f615474be3 Fix broken test, repvgg block doesn't have attn_last attr. 2021-05-27 18:12:22 -07:00
Ross Wightman 742c2d5247 Add Gather-Excite and Global Context attn modules. Refactor existing SE-like attn for consistency and refactor byob/byoanet for less redundancy. 2021-05-27 18:03:29 -07:00
Ross Wightman 9c78de8c02 Fix #661, move hardswish out of default args for LeViT. Enable native torch support for hardswish, hardsigmoid, mish if present. 2021-05-26 15:28:42 -07:00
Ross Wightman 5db7452173 Fix visformer in_chans stem handling 2021-05-25 14:11:36 -07:00
Ross Wightman 318360c3f9 Update README.md before merge. Bump version to 0.4.10 2021-05-25 12:26:16 -07:00
Ross Wightman 11ae795e99 Redo LeViT attention bias caching in a way that works with both torchscript and DataParallel 2021-05-25 10:15:32 -07:00
Ross Wightman d400f1dbdd Filter test models before creation for backward/torchscript tests 2021-05-25 10:14:45 -07:00
Ross Wightman c4572cc5aa Add Visformer-small weights, tweak torchscript jit test img size. 2021-05-24 22:50:12 -07:00
Ross Wightman bfc72f75d3 Expand scope of testing for non-std vision transformer / mlp models. Some related cleanup and create fn cleanup for all vision transformer and mlp models. More CoaT weights. 2021-05-24 21:13:26 -07:00
Ross Wightman 18bf520ad1 Add eca_nfnet_l2/l3 defs for future training 2021-05-22 21:55:37 -07:00
Ross Wightman f45de37690 Merge branch 'master' into levit_visformer_rednet 2021-05-22 16:34:31 -07:00
Ross Wightman 23c18a33e4 Add efficientnetv2_rw_m weights trained in PyTorch. 84.8 top-1 @ 416 test. 53M params. 2021-05-21 21:16:25 -07:00
Ross Wightman c2ba229d99 Prep for efficientnetv2_rw_m model weights that started training before official release. 2021-05-21 17:47:49 -07:00
Ross Wightman 30b9880d06 Minor adjustment, mutable default arg, extra check of valid len... 2021-05-21 17:20:51 -07:00
Ross Wightman be0abfbcce Merge branch 'master' of https://github.com/alexander-soare/pytorch-image-models into alexander-soare-master 2021-05-21 17:10:11 -07:00
Ross Wightman b7de82e835 ConViT cleanup, fix torchscript, bit of reformatting, reuse existing layers. 2021-05-21 17:04:23 -07:00
Ross Wightman 306c86b668 Merge branch 'convit' of https://github.com/amaarora/pytorch-image-models into amaarora-convit 2021-05-21 16:27:10 -07:00
Ross Wightman a569635045 Update twin weights to a copy in GitHub releases for faster dl. Tweak model class comment. 2021-05-21 16:23:14 -07:00
Ross Wightman be99eef9c1 Remove redundant code, cleanup, fix torchscript. 2021-05-20 23:38:35 -07:00
Ross Wightman 5ab372a3ec Merge branch 'master' of https://github.com/abcdvzz/pytorch-image-models into abcdvzz-master 2021-05-20 23:37:50 -07:00
Aman Arora 5db1eb6ba5 Add defaults 2021-05-21 02:11:20 +00:00
Aman Arora 8b1f2e8e1f remove unused matplotlib import 2021-05-20 23:42:42 +00:00
Aman Arora 40c506ba1e Add ConViT 2021-05-20 23:17:28 +00:00
Alexander Soare 7976019864 extend positional embedding resizing functionality to tnt 2021-05-20 11:55:48 +01:00
Alexander Soare 8086943b6f allow resize positional embeddings to non-square grid 2021-05-20 11:27:58 +01:00
talrid dc1a4efd28 mixer_b16_224_miil, mixer_b16_224_miil_in21k models 2021-05-20 10:35:50 +03:00
李鑫杰 7b799c4e79 add latest code 2021-05-20 11:15:49 +08:00
Ross Wightman d5af752117 Add preliminary gMLP and ResMLP impl to Mlp-Mixer 2021-05-19 09:55:05 -07:00