Commit Graph

681 Commits (135a48d02454ed987223727e6bca5cb8c1223caa)

Author SHA1 Message Date
Ross Wightman f1808e0970 Post crossvit merge cleanup, change model names to reflect input size, cleanup img size vs scale handling, fix tests 2021-09-13 11:49:54 -07:00
Ross Wightman 4027412757 Add resnet33ts weights, update resnext26ts baseline weights 2021-09-09 14:46:41 -07:00
Richard Chen 9fe5798bee fix bug for reset classifier and fix for validating the dimension 2021-09-08 21:58:17 -04:00
Richard Chen 3718c5a5bd fix loading pretrained model 2021-09-08 11:53:05 -04:00
Richard Chen bb50b69a57 fix for torch script 2021-09-08 11:20:59 -04:00
Ross Wightman 5bd04714e4 Cleanup weight init for byob/byoanet and related 2021-09-05 15:34:05 -07:00
Ross Wightman 8642401e88 Swap botnet 26/50 weights/models after realizing a mistake in arch def, now figuring out why they were so low... 2021-09-05 15:17:19 -07:00
Ross Wightman 5f12de4875 Add initial AttentionPool2d that's being trialed. Fix comment and still trying to improve reliability of sgd test. 2021-09-05 12:41:14 -07:00
Ross Wightman 76881d207b Add baseline resnet26t @ 256x256 weights. Add 33ts variant of halonet with at least one halo in stage 2,3,4 2021-09-04 14:52:54 -07:00
Ross Wightman 484e61648d Adding the attn series weights, tweaking model names, comments... 2021-09-03 18:09:42 -07:00
Ross Wightman fb94350896 Update training script and loader factory to allow use of scheduler updates, repeat augment, and bce loss 2021-09-01 17:46:40 -07:00
Ross Wightman f262137ff2 Add RepeatAugSampler as per DeiT RASampler impl, showing promise for current (distributed) training experiments. 2021-09-01 17:40:53 -07:00
Ross Wightman ba9c1108a1 Add a BCE loss impl that converts dense targets to sparse /w smoothing as an alternate to CE w/ smoothing. For training experiments. 2021-09-01 17:39:28 -07:00
Ross Wightman 29a37e23ee LR scheduler update:
* add polynomial decay 'poly'
* cleanup cycle specific args for cosine, poly, and tanh sched, t_mul -> cycle_mul, decay -> cycle_decay, default cycle_limit to 1 in each opt
* add k-decay for cosine and poly sched as per https://arxiv.org/abs/2004.05909
* change default tanh ub/lb to push inflection to later epochs
2021-09-01 17:33:11 -07:00
Ross Wightman 492c0a4e20 Update HaloAttn comment 2021-09-01 17:14:31 -07:00
Richard Chen 7ab9d4555c add crossvit 2021-09-01 17:13:12 -04:00
Ross Wightman 3b9032ea48 Use Tensor.unfold().unfold() for HaloAttn, fast like as_strided but more clarity 2021-08-27 12:45:53 -07:00
Ross Wightman 78933122c9 Fix silly typo 2021-08-27 09:22:20 -07:00
Ross Wightman 2568ffc5ef Merge branch 'master' into attn_update 2021-08-27 09:21:22 -07:00
Ross Wightman 708d87a813 Fix ViT SAM weight compat as weights at URL changed to not use repr layer. Fix #825. Tweak optim test. 2021-08-27 09:20:13 -07:00
Ross Wightman 8449ba210c Improve performance of HaloAttn, change default dim calc. Some cleanup / fixes for byoanet. Rename resnet26ts to tfs to distinguish (extra fc). 2021-08-26 21:56:44 -07:00
Ross Wightman a8b65695f1 Add resnet26ts and resnext26ts models for non-attn baselines 2021-08-21 12:42:10 -07:00
Ross Wightman a5a542f17d Fix typo 2021-08-20 17:47:23 -07:00
Ross Wightman 925e102982 Update attention / self-attn based models from a series of experiments:
* remove dud attention, involution + my swin attention adaptation don't seem worth keeping
* add or update several new 26/50 layer ResNe(X)t variants that were used in experiments
* remove models associated with dead-end or uninteresting experiment results
* weights coming soon...
2021-08-20 16:13:11 -07:00
Ross Wightman d667351eac Tweak accuracy topk safety. Fix #807 2021-08-19 14:18:53 -07:00
Yohann Lereclus 35c9740826 Fix accuracy when topk > num_classes 2021-08-19 11:58:59 +02:00
Ross Wightman a16a753852 Add lamb/lars to optim init imports, remove stray comment 2021-08-18 22:55:02 -07:00
Ross Wightman c207e02782 MOAR optimizer changes. Woo! 2021-08-18 22:20:35 -07:00
Ross Wightman a426511c95 More optimizer cleanup. Change all to no longer use .data. Improve (b)float16 use with adabelief. Add XLA compatible Lars. 2021-08-18 17:21:56 -07:00
Ross Wightman 9541f4963b One more scalar -> tensor fix for lamb optimizer 2021-08-18 11:20:25 -07:00
Ross Wightman 8f68193c91
Update lamp.py comment 2021-08-18 09:27:40 -07:00
Ross Wightman 4d284017b8
Merge pull request #813 from rwightman/opt_cleanup
Optimizer cleanup and additions
2021-08-18 09:12:00 -07:00
Ross Wightman a6af48be64 add madgradw optimizer 2021-08-17 22:19:27 -07:00
Ross Wightman 55fb5eedf6 Remove experiment from lamb impl 2021-08-17 21:48:26 -07:00
Ross Wightman 8a9eca5157 A few optimizer comments, dead import, missing import 2021-08-17 18:01:33 -07:00
Ross Wightman ac469b50da Optimizer improvements, additions, cleanup
* Add MADGRAD code
* Fix Lamb (non-fused variant) to work w/ PyTorch XLA
* Tweak optimizer factory args (lr/learning_rate and opt/optimizer_name), may break compat
* Use newer fn signatures for all add,addcdiv, addcmul in optimizers
* Use upcoming PyTorch native Nadam if it's available
* Cleanup lookahead opt
* Add optimizer tests
* Remove novograd.py impl as it was messy, keep nvnovograd
* Make AdamP/SGDP work in channels_last layout
* Add rectified adablief mode (radabelief)
* Support a few more PyTorch optim, adamax, adagrad
2021-08-17 17:51:20 -07:00
Sepehr Sameni abf3e044bb
Update scheduler_factory.py
remove duplicate code from create_scheduler()
2021-08-14 22:53:17 +02:00
Ross Wightman 3cdaf5ed56 Add `mmax` config key to auto_augment for increasing upper bound of RandAugment magnitude beyond 10. Make AugMix uniform sampling default not override config setting. 2021-08-12 15:39:05 -07:00
Ross Wightman 1042b8a146 Add non fused LAMB optimizer option 2021-08-09 13:13:43 -07:00
Ross Wightman 01cb46a9a5 Add gc_efficientnetv2_rw_t weights (global context instead of SE attn). Add TF XL weights even though the fine-tuned ones don't validate that well. Change default arg for GlobalContext to use scal (mul) mode. 2021-08-07 16:45:29 -07:00
Ross Wightman d3f7440650 Add EfficientNetV2 XL model defs 2021-07-22 13:15:24 -07:00
Ross Wightman 72b227dcf5
Merge pull request #750 from drjinying/master
Specify "interpolation" mode in vision_transformer's resize_pos_embed
2021-07-13 11:01:20 -07:00
Ross Wightman 2907c1f967
Merge pull request #746 from samarth4149/master
Adding a Multi Step LR Scheduler
2021-07-13 10:55:54 -07:00
Ross Wightman 748ab852ca Allow act_layer switch for xcit, fix in_chans for some variants 2021-07-12 13:27:29 -07:00
Ying Jin 20b2d4b69d Use bicubic interpolation in resize_pos_embed() 2021-07-12 10:38:31 -07:00
Ross Wightman d3255adf8e Merge branch 'xcit' of https://github.com/alexander-soare/pytorch-image-models into alexander-soare-xcit 2021-07-12 08:30:30 -07:00
Ross Wightman f8039c7492 Fix gc effv2 model cfg name 2021-07-11 12:14:31 -07:00
Alexander Soare 3a55a30ed1 add notes from author 2021-07-11 14:25:58 +01:00
Alexander Soare 899cf84ccc bug fix - missing _dist postfix for many of the 224_dist models 2021-07-11 12:41:51 +01:00
Alexander Soare 623e8b8eb8 wip xcit 2021-07-11 09:39:38 +01:00
Ross Wightman 392368e210 Add efficientnetv2_rw_t defs w/ weights, and gc variant, as well as gcresnet26ts for experiments. Version 0.4.13 2021-07-09 16:46:52 -07:00
samarth daab57a6d9 1. Added a simple multi step LR scheduler 2021-07-09 16:18:27 -04:00
Ross Wightman 6d8272e92c Add SAM pretrained model defs/weights for ViT B16 and B32 models. 2021-07-08 11:51:12 -07:00
Ross Wightman ee4d8fc69a Remove unecessary line from nest post refactor 2021-07-05 21:22:46 -07:00
Ross Wightman 8165cacd82 Realized LayerNorm2d won't work in all cases as is, fixed. 2021-07-05 18:21:34 -07:00
Ross Wightman 81cd6863c8 Move aggregation (convpool) for nest into NestLevel, cleanup and enable features_only use. Finalize weight url. 2021-07-05 18:20:49 -07:00
Ross Wightman 6ae0ac6420 Merge branch 'nested_transformer' of https://github.com/alexander-soare/pytorch-image-models into alexander-soare-nested_transformer 2021-07-03 12:45:26 -07:00
Alexander Soare 7b8a0017f1 wip to review 2021-07-03 12:10:12 +01:00
Alexander Soare b11d949a06 wip checkpoint with some feature extraction work 2021-07-03 11:45:19 +01:00
Alexander Soare 23bb72ce5e nested_transformer wip 2021-07-02 20:12:29 +01:00
Ross Wightman 766b4d3262 Fix features for resnetv2_50t 2021-06-28 15:56:24 -07:00
Ross Wightman e8045e712f Fix BatchNorm for ResNetV2 non GN models, add more ResNetV2 model defs for future experimentation, fix zero_init of last residual for pre-act. 2021-06-28 10:52:45 -07:00
Ross Wightman 20a2be14c3 Add gMLP-S weights, 79.6 top-1 2021-06-23 10:40:30 -07:00
Ross Wightman 85f894e03d Fix ViT in21k representation (pre_logits) layer handling across old and new npz checkpoints 2021-06-23 10:38:34 -07:00
Ross Wightman b41cffaa93 Fix a few issues loading pretrained vit/bit npz weights w/ num_classes=0 __init__ arg. Missed a few other small classifier handling detail on Mlp, GhostNet, Levit. Should fix #713 2021-06-22 23:16:05 -07:00
Ross Wightman 9c9755a808 AugReg release 2021-06-20 17:46:06 -07:00
Ross Wightman 381b279785 Add hybrid model fwds back 2021-06-19 22:28:44 -07:00
Ross Wightman 26f04a8e3e Fix a weight link 2021-06-19 16:39:36 -07:00
Ross Wightman 8f4a0222ed Add GMixer-24 MLP model weights, trained w/ TPU + PyTorch XLA 2021-06-18 16:49:28 -07:00
Ross Wightman 4c09a2f169 Bump version 0.4.12 2021-06-18 16:17:34 -07:00
Ross Wightman b319eb5b5d Update ViT weights, more details to be added before merge. 2021-06-18 16:16:49 -07:00
Ross Wightman 8257b86550 Fix up resnetv2 bit/bitm model default res 2021-06-18 16:16:06 -07:00
Ross Wightman 1228f5a3d8 Add BiT distilled 50x1 and teacher 152x2 models from 'A good teacher is patient and consistent' paper. 2021-06-18 11:40:33 -07:00
Ross Wightman 511a8e8c96 Add official ResMLP weights. 2021-06-14 17:03:16 -07:00
Ross Wightman b9cfb64412 Support npz custom load for vision transformer hybrid models. Add posembed rescale for npz load. 2021-06-14 12:31:44 -07:00
Ross Wightman 8319e0c373 Add file docstring to std_conv.py 2021-06-13 12:31:06 -07:00
Ross Wightman 4d96165989 Merge branch 'master' into cleanup_xla_model_fixes 2021-06-12 23:19:25 -07:00
Ross Wightman 8880f696b6 Refactoring, cleanup, improved test coverage.
* Add eca_nfnet_l2 weights, 84.7 @ 384x384
* All 'non-std' (ie transformer / mlp) models have classifier / default_cfg test added
* Fix #694 reset_classifer / num_features / forward_features / num_classes=0 consistency for transformer / mlp models
* Add direct loading of npz to vision transformer (pure transformer so far, hybrid to come)
* Rename vit_deit* to deit_*
* Remove some deprecated vit hybrid model defs
* Clean up classifier flatten for conv classifiers and unusual cases (mobilenetv3/ghostnet)
* Remove explicit model fns for levit conv, just pass in arg
2021-06-12 16:40:02 -07:00
Ross Wightman ba2ca4b464 One codepath for stdconv, switch layernorm to batchnorm so gain included. Tweak epsilon values for nfnet, resnetv2, vit hybrid. 2021-06-12 12:27:43 -07:00
Ross Wightman b7a568f065 Fix torchscript issue in bat 2021-06-08 23:19:51 -07:00
Ross Wightman d17b374f0f Minimum input_size needed to be higher 2021-06-08 21:31:39 -07:00
Ross Wightman b3b90d944d Add min_input_size to bat_resnext to prevent test breakage. 2021-06-08 17:32:08 -07:00
Ross Wightman d413eef1bf Add ResMLP-24 model weights that I trained in PyTorch XLA on TPU-VM. 79.2 top-1. 2021-06-08 14:22:05 -07:00
Ross Wightman 10d8fa4620 Add gc and bat attention resnext26ts variants to byob for test. 2021-06-08 14:21:07 -07:00
Ross Wightman 2f5ed2dec1 Update `init_values` const for 24 and 36 layer ResMLP models 2021-06-07 17:15:04 -07:00
Ross Wightman 8e4ac3549f All ScaledStdConv and StdConv uses default to using F.layernorm so that they work with PyTorch XLA. eps value tweaking is a WIP. 2021-06-07 17:14:19 -07:00
Ross Wightman 2a63d0246b Post merge cleanup 2021-06-07 14:38:30 -07:00
Ross Wightman 45dec179e5
Merge pull request #681 from lmk123568/master
Update convit.py
2021-06-07 14:10:53 -07:00
Dongyoon Han ded1671483 Fix stochastic depth working only with a shortcut 2021-06-07 23:08:55 +09:00
Mike b87d98b238
Update convit.py
Cut out the duplicates
2021-06-06 17:58:31 +08:00
Ross Wightman 02320c3e3d Bump version to 0.4.11 2021-05-31 15:41:51 -07:00
Ross Wightman bda8ab015a Remove min channels for SelectiveKernel, divisor should cover cases well enough. 2021-05-31 15:38:56 -07:00
Ross Wightman a27f4aec4a Missed args for skresnext w/ refactoring. 2021-05-31 14:06:34 -07:00
Ross Wightman 307a935b79 Add non-local and BAT attention. Merge attn and self-attn factories into one. Add attention references to README. Add mlp 'mode' to ECA. 2021-05-31 13:18:11 -07:00
Ross Wightman 8bf63b6c6c Able to use other attn layer in EfficientNet now. Create test ECA + GC B0 configs. Make ECA more configurable. 2021-05-30 12:47:02 -07:00
Ross Wightman bcec14d3b5 Bring EfficientNet SE layer in line with others, pull se_ratio outside of blocks. Allows swapping w/ other attn layers. 2021-05-29 23:41:38 -07:00
Ross Wightman 9611458e19 Throw in some FBNetV3 code I had lying around, some refactoring of SE reduction channel calcs for all EffNet archs. 2021-05-28 20:47:24 -07:00
Ross Wightman 01b9108619 Merge branch 'master' into more_attn 2021-05-28 11:09:37 -07:00
Ross Wightman d7bab8a6c5 Fix strict flag change for checkpoint load. 2021-05-28 09:54:50 -07:00
Ross Wightman 02f9d4bc34 Add weights for resnet51q model, add 61q def. 2021-05-28 09:53:16 -07:00