Commit Graph

13 Commits (27c42f0830afab4b2ff40b948cf612328ed26680)

Author SHA1 Message Date
Ross Wightman 5f47518f27 Fix pit implementation to be closer to deit/levit re distillation head handling 2022-03-21 11:12:14 -07:00
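The DeiT/LeViT convention this commit moves PiT towards: two classifier heads (one fed by the class token, one by the distillation token) whose logits are returned separately during training so the distillation loss can supervise its own head, and averaged at inference. A minimal sketch of that pattern, not the exact timm code (DistilledHead is an illustrative name):

```python
import torch
import torch.nn as nn

class DistilledHead(nn.Module):
    def __init__(self, embed_dim: int, num_classes: int):
        super().__init__()
        self.head = nn.Linear(embed_dim, num_classes)       # fed by the class token
        self.head_dist = nn.Linear(embed_dim, num_classes)  # fed by the distillation token

    def forward(self, x_cls: torch.Tensor, x_dist: torch.Tensor):
        x, x_d = self.head(x_cls), self.head_dist(x_dist)
        if self.training:
            # return both so the distillation loss can target head_dist separately
            return x, x_d
        # at inference the two predictions are averaged
        return (x + x_d) / 2

head = DistilledHead(192, 1000).eval()
logits = head(torch.randn(2, 192), torch.randn(2, 192))  # (2, 1000)
```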
Ross Wightman 0862e6ebae Fix correctness of some group matching regexes (no impact on results), some formatting, and a missed forward_head for resnet 2022-03-19 14:58:54 -07:00
Ross Wightman 372ad5fa0d Significant model refactor and additions:
* All models updated with revised forward_features / forward_head interface (see the usage sketch after this entry)
* Vision transformer and MLP based models consistently output sequence from forward_features (pooling or token selection considered part of 'head')
* WIP param grouping interface to allow consistent grouping of parameters for layer-wise decay across all model types
* Add gradient checkpointing support to a significant % of models, especially popular architectures
* Formatting and interface consistency improvements across models
* Layer-wise LR decay impl part of optimizer factory w/ scale support in scheduler
* Poolformer and Volo architectures added
2022-02-28 13:56:23 -08:00
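As referenced in the first bullet above, the revised interface splits every model's forward into forward_features (the backbone output, an unpooled token sequence for ViT/MLP models) and forward_head (pooling / token selection plus the classifier). A usage sketch, assuming a timm version that includes this refactor:

```python
import timm
import torch

model = timm.create_model('vit_base_patch16_224', pretrained=False)
x = torch.randn(2, 3, 224, 224)

# ViT/MLP models now return the unpooled token sequence here...
feats = model.forward_features(x)                     # (2, 197, 768) for this model
# ...and pooling / token selection + classifier live in the head
logits = model.forward_head(feats)                    # (2, 1000)
pooled = model.forward_head(feats, pre_logits=True)   # (2, 768)

# gradient checkpointing toggle added in the same refactor
model.set_grad_checkpointing(True)
```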
Ross Wightman 5f81d4de23 Move DeiT to its own file, vit getting crowded. Working towards fixing #1029, making the pooling interface for transformers and MLP models closer to convnets. Still working through some details... 2022-01-26 22:53:57 -08:00
Ross Wightman 95cfc9b3e8 Merge remote-tracking branch 'origin/master' into norm_norm_norm 2022-01-25 22:20:45 -08:00
Ross Wightman abc9ba2544 Transitioning default_cfg -> pretrained_cfg. Improving handling of pretrained_cfg source (HF-Hub, files, timm config, etc). Checkpoint handling tweaks. 2022-01-25 21:54:13 -08:00
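A small sketch of what the transition looks like from the user side, assuming a timm build where pretrained_cfg has landed (to my knowledge the old default_cfg attribute name is kept as a backwards-compatible alias):

```python
import timm

model = timm.create_model('resnet50', pretrained=False)
print(model.pretrained_cfg['url'])  # pretrained weight source metadata, new name
print(model.default_cfg['url'])     # old name, kept for backwards compat
```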
Ross Wightman 656757d26b Fix MobileNetV2 head conv size for multiplier < 1.0. Add some missing modification copyrights, fix starting date of some old ones. 2022-01-14 16:28:27 -08:00
Ross Wightman b41cffaa93 Fix a few issues loading pretrained vit/bit npz weights w/ num_classes=0 __init__ arg. Also fix a few other small classifier handling details missed in Mlp, GhostNet, Levit. Should fix #713 2021-06-22 23:16:05 -07:00
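For context, num_classes=0 is timm's convention for building a model without a classifier, so the forward pass returns pooled features; this commit fixed npz weight loading under that setting. A quick illustration:

```python
import timm
import torch

model = timm.create_model('vit_base_patch16_224', pretrained=False, num_classes=0)
feats = model(torch.randn(2, 3, 224, 224))
print(feats.shape)  # torch.Size([2, 768]): pooled features, no classification head
```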
Ross Wightman 8880f696b6 Refactoring, cleanup, improved test coverage.
* Add eca_nfnet_l2 weights, 84.7 @ 384x384
* All 'non-std' (i.e. transformer / mlp) models have classifier / default_cfg tests added
* Fix #694 reset_classifier / num_features / forward_features / num_classes=0 consistency for transformer / mlp models (see the sketch after this entry)
* Add direct loading of npz to vision transformer (pure transformer so far, hybrid to come)
* Rename vit_deit* to deit_*
* Remove some deprecated vit hybrid model defs
* Clean up classifier flatten for conv classifiers and unusual cases (mobilenetv3/ghostnet)
* Remove explicit model fns for levit conv, just pass in arg
2021-06-12 16:40:02 -07:00
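The reset_classifier / num_features / num_classes=0 consistency mentioned in the list above boils down to: num_features always reports the feature width feeding the classifier, and resetting the classifier to 0 classes makes forward return those features. A sketch using one of the models renamed here from vit_deit_* to deit_*:

```python
import timm
import torch

model = timm.create_model('deit_base_patch16_224', pretrained=False)
print(model.num_features)    # 768, width of the features feeding the head

model.reset_classifier(0)    # drop the classification head in place
out = model(torch.randn(2, 3, 224, 224))
print(out.shape)             # torch.Size([2, 768]), matches num_features
```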
Ross Wightman 9c78de8c02 Fix #661, move hardswish out of default args for LeViT. Enable native torch support for hardswish, hardsigmoid, mish if present. 2021-05-26 15:28:42 -07:00
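The "native torch support if present" part refers to preferring PyTorch's built-in activation modules (nn.Hardswish and friends, added around PyTorch 1.6) over hand-rolled versions. An illustrative sketch of that selection pattern, not the exact timm code (HardSwishFallback is a hypothetical name):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HardSwishFallback(nn.Module):
    """Manual hard-swish, x * relu6(x + 3) / 6, for PyTorch without nn.Hardswish."""
    def forward(self, x):
        return x * F.relu6(x + 3.0) / 6.0

# use the native module when the installed PyTorch provides it
HardSwish = nn.Hardswish if hasattr(nn, 'Hardswish') else HardSwishFallback
print(HardSwish()(torch.linspace(-4, 4, 5)))
```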
Ross Wightman 11ae795e99 Redo LeViT attention bias caching in a way that works with both torchscript and DataParallel 2021-05-25 10:15:32 -07:00
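The caching problem here: LeViT gathers its learned attention biases into a per-resolution tensor, and a naive cache breaks under DataParallel (replicas on different devices would share one cached tensor) and under torchscript (script-unfriendly mutable dict state). A sketch of a per-device cache along the lines this commit describes, close to but not exactly the timm code (CachedAttentionBias is an illustrative name):

```python
import torch
import torch.nn as nn

class CachedAttentionBias(nn.Module):
    def __init__(self, num_heads: int, num_points: int, seq_len: int):
        super().__init__()
        # learned biases plus an index map from (q, k) pairs to bias entries
        self.attention_biases = nn.Parameter(torch.zeros(num_heads, num_points))
        self.register_buffer(
            'attention_bias_idxs',
            torch.randint(0, num_points, (seq_len, seq_len)))
        self.ab = {}  # eval-time cache of gathered biases, keyed per device

    def train(self, mode: bool = True):
        super().train(mode)
        if mode and self.ab:
            self.ab = {}  # biases still change during training: drop stale cache
        return self

    def get_attention_biases(self, device: torch.device) -> torch.Tensor:
        if self.training:
            # always re-gather while training; no dict state on this path
            return self.attention_biases[:, self.attention_bias_idxs]
        # DataParallel replicates the module across devices; keying the cache
        # by device string keeps each replica's cached tensor on its own GPU
        key = str(device)
        if key not in self.ab:
            self.ab[key] = self.attention_biases[:, self.attention_bias_idxs]
        return self.ab[key]

m = CachedAttentionBias(num_heads=4, num_points=49, seq_len=16).eval()
print(m.get_attention_biases(torch.device('cpu')).shape)  # (4, 16, 16)
```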
Ross Wightman bfc72f75d3 Expand scope of testing for non-std vision transformer / mlp models. Some related cleanup and create fn cleanup for all vision transformer and mlp models. More CoaT weights. 2021-05-24 21:13:26 -07:00
Ross Wightman ecc7552c5c Add levit, levit_c, and visformer model defs. Largely untested and not finished cleanup. 2021-05-14 17:16:34 -07:00