178 Commits

Author SHA1 Message Date
Ross Wightman
ea9c9550b2 Fully move ViT hybrids to their own file, including embedding module. Remove some extra DeiT models that were for benchmarking only. 2021-04-01 14:17:38 -07:00
Ross Wightman
a5310a3451 Merge remote-tracking branch 'origin/benchmark-fixes-vit_hybrids' into pit_and_vit_update 2021-04-01 12:15:34 -07:00
Ross Wightman
7953e5d11a Fix pos_embed scaling for ViT and num_classes != 1000 for pretrained distilled deit and pit models. Fix #426 and fix #433 2021-03-31 23:11:28 -07:00
Ross Wightman
a760a4c3f4 Some ViT cleanup, merge distilled model with main, fixup torchscript support for distilled models 2021-03-31 18:21:02 -07:00
Ross Wightman
cf5fec5047 Cleanup experimental vit weight init a bit 2021-03-20 09:44:24 -07:00
Ross Wightman
cbcb76d72c Should have included Conv2d layers in original weight init. Let's see what the impact is... 2021-03-18 23:15:48 -07:00
Ross Wightman
4de57ccf01 Add weight init scheme that's closer to JAX impl 2021-03-18 15:35:22 -07:00
Ross Wightman
45c048ba13 A few minor fixes and a bit more cleanup on the huggingface hub integration. 2021-03-17 13:18:52 -07:00
Ross Wightman
d584e7f617 Support for huggingface hub via create_model and default_cfgs.
* improve consistency of model creation helper fns
* add comments to some of the model helpers
* support passing external default_cfgs so they can be sourced from hub
2021-03-16 22:48:26 -07:00
Ross Wightman
17cdee7354 Fix C&P patch_size error, and order of op patch_size arg resolution bug. Remove a test vit model. 2021-03-01 16:53:32 -08:00
Ross Wightman
0706d05d52 Benchmark models listed in txt file. Add more hybrid vit variants for testing 2021-02-28 16:00:33 -08:00
Ross Wightman
de97be9146 Spell out diff between my small and deit small vit models. 2021-02-23 16:22:55 -08:00
Ross Wightman
f0ffdf89b3 Add numerous experimental ViT Hybrid models w/ ResNetV2 base. Update the ViT naming for hybrids. Fix #426 for pretrained vit resizing. 2021-02-23 15:54:55 -08:00
Ross Wightman
5a8e1e643e Initial Normalizer-Free Reg/ResNet impl. A bit of related layer refactoring. 2021-01-27 22:06:57 -08:00
Ross Wightman
bb50ac4708 Add DeiT distilled weights and distilled model def. Remove some redundant ViT model args. 2021-01-25 11:05:23 -08:00
Ross Wightman
c16e965037 Add some ViT comments and fix a few minor issues. 2021-01-24 23:18:35 -08:00
Ross Wightman
55f7dfa9ea Refactor vision_transformer entrypoint fns, add pos embedding resize support for fine tuning, add some deit models for testing 2021-01-18 16:11:02 -08:00
Ross Wightman
855d6cc217 More dataset work including factories and a tensorflow datasets (TFDS) wrapper
* Add parser/dataset factory methods for more flexible dataset & parser creation
* Add dataset parser that wraps TFDS image classification datasets
* Tweak num_classes handling bug for 21k models
* Add initial deit models so they can be benchmarked in next csv results runs
2021-01-15 17:26:20 -08:00
Ross Wightman
ce69de70d3 Add 21k weight urls to vision_transformer. Cleanup feature_info for preact ResNetV2 (BiT) models 2020-12-28 16:59:15 -08:00
Ross Wightman
231d04e91a ResNetV2 pre-act and non-preact model, w/ BiT pretrained weights and support for ViT R50 model. Tweaks for in21k num_classes passing. More to do... tests failing. 2020-12-28 16:59:15 -08:00
Ross Wightman
b401952caf Add newly added vision transformer large/base 224x224 weights ported from JAX official repo 2020-10-29 17:31:01 -07:00
Ross Wightman
61200db0ab in_chans=1 working w/ pretrained weights for vision_transformer 2020-10-29 15:49:36 -07:00
Ross Wightman
f591e90b0d Make sure num_features attr is present in vit models as with others 2020-10-29 15:33:47 -07:00
Ross Wightman
f944242cb0 Fix #262, num_classes arg mixup. Make vision_transformers a bit closer to other models wrt get/reset classifier/forward_features. Fix torchscript for ViT. 2020-10-29 13:58:28 -07:00
Ross Wightman
736f209e7d Update vision transformers to be compatible with official code. Port official ViT weights from jax impl. 2020-10-26 18:42:11 -07:00
Ross Wightman
27a93e9de7 Improve test crop for ViT models. Small now 77.85, added base weights at 79.35 top-1. 2020-10-21 23:35:25 -07:00
Ross Wightman
d4db9e7977 Add small vision transformer weights. 77.42 top-1. 2020-10-21 12:14:12 -07:00
Ross Wightman
f31933cb37 Initial Vision Transformer impl w/ patch and hybrid variants. Refactor tuple helpers. 2020-10-13 13:33:44 -07:00