mirror of https://github.com/huggingface/pytorch-image-models.git
synced 2025-06-03 15:01:08 +08:00

Update README

This commit is contained in:
parent dbd33e4b62
commit 9e327c9932

README.md (15 lines changed)
@@ -24,6 +24,20 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
* ❗Updates after Oct 10, 2022 are available in 0.8.x pre-releases (`pip install --pre timm`) or cloning main❗
* Stable releases are 0.6.x and available by normal pip install or clone from [0.6.x](https://github.com/rwightman/pytorch-image-models/tree/0.6.x) branch.
### March 22, 2023
* More weights pushed to HF hub along with multi-weight support, including: `regnet.py`, `rexnet.py`, `byobnet.py`, `resnetv2.py`, `swin_transformer.py`, `swin_transformer_v2.py`, `swin_transformer_v2_cr.py`
* Swin Transformer models support feature extraction (NCHW feat maps for `swinv2_cr_*`, and NHWC for all others) and spatial embedding outputs.
* FocalNet (from https://github.com/microsoft/FocalNet) models and weights added with significant refactoring, feature extraction support, and no fixed resolution / sizing constraint
* RegNet weight coverage expanded with the HF hub push, adding SWAG, SEER, and torchvision v2 weights. SEER is pretty poor with respect to performance for model size, but possibly useful.
* More ImageNet-12k pretrained and 1k fine-tuned `timm` weights:
* `rexnetr_200.sw_in12k_ft_in1k` - 82.6 @ 224, 83.2 @ 288
* `rexnetr_300.sw_in12k_ft_in1k` - 84.0 @ 224, 84.5 @ 288
* `regnety_120.sw_in12k_ft_in1k` - 85.0 @ 224, 85.4 @ 288
* `regnety_160.lion_in12k_ft_in1k` - 85.6 @ 224, 86.0 @ 288
  * `regnety_160.sw_in12k_ft_in1k` - 85.6 @ 224, 86.0 @ 288 (matches the SWAG pretrain + 1k fine-tune result at much lower resolution, and far exceeds the SEER fine-tunes)
* Model name deprecation + remapping functionality added (a milestone for bringing 0.8.x out of pre-release). Mappings are being added...
* Minor bug fixes and improvements.
### Feb 26, 2023
* Add ConvNeXt-XXLarge CLIP pretrained image tower weights for fine-tune & features (fine-tuning TBD) -- see [model card](https://huggingface.co/laion/CLIP-convnext_xxlarge-laion2B-s34B-b82K-augreg-soup)
* Update `convnext_xxlarge` default LayerNorm eps to 1e-5 (for CLIP weights, improved stability)
@@ -478,6 +492,7 @@ All model architecture families include variants with pretrained weights. There
* TinyNet - https://arxiv.org/abs/2010.14819
* EVA - https://arxiv.org/abs/2211.07636
* FlexiViT - https://arxiv.org/abs/2212.08013
* FocalNet (Focal Modulation Networks) - https://arxiv.org/abs/2203.11926
* GCViT (Global Context Vision Transformer) - https://arxiv.org/abs/2206.09959
* GhostNet - https://arxiv.org/abs/1911.11907
* gMLP - https://arxiv.org/abs/2105.08050