Ross Wightman | d5375ca769 | Use torch F.rms_norm when possible, select fast vs normal paths appropriately and test with torchscript | 2024-12-30 19:24:21 -08:00
Ross Wightman | 15406a939e | Fixing RmsNorm to fix #2380 and noticed with aimv2 when comparing outputs. Still some work to do, need to look at AMP / fast mode behaviour, dispatch to torch when possible. Add SimpleNorm for 'LayerNorm w/o centering and bias' | 2024-12-30 19:24:21 -08:00
Ross Wightman | 5d7bd2973e | convnext zepto, rmsnorm experiments | 2024-09-30 11:43:23 -07:00
Ross Wightman | 2bfa5e5d74 | Remove JIT activations, take jit out of ME activations. Remove other instances of torch.jit.script. Breaks torch.compile and is much less performant. Remove SpaceToDepthModule | 2024-05-06 16:32:49 -07:00
Ross Wightman | f77c04ff36 | Torchscript fixes/hacks for rms_norm, refactor ParallelScalingBlock with manual combination of input projections, closer paper match | 2023-02-16 16:57:42 -08:00
Ross Wightman | 621e1b2182 | Add ideas from 'Scaling ViT to 22-B Params', testing PyTorch 2.0 fused F.scaled_dot_product_attention impl in vit, vit_relpos, maxxvit / coatnet. | 2023-02-16 16:57:42 -08:00
Ross Wightman | 927f031293 | Major module / path restructure, timm.models.layers -> timm.layers, add _ prefix to all non model modules in timm.models | 2022-12-06 15:00:06 -08:00