PaddleClas/docs/zh_CN/models/SwinTransformer.md

31 lines
1.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

# SwinTransformer
---
## 目录
* [1. 概述](#1)
* [2. 精度、FLOPS 和参数量](#2)
<a name='1'></a>
## 1. 概述
Swin Transformer 是一种新的视觉 Transformer 网络可以用作计算机视觉领域的通用骨干网路。SwinTransformer 由移动窗口shifted windows表示的层次 Transformer 结构组成。移动窗口将自注意计算限制在非重叠的局部窗口上,同时允许跨窗口连接,从而提高了网络性能。[论文地址](https://arxiv.org/abs/2103.14030)。
<a name='2'></a>
## 2. 精度、FLOPS 和参数量
| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPS<br>(G) | Params<br>(M) |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| SwinTransformer_tiny_patch4_window7_224 | 0.8069 | 0.9534 | 0.812 | 0.955 | 4.5 | 28 |
| SwinTransformer_small_patch4_window7_224 | 0.8275 | 0.9613 | 0.832 | 0.962 | 8.7 | 50 |
| SwinTransformer_base_patch4_window7_224 | 0.8300 | 0.9626 | 0.835 | 0.965 | 15.4 | 88 |
| SwinTransformer_base_patch4_window12_384 | 0.8439 | 0.9693 | 0.845 | 0.970 | 47.1 | 88 |
| SwinTransformer_base_patch4_window7_224<sup>[1]</sup> | 0.8487 | 0.9746 | 0.852 | 0.975 | 15.4 | 88 |
| SwinTransformer_base_patch4_window12_384<sup>[1]</sup> | 0.8642 | 0.9807 | 0.864 | 0.980 | 47.1 | 88 |
| SwinTransformer_large_patch4_window7_224<sup>[1]</sup> | 0.8596 | 0.9783 | 0.863 | 0.979 | 34.5 | 197 |
| SwinTransformer_large_patch4_window12_384<sup>[1]</sup> | 0.8719 | 0.9823 | 0.873 | 0.982 | 103.9 | 197 |
[1]:基于 ImageNet22k 数据集预训练,然后在 ImageNet1k 数据集迁移学习得到。
**注**:与 Reference 的精度差异源于数据预处理不同。