fix: fix @ -> matmul
parent 5f3d50f7d4
commit 2f666cdfc9
@@ -3,9 +3,9 @@
 ## Overview
 The Twins network includes Twins-PCPVT and Twins-SVT, both of which focus on a careful design of the spatial attention mechanism, yielding a simple yet more effective solution. Since the architecture involves only matrix multiplication, for which current deep learning frameworks are highly optimized, it is efficient and easy to implement. Moreover, it achieves excellent performance on a variety of downstream vision tasks such as image classification, object detection, and semantic segmentation. [Paper](https://arxiv.org/abs/2104.13840).

-## Accuracy, FLOPS and Parameters
+## Accuracy, FLOPs and Parameters

-| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPS<br>(G) | Params<br>(M) |
+| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPs<br>(G) | Params<br>(M) |
 |:--:|:--:|:--:|:--:|:--:|:--:|:--:|
 | pcpvt_small | 0.8082 | 0.9552 | 0.812 | - | 3.7 | 24.1 |
 | pcpvt_base | 0.8242 | 0.9619 | 0.827 | - | 6.4 | 43.8 |
@@ -3,9 +3,9 @@
 ## Overview
 The Twins network includes Twins-PCPVT and Twins-SVT, which focus on a careful design of the spatial attention mechanism and arrive at a simple yet more effective solution. Since the architecture involves only matrix multiplication, and current deep learning frameworks are highly optimized for matrix multiplication, it is efficient and easy to implement. Moreover, it achieves excellent performance on a variety of downstream vision tasks such as image classification, object detection, and semantic segmentation. [Paper](https://arxiv.org/abs/2104.13840).

-## Accuracy, FLOPS and Parameters
+## Accuracy, FLOPs and Parameters

-| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPS<br>(G) | Params<br>(M) |
+| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPs<br>(G) | Params<br>(M) |
 |:--:|:--:|:--:|:--:|:--:|:--:|:--:|
 | pcpvt_small | 0.8082 | 0.9552 | 0.812 | - | 3.7 | 24.1 |
 | pcpvt_base | 0.8242 | 0.9619 | 0.827 | - | 6.4 | 43.8 |
@@ -82,11 +82,11 @@ class GroupAttention(nn.Layer):
             B, total_groups, self.ws**2, 3, self.num_heads, C // self.num_heads
         ]).transpose([3, 0, 1, 4, 2, 5])
         q, k, v = qkv[0], qkv[1], qkv[2]
-        attn = (q @ k.transpose([0, 1, 2, 4, 3])) * self.scale
+        attn = paddle.matmul(q, k.transpose([0, 1, 2, 4, 3])) * self.scale

         attn = nn.Softmax(axis=-1)(attn)
         attn = self.attn_drop(attn)
-        attn = (attn @ v).transpose([0, 1, 3, 2, 4]).reshape(
+        attn = paddle.matmul(attn, v).transpose([0, 1, 3, 2, 4]).reshape(
             [B, h_group, w_group, self.ws, self.ws, C])

         x = attn.transpose([0, 1, 3, 2, 4, 5]).reshape([B, N, C])
@@ -147,11 +147,11 @@ class Attention(nn.Layer):
             [2, 0, 3, 1, 4])
         k, v = kv[0], kv[1]

-        attn = (q @ k.transpose([0, 1, 3, 2])) * self.scale
+        attn = paddle.matmul(q, k.transpose([0, 1, 3, 2])) * self.scale
         attn = nn.Softmax(axis=-1)(attn)
         attn = self.attn_drop(attn)

-        x = (attn @ v).transpose([0, 2, 1, 3]).reshape([B, N, C])
+        x = paddle.matmul(attn, v).transpose([0, 2, 1, 3]).reshape([B, N, C])
         x = self.proj(x)
         x = self.proj_drop(x)
         return x
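Both hunks make the same change: the Python `@` operator is replaced by an explicit `paddle.matmul` call. Below is a minimal sanity-check sketch (not part of the commit); the tensor shapes and the `scale` value are illustrative assumptions, and it only confirms that the two spellings of the batched matrix product, and hence the attention output, agree numerically.

# Equivalence check for the @ -> paddle.matmul change (a sketch, not repo code).
# Shapes are assumed toy values; the attention classes above use analogous shapes.
import paddle
import paddle.nn as nn

B, num_heads, N, head_dim = 2, 4, 16, 32   # assumed toy sizes
scale = head_dim ** -0.5                   # assumed scaled dot-product attention scale

q = paddle.randn([B, num_heads, N, head_dim])
k = paddle.randn([B, num_heads, N, head_dim])
v = paddle.randn([B, num_heads, N, head_dim])

# Old spelling: Python matmul operator.
attn_op = (q @ k.transpose([0, 1, 3, 2])) * scale
out_op = nn.Softmax(axis=-1)(attn_op) @ v

# New spelling: explicit paddle.matmul, as in the diff above.
attn_fn = paddle.matmul(q, k.transpose([0, 1, 3, 2])) * scale
out_fn = paddle.matmul(nn.Softmax(axis=-1)(attn_fn), v)

# The two forms compute the same batched matrix product.
assert paddle.allclose(out_op, out_fn).item()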