[Fix] Remove the inplace operation in uper_head and fpn_neck (#1103)

* [Fix] Remove the inplace operation in uper_head

* remove the inplace operation in fpn neck

* fix conflict

* increase the coverage
Rockey 2021-12-09 12:12:31 +08:00 committed by GitHub
parent e06cbcd8b1
commit 0ad0303ebc
4 changed files with 15 additions and 6 deletions
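
The core change swaps the in-place update `laterals[i - 1] += resize(...)` for an out-of-place `laterals[i - 1] = laterals[i - 1] + resize(...)` in both UPerHead and the FPN neck. In PyTorch, writing into a tensor that autograd has saved for the backward pass can make `backward()` fail with a version-counter RuntimeError. A minimal sketch of that failure mode, purely as an illustration (it uses `torch.exp`, whose backward reuses its own output, rather than the actual call path in this repository):

    import torch

    x = torch.randn(3, requires_grad=True)
    y = torch.exp(x)    # autograd saves the output y for exp's backward
    y += 1              # in-place write bumps y's version counter
    y.sum().backward()  # RuntimeError: a tensor needed for gradient
                        # computation was modified by an inplace operation

Rewriting the update as `y = y + 1` allocates a fresh tensor and leaves the saved output untouched, which is the pattern applied to the lateral feature maps below.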


@@ -22,7 +22,6 @@ class MixFFN(BaseModule):
     The differences between MixFFN & FFN:
         1. Use 1X1 Conv to replace Linear layer.
         2. Introduce 3X3 Conv to encode positional information.
     Args:
         embed_dims (int): The feature dimension. Same as
             `MultiheadAttention`. Defaults: 256.
@@ -94,7 +93,6 @@ class EfficientMultiheadAttention(MultiheadAttention):
     This module is modified from MultiheadAttention which is a module from
     mmcv.cnn.bricks.transformer.
     Args:
         embed_dims (int): The embedding dimension.
         num_heads (int): Parallel attention heads.
@@ -291,7 +289,6 @@ class MixVisionTransformer(BaseModule):
     This backbone is the implementation of `SegFormer: Simple and
     Efficient Design for Semantic Segmentation with
     Transformers <https://arxiv.org/abs/2105.15203>`_.
     Args:
         in_channels (int): Number of input channels. Default: 3.
         embed_dims (int): Embedding dimension. Default: 768.


@@ -101,7 +101,7 @@ class UPerHead(BaseDecodeHead):
         used_backbone_levels = len(laterals)
         for i in range(used_backbone_levels - 1, 0, -1):
             prev_shape = laterals[i - 1].shape[2:]
-            laterals[i - 1] += resize(
+            laterals[i - 1] = laterals[i - 1] + resize(
                 laterals[i],
                 size=prev_shape,
                 mode='bilinear',
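
This hunk is the top-down merge in UPerHead: each lateral map receives the bilinearly upsampled map from the level above it. A self-contained sketch of the same pattern, assuming `laterals` is a list of NCHW tensors ordered from fine to coarse and using `torch.nn.functional.interpolate` in place of mmseg's `resize` wrapper (the helper name `merge_laterals_top_down` is made up for the illustration):

    import torch
    import torch.nn.functional as F

    def merge_laterals_top_down(laterals, align_corners=False):
        # Walk from the coarsest level down, adding each upsampled coarser map
        # into the finer one below it without modifying any tensor in place.
        for i in range(len(laterals) - 1, 0, -1):
            prev_shape = laterals[i - 1].shape[2:]
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i],
                size=prev_shape,
                mode='bilinear',
                align_corners=align_corners)
        return laterals

    feats = [torch.randn(1, 64, s, s) for s in (32, 16, 8, 4)]
    merged = merge_laterals_top_down(feats)
    assert merged[0].shape == (1, 64, 32, 32)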


@@ -175,10 +175,11 @@ class FPN(BaseModule):
             # In some cases, fixing `scale factor` (e.g. 2) is preferred, but
             # it cannot co-exist with `size` in `F.interpolate`.
             if 'scale_factor' in self.upsample_cfg:
-                laterals[i - 1] += resize(laterals[i], **self.upsample_cfg)
+                laterals[i - 1] = laterals[i - 1] + resize(
+                    laterals[i], **self.upsample_cfg)
             else:
                 prev_shape = laterals[i - 1].shape[2:]
-                laterals[i - 1] += resize(
+                laterals[i - 1] = laterals[i - 1] + resize(
                     laterals[i], size=prev_shape, **self.upsample_cfg)

         # build outputs
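
The comment kept in this hunk points at a real constraint: `F.interpolate` accepts either `size` or `scale_factor`, never both, which is why the neck keeps two branches. A small illustration (shapes chosen to match the test below):

    import torch
    import torch.nn.functional as F

    x = torch.randn(1, 64, 14, 14)
    by_factor = F.interpolate(x, scale_factor=2.0, mode='nearest')  # -> 28x28
    by_size = F.interpolate(x, size=(28, 28), mode='nearest')       # -> 28x28
    assert by_factor.shape == by_size.shape
    # Passing both at once raises a ValueError:
    # F.interpolate(x, size=(28, 28), scale_factor=2.0, mode='nearest')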


@@ -17,3 +17,14 @@ def test_fpn():
     assert outputs[1].shape == torch.Size([1, 64, 28, 28])
     assert outputs[2].shape == torch.Size([1, 64, 14, 14])
     assert outputs[3].shape == torch.Size([1, 64, 7, 7])
+
+    fpn = FPN(
+        in_channels,
+        64,
+        len(in_channels),
+        upsample_cfg=dict(mode='nearest', scale_factor=2.0))
+    outputs = fpn(inputs)
+    assert outputs[0].shape == torch.Size([1, 64, 56, 56])
+    assert outputs[1].shape == torch.Size([1, 64, 28, 28])
+    assert outputs[2].shape == torch.Size([1, 64, 14, 14])
+    assert outputs[3].shape == torch.Size([1, 64, 7, 7])
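
The added assertions exercise the `scale_factor` branch touched above; the earlier part of `test_fpn` (input construction and the first FPN instance) is outside this hunk. A standalone approximation of the new coverage, assuming the import path `mmseg.models.necks.FPN`; the input channel counts are placeholders, since the original test's setup lines are not shown here:

    import torch
    from mmseg.models.necks import FPN

    def test_fpn_scale_factor():
        in_channels = [64, 128, 256, 512]  # placeholder channel counts
        inputs = [
            torch.randn(1, c, s, s)
            for c, s in zip(in_channels, [56, 28, 14, 7])
        ]
        fpn = FPN(
            in_channels,
            64,
            len(in_channels),
            upsample_cfg=dict(mode='nearest', scale_factor=2.0))
        outputs = fpn(inputs)
        for out, s in zip(outputs, [56, 28, 14, 7]):
            assert out.shape == torch.Size([1, 64, s, s])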