sennnnn c27ef91942 Adjust vision transformer backbone architectures (#524)
* Adjust vision transformer backbone architectures;

* Add DropPath, trunc_normal_ for VisionTransformer implementation;

* Add class token buring intermediate period and remove it during final period;

* Fix some parameters loss bug;

* * Store intermediate token features and impose no processes on them;

* Remove class token and reshape entire token feature from NLC to NCHW;

* Fix some doc error

* Add a arg for VisionTransformer backbone to control if input class token into transformer;

* Add stochastic depth decay rule for DropPath;

* * Fix output bug when input_cls_token=False;

* Add related unit test;

* * Add arg: out_indices to control model output;

* Add unit test for DropPath;

* Apply suggestions from code review

Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>
2021-04-30 10:37:47 -07:00

32 lines
1015 B
Python

"""Modified from https://github.com/rwightman/pytorch-image-
models/blob/master/timm/models/layers/drop.py."""
import torch
from torch import nn
class DropPath(nn.Module):
"""Drop paths (Stochastic Depth) per sample (when applied in main path of
residual blocks).
Args:
drop_prob (float): Drop rate for paths of model. Dropout rate has
to be between 0 and 1. Default: 0.
"""
def __init__(self, drop_prob=0.):
super(DropPath, self).__init__()
self.drop_prob = drop_prob
self.keep_prob = 1 - drop_prob
def forward(self, x):
if self.drop_prob == 0. or not self.training:
return x
shape = (x.shape[0], ) + (1, ) * (
x.ndim - 1) # work with diff dim tensors, not just 2D ConvNets
random_tensor = self.keep_prob + torch.rand(
shape, dtype=x.dtype, device=x.device)
random_tensor.floor_() # binarize
output = x.div(self.keep_prob) * random_tensor
return output