* [Refactor] Use mmcv bricks to refactor vit
* Follow the vit code structure from mmclassification
* Add MMCV install to CI system.
* Add to 'Install MMCV' CI item
* Add 'Install MMCV_CPU' and 'Install MMCV_GPU' CI items
* Fix & Add
1. Fix low code coverage of vit.py;
2. Remove HybridEmbed;
3. Fix doc string of VisionTransformer;
* Add helpers unit test.
* Add converter to convert vit pretrain weights from timm style to mmcls style.
* Clean up some redundant code and refactor init
1. Use timm style init_weights;
2. Remove to_xtuple and trunc_norm_;
* Add comments for VisionTransformer.init_weights()
* Add arg: pretrain_style to choose timm or mmcls vit pretrain weights.
* Adjust vision transformer backbone architectures;
* Add DropPath, trunc_normal_ for VisionTransformer implementation;
* Add class token during intermediate period and remove it during final period;
* Fix a bug that caused some parameters to be lost;
* Store intermediate token features without applying any further processing to them;
* Remove class token and reshape entire token feature from NLC to NCHW;
* Fix some doc errors
* Add an arg for VisionTransformer backbone to control whether the class token is input into the transformer;
* Add stochastic depth decay rule for DropPath;
* Fix output bug when input_cls_token=False;
* Add related unit test;
* Add arg: out_indices to control model output;
* Add unit test for DropPath;
* Apply suggestions from code review
Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>
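
Note on the "stochastic depth decay rule for DropPath" item above: a minimal sketch of one common form of the rule, assuming a linear ramp from 0 at the first block to the maximum rate at the last block. The function name stochastic_depth_rates and its arguments are hypothetical and are not taken from this change; the actual backbone may compute the rates differently.

```python
def stochastic_depth_rates(drop_path_rate, depth):
    """Return one DropPath rate per transformer block.

    Assumption: rates grow linearly from 0.0 (first block) to
    drop_path_rate (last block). Names here are illustrative only.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if not 0.0 <= drop_path_rate < 1.0:
        raise ValueError("drop_path_rate must be in [0, 1)")
    if depth == 1:
        # A single block gets the start of the ramp, i.e. no drop.
        return [0.0]
    step = drop_path_rate / (depth - 1)
    return [i * step for i in range(depth)]


if __name__ == "__main__":
    # Print the per-block rates for a 12-block backbone.
    for idx, rate in enumerate(stochastic_depth_rates(0.1, 12)):
        print(f"block {idx}: drop_path={rate:.4f}")
```

Under this assumption, earlier blocks are dropped less often than later ones, so shallow features stay stable while deeper blocks are regularized more strongly.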