Supplementary algo tables (#157)
This commit is contained in:
parent a11f200ec3
commit a75d41f606
README.md (114 lines changed)
@@ -43,6 +43,13 @@ EasyCV is an all-in-one computer vision toolbox based on PyTorch, mainly focus o
EasyCV supports multi-GPU and multi-worker training. EasyCV uses [DALI](https://github.com/NVIDIA/DALI) to accelerate data I/O and preprocessing, and uses [TorchAccelerator](https://github.com/alibaba/EasyCV/tree/master/docs/source/tutorials/torchacc.md) and fp16 to accelerate training. For inference optimization, EasyCV exports models as jit script, which can be optimized by [PAI-Blade](https://help.aliyun.com/document_detail/205134.html)
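As a concrete illustration of that jit-script export path, here is a minimal sketch in plain PyTorch; the torchvision ResNet-50 is only a stand-in for a trained EasyCV model (an assumption for brevity), and the output file name is arbitrary:

```python
import torch
import torchvision

# Stand-in network; a trained EasyCV model would take its place.
model = torchvision.models.resnet50(weights=None).eval()

# Scripting (rather than tracing) compiles the module, control flow
# included, into a self-contained artifact that graph optimizers
# such as PAI-Blade can consume without the original Python code.
scripted = torch.jit.script(model)
scripted.save("model_jit.pt")

# The artifact reloads without importing torchvision's model class.
reloaded = torch.jit.load("model_jit.pt")
with torch.no_grad():
    out = reloaded(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1000])
```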
## Technical Articles

We have a series of technical articles on the functionalities of EasyCV.

* [EasyCV开源|开箱即用的视觉自监督+Transformer算法库](https://zhuanlan.zhihu.com/p/505219993)
* [MAE自监督算法介绍和基于EasyCV的复现](https://zhuanlan.zhihu.com/p/515859470)
* [基于EasyCV复现ViTDet:单层特征超越FPN](https://zhuanlan.zhihu.com/p/528733299)
* [基于EasyCV复现DETR和DAB-DETR,Object Query的正确打开方式](https://zhuanlan.zhihu.com/p/543129581)
## Installation
@@ -58,7 +65,7 @@ Please refer to [quick_start.md](docs/source/quick_start.md) for quick start. We
* [object detection with yolox](docs/source/tutorials/yolox.md)
* [model compression with yolox](docs/source/tutorials/compression.md)
* [metric learning](docs/source/tutorials/metric_learning.md)
- * [torchacc](https://github.com/alibaba/EasyCV/blob/master/docs/source/tutorials/torchacc.md)
+ * [torchacc](docs/source/tutorials/torchacc.md)

notebook

* [self-supervised learning](docs/source/tutorials/EasyCV图像自监督训练-MAE.ipynb)
@@ -69,6 +76,107 @@ notebook
## Model Zoo

<div align="center">
<b>Architectures</b>
</div>

<table align="center">
<tbody>
<tr align="center">
<td>
<b>Self-Supervised Learning</b>
</td>
<td>
<b>Image Classification</b>
</td>
<td>
<b>Object Detection</b>
</td>
<td>
<b>Segmentation</b>
</td>
</tr>
<tr valign="top">
<td>
<ul>
<li><a href="configs/selfsup/byol">BYOL (NeurIPS'2020)</a></li>
<li><a href="configs/selfsup/dino">DINO (ICCV'2021)</a></li>
<li><a href="configs/selfsup/mixco">MiXCo (NeurIPS'2020)</a></li>
<li><a href="configs/selfsup/moby">MoBY (ArXiv'2021)</a></li>
<li><a href="configs/selfsup/mocov2">MoCov2 (ArXiv'2020)</a></li>
<li><a href="configs/selfsup/simclr">SimCLR (ICML'2020)</a></li>
<li><a href="configs/selfsup/swav">SwAV (NeurIPS'2020)</a></li>
<li><a href="configs/selfsup/mae">MAE (CVPR'2022)</a></li>
<li><a href="configs/selfsup/fast_convmae">FastConvMAE (ArXiv'2022)</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="configs/classification/imagenet/resnet">ResNet (CVPR'2016)</a></li>
<li><a href="configs/classification/imagenet/resnext">ResNeXt (CVPR'2017)</a></li>
<li><a href="configs/classification/imagenet/hrnet">HRNet (CVPR'2019)</a></li>
<li><a href="configs/classification/imagenet/vit">ViT (ICLR'2021)</a></li>
<li><a href="configs/classification/imagenet/swint">SwinT (ICCV'2021)</a></li>
<li><a href="configs/classification/imagenet/efficientformer">EfficientFormer (ArXiv'2022)</a></li>
<li><a href="configs/classification/imagenet/timm/deit">DeiT (ICML'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/xcit">XCiT (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/tnt">TNT (NeurIPS'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/convit">ConViT (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/cait">CaiT (ICCV'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/levit">LeViT (ICCV'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/convnext">ConvNeXt (CVPR'2022)</a></li>
<li><a href="configs/classification/imagenet/timm/resmlp">ResMLP (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/coat">CoaT (ICCV'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/convmixer">ConvMixer (ICLR'2022)</a></li>
<li><a href="configs/classification/imagenet/timm/mlp-mixer">MLP-Mixer (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/nest">NesT (AAAI'2022)</a></li>
<li><a href="configs/classification/imagenet/timm/pit">PiT (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/twins">Twins (NeurIPS'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/shuffle_transformer">Shuffle Transformer (ArXiv'2021)</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="configs/detection/fcos">FCOS (ICCV'2019)</a></li>
<li><a href="configs/detection/yolox">YOLOX (ArXiv'2021)</a></li>
<li><a href="configs/detection/detr">DETR (ECCV'2020)</a></li>
<li><a href="configs/detection/dab_detr">DAB-DETR (ICLR'2022)</a></li>
<li><a href="configs/detection/dab_detr">DN-DETR (CVPR'2022)</a></li>
</ul>
</td>
<td>
<ul>
<li><b>Instance Segmentation</b></li>
<ul>
<li><a href="configs/detection/mask_rcnn">Mask R-CNN (ICCV'2017)</a></li>
<li><a href="configs/detection/vitdet">ViTDet (ArXiv'2022)</a></li>
<li><a href="configs/segmentation/mask2former">Mask2Former (CVPR'2022)</a></li>
</ul>
<li><b>Semantic Segmentation</b></li>
<ul>
<li><a href="configs/segmentation/fcn">FCN (CVPR'2015)</a></li>
<li><a href="configs/segmentation/upernet">UperNet (ECCV'2018)</a></li>
</ul>
<li><b>Panoptic Segmentation</b></li>
<ul>
<li><a href="configs/segmentation/mask2former">Mask2Former (CVPR'2022)</a></li>
</ul>
</ul>
</td>
</tr>
</tbody>
</table>

Please refer to the following model zoo for more details.

- [self-supervised learning model zoo](docs/source/model_zoo_ssl.md)
@@ -80,7 +188,7 @@ Please refer to the following model zoo for more details.
EasyCV has collected dataset info for different scenarios, making it easy for users to finetune or evaluate models from the EasyCV model zoo.

- Please refer to [data_hub.md](https://github.com/alibaba/EasyCV/blob/master/docs/source/data_hub.md).
+ Please refer to [data_hub.md](docs/source/data_hub.md).
## ChangeLog
@@ -89,7 +197,7 @@ Please refer to [data_hub.md](https://github.com/alibaba/EasyCV/blob/master/docs

* Classification supports the EfficientFormer algorithm
* Detection supports the FCOS, DETR, DAB-DETR and DN-DETR algorithms
* Segmentation supports the UperNet algorithm
- * Support using [torchacc](https://github.com/alibaba/EasyCV/blob/master/docs/source/tutorials/torchacc.md) to speed up training
+ * Support using [torchacc](docs/source/tutorials/torchacc.md) to speed up training
* Support model analysis tools

* 23/06/2022 EasyCV v0.4.0 was released.
README_zh-CN.md (112 lines changed)
@@ -40,9 +40,15 @@ EasyCV is a PyTorch-based computer vision toolbox covering multiple domains,

- **High performance**

-   EasyCV supports multi-machine multi-GPU training, along with [TorchAccelerator](https://github.com/alibaba/EasyCV/tree/master/docs/source/tutorials/torchacc.md) and fp16 for training acceleration. For data loading and preprocessing, EasyCV uses [DALI](https://github.com/NVIDIA/DALI) for acceleration. For inference optimization, EasyCV supports exporting models as jit script, which can then be optimized with [PAI-Blade](https://help.aliyun.com/document_detail/205134.html).
+   EasyCV supports multi-machine multi-GPU training, along with [TorchAccelerator](docs/source/tutorials/torchacc.md) and fp16 for training acceleration. For data loading and preprocessing, EasyCV uses [DALI](https://github.com/NVIDIA/DALI) for acceleration. For inference optimization, EasyCV supports exporting models as jit script, which can then be optimized with [PAI-Blade](https://help.aliyun.com/document_detail/205134.html).
## Technical Articles

We have a series of technical articles on the functionalities of EasyCV.

* [EasyCV开源|开箱即用的视觉自监督+Transformer算法库](https://zhuanlan.zhihu.com/p/505219993)
* [MAE自监督算法介绍和基于EasyCV的复现](https://zhuanlan.zhihu.com/p/515859470)
* [基于EasyCV复现ViTDet:单层特征超越FPN](https://zhuanlan.zhihu.com/p/528733299)
* [基于EasyCV复现DETR和DAB-DETR,Object Query的正确打开方式](https://zhuanlan.zhihu.com/p/543129581)

## Installation
@@ -57,10 +63,110 @@ EasyCV is a PyTorch-based computer vision toolbox covering multiple domains,
* [image classification tutorial](docs/source/tutorials/cls.md)
* [object detection with YOLOX tutorial](docs/source/tutorials/yolox.md)
* [YOLOX model compression tutorial](docs/source/tutorials/compression.md)

* [torchacc](docs/source/tutorials/torchacc.md)
## Model Zoo

<div align="center">
<b>Models</b>
</div>

<table align="center">
<tbody>
<tr align="center">
<td>
<b>Self-Supervised Learning</b>
</td>
<td>
<b>Image Classification</b>
</td>
<td>
<b>Object Detection</b>
</td>
<td>
<b>Segmentation</b>
</td>
</tr>
<tr valign="top">
<td>
<ul>
<li><a href="configs/selfsup/byol">BYOL (NeurIPS'2020)</a></li>
<li><a href="configs/selfsup/dino">DINO (ICCV'2021)</a></li>
<li><a href="configs/selfsup/mixco">MiXCo (NeurIPS'2020)</a></li>
<li><a href="configs/selfsup/moby">MoBY (ArXiv'2021)</a></li>
<li><a href="configs/selfsup/mocov2">MoCov2 (ArXiv'2020)</a></li>
<li><a href="configs/selfsup/simclr">SimCLR (ICML'2020)</a></li>
<li><a href="configs/selfsup/swav">SwAV (NeurIPS'2020)</a></li>
<li><a href="configs/selfsup/mae">MAE (CVPR'2022)</a></li>
<li><a href="configs/selfsup/fast_convmae">FastConvMAE (ArXiv'2022)</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="configs/classification/imagenet/resnet">ResNet (CVPR'2016)</a></li>
<li><a href="configs/classification/imagenet/resnext">ResNeXt (CVPR'2017)</a></li>
<li><a href="configs/classification/imagenet/hrnet">HRNet (CVPR'2019)</a></li>
<li><a href="configs/classification/imagenet/vit">ViT (ICLR'2021)</a></li>
<li><a href="configs/classification/imagenet/swint">SwinT (ICCV'2021)</a></li>
<li><a href="configs/classification/imagenet/efficientformer">EfficientFormer (ArXiv'2022)</a></li>
<li><a href="configs/classification/imagenet/timm/deit">DeiT (ICML'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/xcit">XCiT (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/tnt">TNT (NeurIPS'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/convit">ConViT (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/cait">CaiT (ICCV'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/levit">LeViT (ICCV'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/convnext">ConvNeXt (CVPR'2022)</a></li>
<li><a href="configs/classification/imagenet/timm/resmlp">ResMLP (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/coat">CoaT (ICCV'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/convmixer">ConvMixer (ICLR'2022)</a></li>
<li><a href="configs/classification/imagenet/timm/mlp-mixer">MLP-Mixer (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/nest">NesT (AAAI'2022)</a></li>
<li><a href="configs/classification/imagenet/timm/pit">PiT (ArXiv'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/twins">Twins (NeurIPS'2021)</a></li>
<li><a href="configs/classification/imagenet/timm/shuffle_transformer">Shuffle Transformer (ArXiv'2021)</a></li>
</ul>
</td>
<td>
<ul>
<li><a href="configs/detection/fcos">FCOS (ICCV'2019)</a></li>
<li><a href="configs/detection/yolox">YOLOX (ArXiv'2021)</a></li>
<li><a href="configs/detection/detr">DETR (ECCV'2020)</a></li>
<li><a href="configs/detection/dab_detr">DAB-DETR (ICLR'2022)</a></li>
<li><a href="configs/detection/dab_detr">DN-DETR (CVPR'2022)</a></li>
</ul>
</td>
<td>
<ul>
<li><b>Instance Segmentation</b></li>
<ul>
<li><a href="configs/detection/mask_rcnn">Mask R-CNN (ICCV'2017)</a></li>
<li><a href="configs/detection/vitdet">ViTDet (ArXiv'2022)</a></li>
<li><a href="configs/segmentation/mask2former">Mask2Former (CVPR'2022)</a></li>
</ul>
<li><b>Semantic Segmentation</b></li>
<ul>
<li><a href="configs/segmentation/fcn">FCN (CVPR'2015)</a></li>
<li><a href="configs/segmentation/upernet">UperNet (ECCV'2018)</a></li>
</ul>
<li><b>Panoptic Segmentation</b></li>
<ul>
<li><a href="configs/segmentation/mask2former">Mask2Former (CVPR'2022)</a></li>
</ul>
</ul>
</td>
</tr>
</tbody>
</table>

The model zoos and benchmark metrics for different domains are listed below.

- [self-supervised learning model zoo](docs/source/model_zoo_ssl.md)
@@ -75,7 +181,7 @@ EasyCV is a PyTorch-based computer vision toolbox covering multiple domains,

* Image classification adds EfficientFormer
* Object detection adds the FCOS, DETR, DAB-DETR and DN-DETR algorithms
* Semantic segmentation adds the UperNet algorithm
- * Support using [torchacc](https://github.com/alibaba/EasyCV/blob/master/docs/source/tutorials/torchacc.md) to speed up training
+ * Support using [torchacc](docs/source/tutorials/torchacc.md) to speed up training
* Added model analysis tools

* 23/06/2022 EasyCV v0.4.0 released.
configs/detection/fcos/README.md (new file, 30 lines)
@@ -0,0 +1,30 @@
# FCOS

> [FCOS: Fully Convolutional One-Stage Object Detection](https://arxiv.org/abs/1904.01355)

<!-- [ALGORITHM] -->

## Abstract

We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction fashion, analogue to semantic segmentation. Almost all state-of-the-art object detectors such as RetinaNet, SSD, YOLOv3, and Faster R-CNN rely on pre-defined anchor boxes. In contrast, our proposed detector FCOS is anchor box free, as well as proposal free. By eliminating the predefined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes such as calculating overlapping during training. More importantly, we also avoid all hyper-parameters related to anchor boxes, which are often very sensitive to the final detection performance. With the only post-processing non-maximum suppression (NMS), FCOS with ResNeXt-64x4d-101 achieves 44.7% in AP with single-model and single-scale testing, surpassing previous one-stage detectors with the advantage of being much simpler. For the first time, we demonstrate a much simpler and flexible detection framework achieving improved detection accuracy. We hope that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks.

<div align=center>
<img src="https://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/algo_images/detection/fcos.png"/>
</div>
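To make the per-pixel formulation above concrete, the following sketch (illustrative, not EasyCV's actual implementation) computes FCOS-style (l, t, r, b) regression targets and the paper's centerness score for feature-map locations inside a ground-truth box:

```python
import torch

def fcos_targets(points, boxes):
    """FCOS-style (l, t, r, b) targets and centerness.

    points: (N, 2) image-plane locations (x, y) of feature-map cells.
    boxes:  (M, 4) ground-truth boxes (x1, y1, x2, y2).
    Returns (N, M, 4) edge distances and (N, M) centerness scores.
    """
    xs, ys = points[:, 0:1], points[:, 1:2]   # (N, 1) each
    l = xs - boxes[None, :, 0]                # distance to left edge, (N, M)
    t = ys - boxes[None, :, 1]                # to top edge
    r = boxes[None, :, 2] - xs                # to right edge
    b = boxes[None, :, 3] - ys                # to bottom edge
    ltrb = torch.stack([l, t, r, b], dim=-1)

    # centerness = sqrt((min(l,r)/max(l,r)) * (min(t,b)/max(t,b)))
    lr = torch.minimum(l, r).clamp(min=0) / torch.maximum(l, r)
    tb = torch.minimum(t, b).clamp(min=0) / torch.maximum(t, b)
    return ltrb, torch.sqrt(lr * tb)

# A location at the box center scores ~1.0; one near a corner scores low.
pts = torch.tensor([[50.0, 50.0], [90.0, 90.0]])
gt = torch.tensor([[0.0, 0.0, 100.0, 100.0]])
_, centerness = fcos_targets(pts, gt)
print(centerness)  # tensor([[1.0000], [0.1111]])
```

During training this centerness becomes an auxiliary prediction that down-weights low-quality boxes from off-center locations at NMS time, which is what lets FCOS drop anchor boxes entirely.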
## Results and Models
| Algorithm | Config | Params<br/>(backbone/total) | inference time(V100)<br/>(ms/img) | mAP<sup>val</sup><br/><sub>0.5:0.95</sub> | AP<sup>val</sup><br/><sub>50</sub> | Download |
| --------- | ------ | --------------------------- | --------------------------------- | ----------------------------------------- | ----------------------------------- | -------- |
| FCOS-r50 | [fcos-r50](https://github.com/alibaba/EasyCV/tree/master/configs/detection/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_1x_coco.py) | 23M/32M | 85.8ms | 38.58 | 57.18 | [model](https://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/detection/fcos/epoch_12.pth) - [log](https://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/detection/fcos/20220621_121315.log.json) |
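The released checkpoint above can be fetched and inspected with plain PyTorch. A minimal sketch follows; the `state_dict` key is an assumption about the checkpoint layout rather than a documented guarantee, so adjust to whatever keys the file actually contains:

```python
import torch
from torch.hub import download_url_to_file

# URL taken from the Download column of the table above.
URL = ("https://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/"
       "EasyCV/modelzoo/detection/fcos/epoch_12.pth")

download_url_to_file(URL, "fcos_epoch_12.pth")

# Load on CPU and see what the file contains; training checkpoints
# commonly wrap weights under 'state_dict' (assumed here).
ckpt = torch.load("fcos_epoch_12.pth", map_location="cpu")
print(list(ckpt.keys()))
state_dict = ckpt.get("state_dict", ckpt)
print(f"{len(state_dict)} tensors in the state dict")
```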
## Citation
```latex
@article{tian2019fcos,
  title={FCOS: Fully Convolutional One-Stage Object Detection},
  author={Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
  journal={arXiv preprint arXiv:1904.01355},
  year={2019}
}
```