[Docs] Add NPU support page (#1481)

* add npu docs * fix lint
2023-04-14 13:58:10 +08:00 · 2023-04-14 13:58:10 +08:00 · 02571fe4b8
parent 645e2b4ed4
commit 02571fe4b8
4 changed files with 100 additions and 0 deletions
--- a/docs/en/device/npu.md
+++ b/docs/en/device/npu.md
@ -0,0 +1,47 @@
+# NPU (HUAWEI Ascend)
+
+## Usage
+
+### General Usage
+
+Please refer to the [building documentation of MMCV](https://mmcv.readthedocs.io/en/latest/get_started/build.html#build-mmcv-full-on-ascend-npu-machine) to install MMCV and [MMEngine](https://mmengine.readthedocs.io/en/latest/get_started/installation.html#build-from-source) on NPU devices.
+
+Here we use 8 NPUs on your computer to train the model with the following command:
+
+```shell
+bash ./tools/dist_train.sh configs/resnet/resnet50_8xb32_in1k.py 8
+```
+
+Also, you can use only one NPU to train the model with the following command:
+
+```shell
+python ./tools/train.py configs/resnet/resnet50_8xb32_in1k.py
+```
+
+## Models Results
+
+|                            Model                            | Top-1 (%) | Top-5 (%) |                            Config                            |                            Download                             |
+| :---------------------------------------------------------: | :-------: | :-------: | :----------------------------------------------------------: | :-------------------------------------------------------------: |
+| [ResNet-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/README.md) |   76.40   |   93.21   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/resnet50_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnet50_8xb32_in1k.log) |
+| [ResNetXt-32x4d-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnext/README.md) |   77.48   |   93.75   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnext/resnext50-32x4d_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnext50-32x4d_8xb32_in1k.log) |
+| [HRNet-W18](https://github.com/open-mmlab/mmclassification/blob/master/configs/hrnet/README.md) |   77.06   |   93.57   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/hrnet/hrnet-w18_4xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/hrnet-w18_4xb32_in1k.log) |
+| [ResNetV1D-152](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/README.md) |   79.41   |   94.48   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/resnetv1d152_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnetv1d152_8xb32_in1k.log) |
+| [SE-ResNet-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/seresnet/README.md) |   77.65   |   93.74   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/seresnet/seresnet50_8xb32_in1k.py) | [model](<>) \|[log](https://download.openmmlab.com/mmclassification/v1/device/npu/seresnet50_8xb32_in1k.log) |
+| [ShuffleNetV2 1.0x](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/shufflenet_v2/README.md) |   69.52   |   88.79   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/shufflenet-v2-1x_16xb64_in1k.log) |
+| [MobileNetV2](https://github.com/open-mmlab/mmclassification/tree/1.x/configs/mobilenet_v2) |   71.74   |   90.28   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/mobilenet-v2_8xb32_in1k.log) |
+| [MobileNetV3-Small](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v3/README.md) |   67.09   |   87.17   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v3/mobilenet-v3-small_8xb128_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/mobilenet-v3-small.log) |
+| [\*CSPResNeXt50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/cspnet/README.md) |   77.25   |   93.46   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/cspnet/cspresnext50_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/cspresnext50_8xb32_in1k.log) |
+| [\*EfficientNet-B4](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/efficientnet/README.md) |   75.73   |   92.91   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/efficientnet/efficientnet-b4_8xb32_in1k.py) | [model](<>) \|[log](https://download.openmmlab.com/mmclassification/v1/device/npu/efficientnet-b4_8xb32_in1k.log) |
+| [\*\*DenseNet121](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/densenet/README.md) |   72.53   |   90.85   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/densenet/densenet121_4xb256_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/densenet121_4xb256_in1k.log) |
+
+**Notes:**
+
+- If not specially marked, the results are almost same between results on the NPU and results on the GPU with FP32.
+- (\*) The training results of these models are lower than the results on the readme in the corresponding model, mainly
+  because the results on the readme are directly the weight of the timm of the eval, and the results on this side are
+  retrained according to the config with mmcls. The results of the config training on the GPU are consistent with the
+  results of the NPU.
+- (\*\*) The accuracy of this model is slightly lower because config is a 4-card config, we use 8 cards to run, and users
+  can adjust hyperparameters to get the best accuracy results.
+
+**All above models are provided by Huawei Ascend group.**
--- a/docs/en/index.rst
+++ b/docs/en/index.rst
@ -141,6 +141,12 @@ We always welcome *PRs* and *Issues* for the betterment of MMPretrain.
   notes/pretrain_custom_dataset.md
   notes/finetune_custom_dataset.md

+.. toctree::
+   :maxdepth: 1
+   :caption: Device Support
+
+   device/npu.md
+
 Indices and tables
 ==================

--- a/docs/zh_CN/device/npu.md
+++ b/docs/zh_CN/device/npu.md
@ -0,0 +1,41 @@
+# NPU (华为昇腾)
+
+## 使用方法
+
+首先，请参考[链接](https://mmcv.readthedocs.io/zh_CN/latest/get_started/build.html#npu-mmcv-full)安装带有 NPU 支持的 MMCV 和[链接](https://mmengine.readthedocs.io/en/latest/get_started/installation.html#build-from-source)安装 MMEngine。
+
+使用如下命令，可以利用 8 个 NPU 在机器上训练模型（以 ResNet 为例）：
+
+```shell
+bash tools/dist_train.sh configs/cspnet/resnet50_8xb32_in1k.py 8
+```
+
+或者，使用如下命令，在一个 NPU 上训练模型（以 ResNet 为例）：
+
+```shell
+python tools/train.py configs/cspnet/resnet50_8xb32_in1k.py
+```
+
+## 经过验证的模型
+
+|                            Model                            | Top-1 (%) | Top-5 (%) |                            Config                            |                            Download                             |
+| :---------------------------------------------------------: | :-------: | :-------: | :----------------------------------------------------------: | :-------------------------------------------------------------: |
+| [ResNet-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/README.md) |   76.40   |   93.21   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/resnet50_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnet50_8xb32_in1k.log) |
+| [ResNetXt-32x4d-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnext/README.md) |   77.48   |   93.75   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnext/resnext50-32x4d_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnext50-32x4d_8xb32_in1k.log) |
+| [HRNet-W18](https://github.com/open-mmlab/mmclassification/blob/master/configs/hrnet/README.md) |   77.06   |   93.57   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/hrnet/hrnet-w18_4xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/hrnet-w18_4xb32_in1k.log) |
+| [ResNetV1D-152](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/README.md) |   79.41   |   94.48   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/resnet/resnetv1d152_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/resnetv1d152_8xb32_in1k.log) |
+| [SE-ResNet-50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/seresnet/README.md) |   77.65   |   93.74   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/seresnet/seresnet50_8xb32_in1k.py) | [model](<>) \|[log](https://download.openmmlab.com/mmclassification/v1/device/npu/seresnet50_8xb32_in1k.log) |
+| [ShuffleNetV2 1.0x](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/shufflenet_v2/README.md) |   69.52   |   88.79   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/shufflenet-v2-1x_16xb64_in1k.log) |
+| [MobileNetV2](https://github.com/open-mmlab/mmclassification/tree/1.x/configs/mobilenet_v2) |   71.74   |   90.28   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/mobilenet-v2_8xb32_in1k.log) |
+| [MobileNetV3-Small](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v3/README.md) |   67.09   |   87.17   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/mobilenet_v3/mobilenet-v3-small_8xb128_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/mobilenet-v3-small.log) |
+| [\*CSPResNeXt50](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/cspnet/README.md) |   77.25   |   93.46   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/cspnet/cspresnext50_8xb32_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/cspresnext50_8xb32_in1k.log) |
+| [\*EfficientNet-B4](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/efficientnet/README.md) |   75.73   |  92.9100  | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/efficientnet/efficientnet-b4_8xb32_in1k.py) | [model](<>) \|[log](https://download.openmmlab.com/mmclassification/v1/device/npu/efficientnet-b4_8xb32_in1k.log) |
+| [\*\*DenseNet121](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/densenet/README.md) |   72.53   |   90.85   | [config](https://github.com/open-mmlab/mmclassification/blob/1.x/configs/densenet/densenet121_4xb256_in1k.py) | [model](<>) \| [log](https://download.openmmlab.com/mmclassification/v1/device/npu/densenet121_4xb256_in1k.log) |
+
+**注意:**
+
+- 如果没有特别标记，NPU 上的结果与使用 FP32 的 GPU 上的结果结果相同。
+- (\*) 这些模型的训练结果低于相应模型中自述文件上的结果，主要是因为自述文件上的结果直接是 timm 训练得出的权重，而这边的结果是根据 mmcls 的配置重新训练得到的结果。GPU 上的配置训练结果与 NPU 的结果相同。
+- (\*\*)这个模型的精度略低，因为 config 是 4 张卡的配置，我们使用 8 张卡来运行，用户可以调整超参数以获得最佳精度结果。
+
+**以上所有模型权重及训练日志均由华为昇腾团队提供**
--- a/docs/zh_CN/index.rst
+++ b/docs/zh_CN/index.rst
@ -127,6 +127,12 @@ MMPretrain 上手路线
   notes/pretrain_custom_dataset.md
   notes/finetune_custom_dataset.md

+.. toctree::
+   :maxdepth: 1
+   :caption: 设备支持
+
+   device/npu.md
+
 .. toctree::
   :caption: 切换语言