[Feature] Ascend backend (#747)

* add acl backend

* support dynamic batch size and dynamic image size

* add preliminary ascend backend

* support dtypes other than float

* support dynamic_dims in SDK

* fix dynamic batch size

* better error handling

* remove debug info

* [WIP] dynamic shape support

* fix static shape

* fix dynamic batch size

* add retinanet support

* fix dynamic image size

* fix dynamic image size

* fix dynamic dims

* fix dynamic dims

* simplify config files

* fix yolox support

* fix negative index

* support faster rcnn

* add seg config

* update benchmark

* fix onnx2ascend dynamic shape

* update docstring and benchmark

* add unit test, update documents

* fix wrapper

* fix ut

* fix for vit

* error handling

* context handling & multi-device support

* build with stub libraries

* add ci

* fix lint

* fix lint

* update doc ref

* fix typo

* down with `target_link_directories`

* setup python

* makedir

* fix ci

* fix ci

* remove verbose logs

* fix UBs

* export Error

* fix lint

* update checkenv

Co-authored-by: grimoire <yaoqian@sensetime.com>
Li Zhang 2022-09-05 12:08:36 +08:00 committed by GitHub
parent 966d737a1b
commit 792c27b054
61 changed files with 2621 additions and 234 deletions


@ -0,0 +1,2 @@
cann
CANN


@ -0,0 +1,54 @@
name: backend-ascend
on:
push:
paths-ignore:
- "demo/**"
- "tools/**"
pull_request:
paths-ignore:
- "demo/**"
- "tools/**"
- "docs/**"
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
build_sdk_demo:
runs-on: ubuntu-18.04
strategy:
matrix:
python-version: [3.7]
steps:
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Checkout repository
uses: actions/checkout@v3
with:
submodules: 'recursive'
- name: update
run: sudo apt update
- name: Install dependencies
run: |
sudo apt update
sudo apt install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libxrender-dev libc++1-9 libc++abi1-9
sudo add-apt-repository ppa:ignaciovizzo/opencv3-nonfree
sudo apt install libopencv-dev
pkg-config --libs opencv
- name: Install Ascend Toolkit
run: |
mkdir -p $GITHUB_WORKSPACE/Ascend
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/CANN/CANN%205.1.RC2/Ascend-cann-toolkit_5.1.RC2_linux-x86_64.run
sh Ascend-cann-toolkit_5.1.RC2_linux-x86_64.run --install --install-path=$GITHUB_WORKSPACE/Ascend --quiet --chip=Ascend310 --blacklist=devtools
- name: Build SDK Demo with Ascend backend
run: |
mkdir -p build && pushd build
source $GITHUB_WORKSPACE/Ascend/ascend-toolkit/set_env.sh
export LD_LIBRARY_PATH=$GITHUB_WORKSPACE/Ascend/ascend-toolkit/latest/runtime/lib64/stub:$LD_LIBRARY_PATH
cmake .. -DCMAKE_CXX_COMPILER=g++-7 -DMMDEPLOY_SHARED_LIBS=ON -DMMDEPLOY_BUILD_SDK=ON -DMMDEPLOY_BUILD_SDK_PYTHON_API=OFF -DMMDEPLOY_TARGET_DEVICES=cpu -DMMDEPLOY_BUILD_EXAMPLES=ON -DMMDEPLOY_TARGET_BACKENDS=acl -DMMDEPLOY_CODEBASES=all
make install -j4

.gitignore

@ -153,6 +153,9 @@ mmdeploy/backend/ncnn/onnx2ncnn
/mmdeploy-*
# ascend
fusion_result.json
# snpe
grpc-cpp-plugin
service/snpe/grpc_cpp_plugin


@ -55,9 +55,9 @@ The currently supported codebases and models are as follows, and more will be in
Models can be exported and run in the following backends, and more will be compatible
| ONNX Runtime | TensorRT | ppl.nn | ncnn | OpenVINO | LibTorch | snpe | more |
| ------------ | -------- | ------ | ---- | -------- | -------- | ---- | ---------------------------------------------- |
| ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | [benchmark](docs/en/03-benchmark/benchmark.md) |
| ONNX Runtime | TensorRT | ppl.nn | ncnn | OpenVINO | LibTorch | snpe | Ascend | more |
| ------------ | -------- | ------ | ---- | -------- | -------- | ---- | ------ | ---------------------------------------------- |
| ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | [benchmark](docs/en/03-benchmark/benchmark.md) |
### Efficient and scalable C/C++ SDK Framework


@ -53,9 +53,9 @@ MMDeploy 是 [OpenMMLab](https://openmmlab.com/) 模型部署工具箱,**为
### 支持多种推理后端
| ONNX Runtime | TensorRT | ppl.nn | ncnn | OpenVINO | LibTorch | snpe | more |
| ------------ | -------- | ------ | ---- | -------- | -------- | ---- | ------------------------------------------------- |
| ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | [benchmark](docs/zh_cn/03-benchmark/benchmark.md) |
| ONNX Runtime | TensorRT | ppl.nn | ncnn | OpenVINO | LibTorch | snpe | Ascend | more |
| ------------ | -------- | ------ | ---- | -------- | -------- | ---- | ------ | ------------------------------------------------- |
| ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | [benchmark](docs/zh_cn/03-benchmark/benchmark.md) |
### SDK 可高度定制化


@ -0,0 +1 @@
backend_config = dict(type='ascend')


@ -0,0 +1,9 @@
_base_ = ['./classification_dynamic.py', '../_base_/backends/ascend.py']
onnx_config = dict(input_shape=[224, 224])
backend_config = dict(model_inputs=[
dict(
dynamic_batch_size=[1, 2, 4, 8],
input_shapes=dict(input=[-1, 3, 224, 224]))
])


@ -0,0 +1,5 @@
_base_ = ['./classification_static.py', '../_base_/backends/ascend.py']
onnx_config = dict(input_shape=[224, 224])
backend_config = dict(
model_inputs=[dict(input_shapes=dict(input=[1, 3, 224, 224]))])


@ -0,0 +1,8 @@
_base_ = ['../_base_/base_dynamic.py', '../../_base_/backends/ascend.py']
onnx_config = dict(input_shape=[1344, 800])
backend_config = dict(model_inputs=[
dict(
dynamic_image_size=[(800, 1344), (1344, 800)],
input_shapes=dict(input=[1, 3, -1, -1]))
])


@ -0,0 +1,5 @@
_base_ = ['../_base_/base_static.py', '../../_base_/backends/ascend.py']
onnx_config = dict(input_shape=[640, 640])
backend_config = dict(
model_inputs=[dict(input_shapes=dict(input=[1, 3, 640, 640]))])


@ -0,0 +1,5 @@
_base_ = ['../_base_/base_static.py', '../../_base_/backends/ascend.py']
onnx_config = dict(input_shape=[1344, 800])
backend_config = dict(
model_inputs=[dict(input_shapes=dict(input=[1, 3, 800, 1344]))])


@ -0,0 +1,8 @@
_base_ = ['./text-detection_dynamic.py', '../../_base_/backends/ascend.py']
onnx_config = dict(input_shape=None)
backend_config = dict(model_inputs=[
dict(
input_shapes=dict(input=[-1, 3, -1, -1]),
dynamic_dims=[(1, 640, 640), (4, 640, 640), (1, 1280, 1280)])
])


@ -0,0 +1,5 @@
_base_ = ['./text-detection_static.py', '../../_base_/backends/ascend.py']
onnx_config = dict(input_shape=[640, 640])
backend_config = dict(
model_inputs=[dict(input_shapes=dict(input=[1, 3, 640, 640]))])


@ -0,0 +1,5 @@
_base_ = ['./segmentation_static.py', '../_base_/backends/ascend.py']
onnx_config = dict(input_shape=[2048, 1024])
backend_config = dict(
model_inputs=[dict(input_shapes=dict(input=[1, 3, 1024, 2048]))])


@ -0,0 +1,5 @@
_base_ = ['./segmentation_static.py', '../_base_/backends/ascend.py']
onnx_config = dict(input_shape=[1024, 512])
backend_config = dict(
model_inputs=[dict(input_shapes=dict(input=[1, 3, 512, 1024]))])


@ -4,6 +4,7 @@
#include "mmdeploy/core/registry.h"
#include "mmdeploy/core/utils/device_utils.h"
#include "mmdeploy/core/utils/formatter.h"
#include "mmdeploy/experimental/module_adapter.h"
using namespace std;


@ -68,7 +68,7 @@ class Device {
constexpr explicit Device(int platform_id, int device_id = 0)
: platform_id_(platform_id), device_id_(device_id) {}
MMDEPLOY_API explicit Device(const char *platform_name, int device_id = 0);
MMDEPLOY_API explicit Device(const char* platform_name, int device_id = 0);
constexpr int device_id() const noexcept { return device_id_; }
@ -78,11 +78,11 @@ class Device {
constexpr bool is_device() const noexcept { return platform_id() > 0; }
constexpr bool operator==(const Device &other) const noexcept {
constexpr bool operator==(const Device& other) const noexcept {
return platform_id_ == other.platform_id_ && device_id_ == other.device_id_;
}
constexpr bool operator!=(const Device &other) const noexcept { return !(*this == other); }
constexpr bool operator!=(const Device& other) const noexcept { return !(*this == other); }
constexpr explicit operator bool() const noexcept { return platform_id_ >= 0 && device_id_ >= 0; }
@ -104,7 +104,7 @@ enum class MemcpyKind : int { HtoD, DtoH, DtoD };
class MMDEPLOY_API Platform {
public:
// throws if not found
explicit Platform(const char *platform_name);
explicit Platform(const char* platform_name);
// throws if not found
explicit Platform(int platform_id);
@ -113,11 +113,11 @@ class MMDEPLOY_API Platform {
int GetPlatformId() const;
// "" if invalid
const char *GetPlatformName() const;
const char* GetPlatformName() const;
bool operator==(const Platform &other) { return impl_ == other.impl_; }
bool operator==(const Platform& other) { return impl_ == other.impl_; }
bool operator!=(const Platform &other) { return !(*this == other); }
bool operator!=(const Platform& other) { return !(*this == other); }
explicit operator bool() const noexcept { return static_cast<bool>(impl_); }
@ -132,7 +132,7 @@ class MMDEPLOY_API Platform {
Platform GetPlatform(int platform_id);
Platform GetPlatform(const char *platform_name);
Platform GetPlatform(const char* platform_name);
class MMDEPLOY_API Stream {
public:
@ -140,7 +140,7 @@ class MMDEPLOY_API Stream {
explicit Stream(Device device, uint64_t flags = 0);
explicit Stream(Device device, void *native, uint64_t flags = 0);
explicit Stream(Device device, void* native, uint64_t flags = 0);
explicit Stream(Device device, std::shared_ptr<void> native, uint64_t flags = 0);
@ -150,25 +150,25 @@ class MMDEPLOY_API Stream {
Result<void> Wait();
Result<void> DependsOn(Event &event);
Result<void> DependsOn(Event& event);
Result<void> Submit(Kernel &kernel);
Result<void> Submit(Kernel& kernel);
void *GetNative(ErrorCode *ec = nullptr);
void* GetNative(ErrorCode* ec = nullptr);
Result<void> Copy(const Buffer &src, Buffer &dst, size_t size = -1, size_t src_offset = 0,
Result<void> Copy(const Buffer& src, Buffer& dst, size_t size = -1, size_t src_offset = 0,
size_t dst_offset = 0);
Result<void> Copy(const void *host_ptr, Buffer &dst, size_t size = -1, size_t dst_offset = 0);
Result<void> Copy(const void* host_ptr, Buffer& dst, size_t size = -1, size_t dst_offset = 0);
Result<void> Copy(const Buffer &src, void *host_ptr, size_t size = -1, size_t src_offset = 0);
Result<void> Copy(const Buffer& src, void* host_ptr, size_t size = -1, size_t src_offset = 0);
Result<void> Fill(const Buffer &dst, void *pattern, size_t pattern_size, size_t size = -1,
Result<void> Fill(const Buffer& dst, void* pattern, size_t pattern_size, size_t size = -1,
size_t offset = 0);
bool operator==(const Stream &other) const { return impl_ == other.impl_; }
bool operator==(const Stream& other) const { return impl_ == other.impl_; }
bool operator!=(const Stream &other) const { return !(*this == other); }
bool operator!=(const Stream& other) const { return !(*this == other); }
explicit operator bool() const noexcept { return static_cast<bool>(impl_); }
@ -184,7 +184,7 @@ class MMDEPLOY_API Stream {
};
template <typename T>
T GetNative(Stream &stream, ErrorCode *ec = nullptr) {
T GetNative(Stream& stream, ErrorCode* ec = nullptr) {
return reinterpret_cast<T>(stream.GetNative(ec));
}
@ -194,7 +194,7 @@ class MMDEPLOY_API Event {
explicit Event(Device device, uint64_t flags = 0);
explicit Event(Device device, void *native, uint64_t flags = 0);
explicit Event(Device device, void* native, uint64_t flags = 0);
explicit Event(Device device, std::shared_ptr<void> native, uint64_t flags = 0);
@ -204,13 +204,13 @@ class MMDEPLOY_API Event {
Result<void> Wait();
Result<void> Record(Stream &stream);
Result<void> Record(Stream& stream);
void *GetNative(ErrorCode *ec = nullptr);
void* GetNative(ErrorCode* ec = nullptr);
bool operator==(const Event &other) const { return impl_ == other.impl_; }
bool operator==(const Event& other) const { return impl_ == other.impl_; }
bool operator!=(const Event &other) const { return !(*this == other); }
bool operator!=(const Event& other) const { return !(*this == other); }
explicit operator bool() const noexcept { return static_cast<bool>(impl_); }
@ -223,7 +223,7 @@ class MMDEPLOY_API Event {
};
template <typename T>
T GetNative(Event &event, ErrorCode *ec = nullptr) {
T GetNative(Event& event, ErrorCode* ec = nullptr) {
return reinterpret_cast<T>(event.GetNative(ec));
}
@ -234,7 +234,7 @@ class MMDEPLOY_API Kernel {
Device GetDevice() const;
void *GetNative(ErrorCode *ec = nullptr);
void* GetNative(ErrorCode* ec = nullptr);
explicit operator bool() const noexcept { return static_cast<bool>(impl_); }
@ -243,7 +243,7 @@ class MMDEPLOY_API Kernel {
};
template <typename T>
T GetNative(Kernel &kernel, ErrorCode *ec = nullptr) {
T GetNative(Kernel& kernel, ErrorCode* ec = nullptr) {
return reinterpret_cast<T>(kernel.GetNative(ec));
}
@ -269,25 +269,25 @@ class MMDEPLOY_API Buffer {
Buffer(Device device, size_t size, Allocator allocator, size_t alignment = 1, uint64_t flags = 0);
Buffer(Device device, size_t size, void *native, uint64_t flags = 0);
Buffer(Device device, size_t size, void* native, uint64_t flags = 0);
Buffer(Device device, size_t size, std::shared_ptr<void> native, uint64_t flags = 0);
// create sub-buffer
Buffer(Buffer &buffer, size_t offset, size_t size, uint64_t flags = 0);
Buffer(Buffer& buffer, size_t offset, size_t size, uint64_t flags = 0);
size_t GetSize(ErrorCode *ec = nullptr) const;
size_t GetSize(ErrorCode* ec = nullptr) const;
// bool IsSubBuffer(ErrorCode *ec = nullptr);
// bool IsSubBuffer(ErrorCode* ec = nullptr);
void *GetNative(ErrorCode *ec = nullptr) const;
void* GetNative(ErrorCode* ec = nullptr) const;
Device GetDevice() const;
Allocator GetAllocator() const;
bool operator==(const Buffer &other) const { return impl_ == other.impl_; }
bool operator==(const Buffer& other) const { return impl_ == other.impl_; }
bool operator!=(const Buffer &other) const { return !(*this == other); }
bool operator!=(const Buffer& other) const { return !(*this == other); }
explicit operator bool() const noexcept { return static_cast<bool>(impl_); }
@ -300,12 +300,12 @@ class MMDEPLOY_API Buffer {
};
template <typename T>
T GetNative(Buffer &buffer, ErrorCode *ec = nullptr) {
T GetNative(Buffer& buffer, ErrorCode* ec = nullptr) {
return reinterpret_cast<T>(buffer.GetNative(ec));
}
template <typename T>
T GetNative(const Buffer &buffer, ErrorCode *ec = nullptr) {
T GetNative(const Buffer& buffer, ErrorCode* ec = nullptr) {
return reinterpret_cast<T>(buffer.GetNative(ec));
}
@ -315,13 +315,15 @@ class MMDEPLOY_API PlatformRegistry {
int Register(Creator creator);
int GetPlatform(const char *name, Platform *platform);
int AddAlias(const char* name, const char* target);
int GetPlatform(int id, Platform *platform);
int GetPlatform(const char* name, Platform* platform);
int GetPlatformId(const char *name);
int GetPlatform(int id, Platform* platform);
PlatformImpl *GetPlatformImpl(PlatformId id);
int GetPlatformId(const char* name);
PlatformImpl* GetPlatformImpl(PlatformId id);
private:
int GetNextId();
@ -335,8 +337,9 @@ class MMDEPLOY_API PlatformRegistry {
Platform platform;
};
std::vector<Entry> entries_;
std::vector<std::pair<std::string, std::string>> aliases_;
};
MMDEPLOY_API PlatformRegistry &gPlatformRegistry();
MMDEPLOY_API PlatformRegistry& gPlatformRegistry();
} // namespace mmdeploy


@ -321,6 +321,11 @@ int PlatformRegistry::Register(Creator creator) {
return 0;
}
int PlatformRegistry::AddAlias(const char* name, const char* target) {
aliases_.emplace_back(name, target);
return 0;
}
int PlatformRegistry::GetNextId() {
for (int i = 1;; ++i) {
if (IsAvailable(i)) {
@ -339,6 +344,12 @@ bool PlatformRegistry::IsAvailable(int id) {
}
int PlatformRegistry::GetPlatform(const char* name, Platform* platform) {
for (const auto& alias : aliases_) {
if (name == alias.first) {
name = alias.second.c_str();
break;
}
}
for (const auto& entry : entries_) {
if (entry.name == name) {
*platform = entry.platform;
@ -357,7 +368,14 @@ int PlatformRegistry::GetPlatform(int id, Platform* platform) {
}
return -1;
}
int PlatformRegistry::GetPlatformId(const char* name) {
for (const auto& alias : aliases_) {
if (name == alias.first) {
name = alias.second.c_str();
break;
}
}
for (const auto& entry : entries_) {
if (entry.name == name) {
return entry.id;


@ -94,17 +94,23 @@ class Span {
constexpr Span& operator=(const Span& other) noexcept = default;
friend bool operator==(const Span& a, const Span& b) {
if (a.size() != b.size()) return false;
template <typename U>
friend bool operator!=(const Span& a, const Span<U>& b) {
if (a.size() != b.size()) {
return true;
}
for (size_type i = 0; i < a.size(); ++i) {
if (a[i] != b[i]) {
return false;
return true;
}
}
return true;
return false;
}
friend bool operator!=(const Span& a, const Span& b) { return !(a == b); }
template <typename U>
friend bool operator==(const Span& a, const Span<U>& b) {
return !(a != b);
}
private:
T* data_;
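
The rewrite above makes the element-wise comparison heterogeneous: spans over different but comparable element types can be compared, which is what `AclNet::ReshapeStatic` relies on when checking runtime shapes against `aclmdlIODims`. A minimal usage sketch (the include path is an assumption):

```cpp
#include <cstdint>
#include "mmdeploy/core/mpl/span.h"  // assumed header location

using mmdeploy::Span;

// TensorShape values vs. model dims: different element types, one comparison.
bool ShapesMatch(Span<const int64_t> expected, Span<int64_t> actual) {
  return expected == actual;  // routed through the templated operator!=
}
```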


@ -115,9 +115,9 @@ Result<void> Tensor::CopyFrom(const Tensor& tensor, Stream stream) {
if (!stream) {
auto device = desc_.device.is_device() ? desc_.device : tensor.desc().device;
auto default_stream = Stream::GetDefault(device);
OUTCOME_TRY(default_stream.Copy(tensor.buffer(), buffer_));
OUTCOME_TRY(default_stream.Copy(tensor.buffer(), buffer_, tensor.byte_size()));
} else {
OUTCOME_TRY(stream.Copy(tensor.buffer(), buffer_));
OUTCOME_TRY(stream.Copy(tensor.buffer(), buffer_, tensor.byte_size()));
}
return success();
}
@ -141,9 +141,9 @@ Result<void> Tensor::CopyTo(Tensor& tensor, Stream stream) const {
if (!stream) {
Device device = desc_.device.is_device() ? desc_.device : tensor.desc().device;
Stream default_stream = Stream::GetDefault(device);
return default_stream.Copy(buffer_, tensor.buffer());
return default_stream.Copy(buffer_, tensor.buffer(), byte_size());
} else {
return stream.Copy(buffer_, tensor.buffer());
return stream.Copy(buffer_, tensor.buffer(), byte_size());
}
}
@ -158,9 +158,9 @@ Result<void> Tensor::CopyFrom(void* host_ptr, Stream stream) {
Allocate();
if (!stream) {
auto default_stream = Stream::GetDefault(desc_.device);
return default_stream.Copy(host_ptr, buffer_, buffer_.GetSize());
return default_stream.Copy(host_ptr, buffer_, byte_size());
} else {
return stream.Copy(host_ptr, buffer_, buffer_.GetSize());
return stream.Copy(host_ptr, buffer_, byte_size());
}
}
@ -174,9 +174,9 @@ Result<void> Tensor::CopyTo(void* host_ptr, Stream stream) const {
}
if (!stream) {
auto default_stream = Stream::GetDefault(desc_.device);
return default_stream.Copy(buffer_, host_ptr, buffer_.GetSize());
return default_stream.Copy(buffer_, host_ptr, byte_size());
} else {
return stream.Copy(buffer_, host_ptr, buffer_.GetSize());
return stream.Copy(buffer_, host_ptr, byte_size());
}
}
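
A likely rationale for switching these copies to `byte_size()`, sketched: with the Ascend backend a tensor's backing `Buffer` is allocated once at the maximum shape and reused across reshapes, so the live extent derived from the current shape can be smaller than `buffer_.GetSize()`, and copying the whole allocation would overrun a destination sized by shape. Hypothetical illustration, not part of the PR:

```cpp
#include <cstddef>

struct TensorView {
  size_t allocation_bytes;  // buffer_.GetSize(): capacity, fixed at max shape
  size_t shape_bytes;       // byte_size(): current shape x element size
};

size_t BytesToCopy(const TensorView& t) {
  return t.shape_bytes;  // copy the live extent, never the full capacity
}
```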


@ -5,3 +5,7 @@ add_subdirectory(cpu)
if ("cuda" IN_LIST MMDEPLOY_TARGET_DEVICES)
add_subdirectory(cuda)
endif ()
if ("acl" IN_LIST MMDEPLOY_TARGET_BACKENDS)
add_subdirectory(acl)
endif ()


@ -0,0 +1,7 @@
# Copyright (c) OpenMMLab. All rights reserved.
project(mmdeploy_acl_device)
file(GLOB_RECURSE SRCS "*.cpp")
mmdeploy_add_module(${PROJECT_NAME} "${SRCS}")


@ -0,0 +1,14 @@
// Copyright (c) OpenMMLab. All rights reserved.
#include "mmdeploy/core/device_impl.h"
namespace mmdeploy {
class AclPlatformRegisterer {
public:
AclPlatformRegisterer() { gPlatformRegistry().AddAlias("npu", "cpu"); }
};
AclPlatformRegisterer g_acl_platform_registerer;
} // namespace mmdeploy
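
A minimal sketch of what this alias enables (hypothetical `main.cpp`; assumes the ACL device module is linked in). Deploy configs can name the device `"npu"` while tensors stay in host memory and the existing cpu platform does the work:

```cpp
#include "mmdeploy/core/device.h"

int main() {
  // "npu" is resolved through PlatformRegistry's alias table to "cpu",
  // so both Devices end up on the same platform id.
  mmdeploy::Device npu("npu");
  mmdeploy::Device cpu("cpu");
  return npu.platform_id() == cpu.platform_id() ? 0 : 1;
}
```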


@ -105,7 +105,7 @@ Result<void> CpuPlatformImpl::CopyImpl(const void* src, void* dst, size_t src_si
task();
return success();
}
if (st.GetDevice() != Device(0, 0)) {
if (st.GetDevice().platform_id() != 0) {
return Status(eInvalidArgument);
}
auto cpu_stream = static_cast<CpuStreamImpl*>(st.GetNative());
@ -126,6 +126,7 @@ Result<void> CpuPlatformImpl::Copy(const void* host_ptr, Buffer dst, size_t size
}
return CopyImpl(host_ptr, dst_ptr, size, dst.GetSize(), 0, dst_offset, size, stream);
}
Result<void> CpuPlatformImpl::Copy(Buffer src, void* host_ptr, size_t size, size_t src_offset,
Stream stream) {
auto src_ptr = src.GetNative();
@ -145,7 +146,7 @@ Result<void> CpuPlatformImpl::Copy(Buffer src, Buffer dst, size_t size, size_t s
return Status(eInvalidArgument);
}
auto device = src.GetDevice();
if (device.platform_id() != 0 || device != dst.GetDevice()) {
if (device.platform_id() != 0 || device.platform_id() != dst.GetDevice().platform_id()) {
return Status(eInvalidArgument);
}
return CopyImpl(src_ptr, dst_ptr, src.GetSize(), dst.GetSize(), src_offset, dst_offset, size,


@ -26,6 +26,10 @@ if ("snpe" IN_LIST MMDEPLOY_TARGET_BACKENDS)
add_subdirectory(snpe)
endif ()
if ("acl" IN_LIST MMDEPLOY_TARGET_BACKENDS)
add_subdirectory(acl)
endif ()
if ("torchscript" IN_LIST MMDEPLOY_TARGET_BACKENDS)
add_subdirectory(torchscript)
endif ()


@ -0,0 +1,14 @@
# Copyright (c) OpenMMLab. All rights reserved.
project(mmdeploy_acl_net)
if ("acl" IN_LIST MMDEPLOY_TARGET_BACKENDS)
if (NOT DEFINED ASCEND_TOOLKIT_HOME)
set(ASCEND_TOOLKIT_HOME $ENV{ASCEND_TOOLKIT_HOME})
endif ()
mmdeploy_add_module(${PROJECT_NAME} acl_net.cpp)
target_include_directories(${PROJECT_NAME} PRIVATE
$<BUILD_INTERFACE:${ASCEND_TOOLKIT_HOME}/runtime/include>)
target_link_libraries(${PROJECT_NAME} PRIVATE
$<BUILD_INTERFACE:${ASCEND_TOOLKIT_HOME}/runtime/lib64/stub/libascendcl.so>)
endif ()


@ -0,0 +1,659 @@
// Copyright (c) OpenMMLab. All rights reserved.
#include "mmdeploy/net/acl/acl_net.h"
#include "mmdeploy/core/logger.h"
#include "mmdeploy/core/model.h"
#include "mmdeploy/core/utils/formatter.h"
std::ostream& operator<<(std::ostream& os, const aclmdlIODims& dims) {
os << dims.name << " [";
for (int i = 0; i < dims.dimCount; ++i) {
os << (i ? ", " : "") << dims.dims[i];
}
os << "]";
return os;
}
std::ostream& operator<<(std::ostream& os, const aclmdlBatch& batch) {
os << "batch [";
for (int i = 0; i < batch.batchCount; ++i) {
os << (i ? ", " : "") << batch.batch[i];
}
os << "]";
return os;
}
std::ostream& operator<<(std::ostream& os, const aclmdlHW& hw) {
os << "HW [";
for (int i = 0; i < hw.hwCount; ++i) {
os << (i ? ", " : "") << "(" << hw.hw[i][0] << ", " << hw.hw[i][1] << ")";
}
os << "]";
return os;
}
namespace mmdeploy {
namespace {
inline Result<void> _m(aclError ec, SourceLocation loc = SourceLocation::current()) {
if (ec == ACL_SUCCESS) {
return success();
} else {
return Status(eFail, loc);
}
}
template <typename T>
inline Result<T*> _p(T* ptr, SourceLocation loc = SourceLocation::current()) {
if (ptr) {
return ptr;
} else {
return Status(eFail, loc);
}
}
struct Context {
Context() {
std::lock_guard lock{mutex_};
if (ref_count_++ != 0) {
return;
}
auto ret = aclInit(nullptr);
if (ret == ACL_SUCCESS) {
MMDEPLOY_INFO("ACL initialized.");
owned_acl_ = true;
} else if (ret == ACL_ERROR_REPEAT_INITIALIZE) {
MMDEPLOY_INFO("ACL has already been initialized.");
} else {
MMDEPLOY_ERROR("aclInit() failed: {}", ret);
assert(ret == 0);
}
}
~Context() {
std::lock_guard lock{mutex_};
if (--ref_count_ != 0) {
return;
}
// skip aclFinalize if aclInit is not successfully called by us.
if (owned_acl_) {
auto ret = aclFinalize();
if (ret == ACL_SUCCESS) {
MMDEPLOY_INFO("ACL finalized.");
owned_acl_ = false;
} else if (ret == ACL_ERROR_REPEAT_FINALIZE) {
MMDEPLOY_INFO("ACL has already been finalized.");
} else {
MMDEPLOY_ERROR("aclFinalize() failed: {}", ret);
}
}
}
static bool owned_acl_;
static int ref_count_;
static std::mutex mutex_;
};
bool Context::owned_acl_ = false;
int Context::ref_count_ = 0;
std::mutex Context::mutex_{};
} // namespace
AclNet::~AclNet() {
auto dtor = [&]() -> Result<void> {
auto n_inputs = aclmdlGetDatasetNumBuffers(input_dataset_);
for (int i = 0; i < n_inputs; ++i) {
auto buffer = aclmdlGetDatasetBuffer(input_dataset_, i);
auto data = aclGetDataBufferAddr(buffer);
OUTCOME_TRY(_m(aclrtFree(data)));
}
input_tensor_.clear();
OUTCOME_TRY(_m(aclmdlDestroyDataset(input_dataset_)));
auto n_outputs = aclmdlGetDatasetNumBuffers(output_dataset_);
for (int i = 0; i < n_outputs; ++i) {
auto buffer = aclmdlGetDatasetBuffer(output_dataset_, i);
auto data = aclGetDataBufferAddr(buffer);
OUTCOME_TRY(_m(aclrtFree(data)));
}
output_tensor_.clear();
OUTCOME_TRY(_m(aclmdlDestroyDataset(output_dataset_)));
OUTCOME_TRY(_m(aclmdlDestroyDesc(model_desc_)));
OUTCOME_TRY(_m(aclmdlUnload(model_id_)));
return success();
};
if (auto r = dtor(); !r) {
MMDEPLOY_ERROR("uninit failed: {}", r.error().message().c_str());
}
}
namespace {
Result<DataType> FromAclDataType(aclDataType data_type) {
switch (data_type) {
case ACL_FLOAT:
return DataType::kFLOAT;
case ACL_FLOAT16:
return DataType::kHALF;
case ACL_INT8:
return DataType::kINT8;
case ACL_INT32:
return DataType::kINT32;
case ACL_INT64:
return DataType::kINT64;
default:
return Status(eNotSupported);
}
}
Result<aclDataType> ToAclDataType(DataType data_type) {
switch (data_type) {
case DataType::kFLOAT:
return ACL_FLOAT;
case DataType::kHALF:
return ACL_FLOAT16;
case DataType::kINT8:
return ACL_INT8;
case DataType::kINT32:
return ACL_INT32;
case DataType::kINT64:
return ACL_INT64;
default:
return Status(eNotSupported);
}
}
Result<TensorDesc> ToTensorDesc(const aclmdlIODims& dims, aclDataType data_type) {
auto extract_name = [](const std::string& name) {
if (auto pos = name.find_last_of(':'); pos != std::string::npos) {
return name.substr(pos + 1);
} else {
return name;
}
};
OUTCOME_TRY(auto _data_type, FromAclDataType(data_type));
return TensorDesc{Device(0), _data_type,
TensorShape(&dims.dims[0], &dims.dims[0] + dims.dimCount),
extract_name(dims.name)};
}
Result<size_t> GetByteSize(const aclmdlIODims& dims, aclDataType data_type) {
size_t byte_size = aclDataTypeSize(data_type);
for (int i = 0; i < dims.dimCount; ++i) {
if (dims.dims[i] < 0) {
return Status(eInvalidArgument);
}
byte_size *= dims.dims[i];
}
return byte_size;
}
} // namespace
// all dims must be fixed
auto AclNet::CreateBuffers(const aclmdlIODims& dims, aclDataType data_type) -> Result<Buffers> {
OUTCOME_TRY(auto byte_size, GetByteSize(dims, data_type));
Buffers pair{};
void* dev_ptr{};
OUTCOME_TRY(_m(aclrtMalloc(&dev_ptr, byte_size, ACL_MEM_MALLOC_HUGE_FIRST)));
OUTCOME_TRY(_m(aclrtMemset(dev_ptr, byte_size, 0, byte_size)));
OUTCOME_TRY(pair.device_buffer, _p(aclCreateDataBuffer(dev_ptr, byte_size)));
OUTCOME_TRY(auto desc, ToTensorDesc(dims, data_type));
void* host_ptr{};
OUTCOME_TRY(_m(aclrtMallocHost(&host_ptr, byte_size)));
memset(host_ptr, 0, byte_size);
pair.host_tensor =
Tensor(desc, std::shared_ptr<void>(host_ptr, [](void* p) { aclrtFreeHost(p); }));
return pair;
}
auto AclNet::CreateBuffersDynamicBatchSize(aclmdlIODims dims, aclDataType data_type)
-> Result<Buffers> {
for (int i = 0; i < dims.dimCount; ++i) {
if (dims.dims[i] == -1) {
dims.dims[i] = dynamic_batch_size_.back();
}
}
return CreateBuffers(dims, data_type);
}
auto AclNet::CreateBuffersDynamicImageSize(int index, aclmdlIODims dims, aclDataType data_type)
-> Result<Buffers> {
aclmdlHW hw_desc{};
OUTCOME_TRY(_m(aclmdlGetDynamicHW(model_desc_, index, &hw_desc)));
if (hw_desc.hwCount > 0) {
auto& val = *std::max_element(hw_desc.hw, hw_desc.hw + hw_desc.hwCount,
[](auto u, auto v) { return u[0] * u[1] < v[0] * v[1]; });
int ptr = 0;
for (int i = 0; i < dims.dimCount; ++i) {
if (dims.dims[i] == -1) {
if (ptr == 2) {
return Status(eInvalidArgument);
}
dims.dims[i] = val[ptr++];
}
}
if (ptr != 2) {
return Status(eInvalidArgument);
}
}
return CreateBuffers(dims, data_type);
}
auto AclNet::CreateBuffersDynamicDims(int index, int dim_count, const aclmdlIODims& dims,
aclDataType data_type) -> Result<Buffers> {
int max_index = -1;
size_t max_value = 0;
aclmdlIODims max_shape{};
for (int j = 0; j < dynamic_input_dims_.size(); ++j) {
aclmdlIODims shape{};
strncpy(shape.name, dims.name, sizeof(shape.name));
shape.dimCount = dims.dimCount;
std::copy(dynamic_input_dims_[j].dims + dim_count,
dynamic_input_dims_[j].dims + dim_count + dims.dimCount, shape.dims);
OUTCOME_TRY(auto byte_size, GetByteSize(shape, data_type));
if (byte_size > max_value) {
max_index = j;
max_value = byte_size;
max_shape = shape;
}
}
if (max_index < 0) {
return Status(eInvalidArgument);
}
MMDEPLOY_INFO("max shape for input {}: {}", index, max_shape);
return CreateBuffers(max_shape, data_type);
}
Result<void> AclNet::ConfigDynamicShapes() {
aclError status = ACL_SUCCESS;
{
size_t dynamic_tensor_index{};
status = aclmdlGetInputIndexByName(model_desc_, ACL_DYNAMIC_TENSOR_NAME, &dynamic_tensor_index);
if (status == ACL_SUCCESS) {
dynamic_tensor_index_ = static_cast<int>(dynamic_tensor_index);
MMDEPLOY_INFO("dynamic tensor index: {}", dynamic_tensor_index);
}
}
if (dynamic_tensor_index_ >= 0) {
aclmdlBatch batch_desc{};
status = aclmdlGetDynamicBatch(model_desc_, &batch_desc);
if (status == ACL_SUCCESS && batch_desc.batchCount > 0) {
MMDEPLOY_INFO("{}, status = {}", batch_desc, status);
input_shape_type_ = kDynamicBatchSize;
dynamic_batch_size_.insert(dynamic_batch_size_.end(), batch_desc.batch,
batch_desc.batch + batch_desc.batchCount);
std::sort(dynamic_batch_size_.begin(), dynamic_batch_size_.end());
}
size_t dynamic_gear_count{0};
if (input_shape_type_ == kStatic) {
status = aclmdlGetInputDynamicGearCount(model_desc_, -1, &dynamic_gear_count);
dynamic_input_dims_.resize(dynamic_gear_count);
if (status == ACL_SUCCESS && dynamic_gear_count > 0) {
status = aclmdlGetInputDynamicDims(model_desc_, -1, dynamic_input_dims_.data(),
dynamic_gear_count);
for (const auto& dims : dynamic_input_dims_) {
MMDEPLOY_INFO("dynamic input dims: {}", dims);
}
input_shape_type_ = kDynamicDims;
} else {
input_shape_type_ = kDynamicImageSize;
}
}
}
return success();
}
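
Condensed, the detection order above is: no ACL dynamic tensor input means static shapes; otherwise batch gears win, then dim gears, and dynamic image size is the remaining case. A hypothetical standalone form:

```cpp
enum InputShapeType { kStatic, kDynamicBatchSize, kDynamicImageSize, kDynamicDims };

InputShapeType DetectShapeType(bool has_dynamic_tensor,  // ACL_DYNAMIC_TENSOR_NAME found
                               bool has_batch_gears,     // aclmdlGetDynamicBatch non-empty
                               bool has_dim_gears) {     // aclmdlGetInputDynamicDims non-empty
  if (!has_dynamic_tensor) return kStatic;
  if (has_batch_gears) return kDynamicBatchSize;
  if (has_dim_gears) return kDynamicDims;
  return kDynamicImageSize;
}
```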
Result<void> AclNet::CreateInputBuffers() {
input_dataset_ = aclmdlCreateDataset();
auto n_inputs = aclmdlGetNumInputs(model_desc_);
MMDEPLOY_INFO("n_inputs = {}, dynamic_tensor_index_ = {}", n_inputs, dynamic_tensor_index_);
int dim_count = 0;
for (int i = 0; i < n_inputs; ++i) {
if (i == dynamic_tensor_index_) {
void* data{};
auto input_len = aclmdlGetInputSizeByIndex(model_desc_, i);
OUTCOME_TRY(_m(aclrtMalloc(&data, input_len, ACL_MEM_MALLOC_HUGE_FIRST)));
OUTCOME_TRY(auto buffer, _p(aclCreateDataBuffer(data, input_len)));
OUTCOME_TRY(_m(aclmdlAddDatasetBuffer(input_dataset_, buffer)));
} else {
Buffers buffers{};
aclmdlIODims dims{};
OUTCOME_TRY(_m(aclmdlGetInputDims(model_desc_, i, &dims)));
input_dims_.push_back(dims);
auto data_type = aclmdlGetInputDataType(model_desc_, i);
input_data_type_.push_back(data_type);
MMDEPLOY_INFO("{}", dims);
switch (input_shape_type_) {
case kStatic: {
OUTCOME_TRY(buffers, CreateBuffers(dims, data_type));
break;
}
case kDynamicBatchSize: {
OUTCOME_TRY(buffers, CreateBuffersDynamicBatchSize(dims, data_type));
break;
}
case kDynamicImageSize: {
OUTCOME_TRY(buffers, CreateBuffersDynamicImageSize(i, dims, data_type));
break;
}
case kDynamicDims: {
OUTCOME_TRY(buffers, CreateBuffersDynamicDims(i, dim_count, dims, data_type));
break;
}
default:
return Status(eInvalidArgument);
}
OUTCOME_TRY(_m(aclmdlAddDatasetBuffer(input_dataset_, buffers.device_buffer)));
input_tensor_.push_back(std::move(buffers.host_tensor));
dim_count += dims.dimCount;
}
}
return success();
}
Result<void> AclNet::CreateOutputBuffers() {
output_dataset_ = aclmdlCreateDataset();
auto n_outputs = aclmdlGetNumOutputs(model_desc_);
std::vector<aclmdlIODims> output_dims;
for (int i = 0; i < n_outputs; ++i) {
aclmdlIODims dims{};
OUTCOME_TRY(_m(aclmdlGetOutputDims(model_desc_, i, &dims))); // return max dims
output_dims_.push_back(dims);
MMDEPLOY_INFO("{}", dims);
auto data_type = aclmdlGetOutputDataType(model_desc_, i);
output_data_type_.push_back(data_type);
OUTCOME_TRY(auto buffers, CreateBuffers(dims, data_type));
OUTCOME_TRY(_m(aclmdlAddDatasetBuffer(output_dataset_, buffers.device_buffer)));
output_tensor_.push_back(std::move(buffers.host_tensor));
}
return success();
}
Result<void> AclNet::Init(const Value& args) {
auto& context = args["context"];
cpu_stream_ = context["stream"].get<Stream>();
auto name = args["name"].get<std::string>();
auto model = context["model"].get<Model>();
device_id_ = context["device"].get<Device>().device_id();
acl_context_ = std::make_shared<Context>();
OUTCOME_TRY(auto config, model.GetModelConfig(name));
OUTCOME_TRY(auto binary, model.ReadFile(config.net));
OUTCOME_TRY(_m(aclrtSetDevice(device_id_)));
OUTCOME_TRY(_m(aclmdlLoadFromMem(binary.data(), binary.size(), &model_id_)));
model_desc_ = aclmdlCreateDesc();
OUTCOME_TRY(_m(aclmdlGetDesc(model_desc_, model_id_)));
// dynamic_tensor_index_
// input_shape_type_
// dynamic_batch_size_
// dynamic_input_dims_
if (auto r = ConfigDynamicShapes(); !r) {
MMDEPLOY_ERROR("Failed to config dynamic shapes");
return r.as_failure();
}
// input_dataset_
// input_data_type_
// input_dims_
// input_tensor_
if (auto r = CreateInputBuffers(); !r) {
MMDEPLOY_ERROR("Failed to create input buffers");
return r.as_failure();
}
// output_dataset_
// output_data_type_
// output_dims_
// output_tensor_
if (auto r = CreateOutputBuffers(); !r) {
MMDEPLOY_ERROR("Failed to create output buffers");
return r.as_failure();
}
return success();
}
Result<void> AclNet::Deinit() { return success(); }
Result<Span<Tensor>> AclNet::GetInputTensors() { return input_tensor_; }
Result<Span<Tensor>> AclNet::GetOutputTensors() { return output_tensor_; }
Result<void> AclNet::Reshape(Span<TensorShape> input_shapes) {
OUTCOME_TRY(_m(aclrtSetDevice(device_id_)));
// Sanity checks
if (input_shapes.size() != input_dims_.size()) {
MMDEPLOY_ERROR("inconsistent num inputs");
return Status(eInvalidArgument);
}
for (int i = 0; i < input_dims_.size(); ++i) {
if (input_shapes[i].size() != input_dims_[i].dimCount) {
MMDEPLOY_ERROR("inconsistent num of dims");
return Status(eInvalidArgument);
}
}
switch (input_shape_type_) {
case kStatic: {
OUTCOME_TRY(ReshapeStatic(input_shapes));
break;
}
case kDynamicBatchSize: {
OUTCOME_TRY(ReshapeDynamicBatchSize(input_shapes));
break;
}
case kDynamicImageSize: {
OUTCOME_TRY(ReshapeDynamicImageSize(input_shapes));
break;
}
case kDynamicDims: {
OUTCOME_TRY(ReshapeDynamicDims(input_shapes));
break;
}
default:
return Status(eInvalidArgument);
}
for (int i = 0; i < input_shapes.size(); ++i) {
auto buffer = input_tensor_[i].buffer();
auto desc = input_tensor_[i].desc();
desc.shape = input_shapes[i];
input_tensor_[i] = Tensor(std::move(desc), std::move(buffer));
}
for (int i = 0; i < output_dims_.size(); ++i) {
aclmdlIODims dims{};
OUTCOME_TRY(_m(aclmdlGetCurOutputDims(model_desc_, i, &dims)));
auto buffer = output_tensor_[i].buffer();
auto desc = output_tensor_[i].desc();
desc.shape = TensorShape(&dims.dims[0], &dims.dims[0] + dims.dimCount);
output_tensor_[i] = Tensor(std::move(desc), std::move(buffer));
}
return success();
}
Result<void> AclNet::ReshapeStatic(Span<TensorShape> input_shapes) {
for (int i = 0; i < input_dims_.size(); ++i) {
Span src(input_shapes[i]);
Span ref(input_dims_[i].dims, input_dims_[i].dimCount);
if (src != ref) {
MMDEPLOY_ERROR("Shape mismatch {} vs {}", src, ref);
return Status(eInvalidArgument);
}
}
return success();
}
Result<void> AclNet::ReshapeDynamicBatchSize(Span<TensorShape> input_shapes) {
int batch_size = -1;
for (int i = 0; i < input_dims_.size(); ++i) {
for (int j = 0; j < input_dims_[i].dimCount; ++j) {
if (input_dims_[i].dims[j] == -1) {
if (batch_size != -1 && batch_size != input_shapes[i][j]) {
// inconsistent batch size
return Status(eInvalidArgument);
}
batch_size = input_shapes[i][j];
}
}
}
if (batch_size < 0) {
MMDEPLOY_ERROR("unable to determine batch size");
return Status(eFail);
}
MMDEPLOY_INFO("batch size {} {}", batch_size, dynamic_tensor_index_);
auto index =
std::lower_bound(dynamic_batch_size_.begin(), dynamic_batch_size_.end(), batch_size) -
dynamic_batch_size_.begin();
  if (index == dynamic_batch_size_.size()) {
    MMDEPLOY_ERROR("Unsupported batch size: {}", batch_size);
    return Status(eInvalidArgument);
  }
// TODO: memset padding memory to avoid potential extra computation
OUTCOME_TRY(_m(aclmdlSetDynamicBatchSize(model_id_, input_dataset_, dynamic_tensor_index_,
dynamic_batch_size_[index])));
return success();
}
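
The `std::lower_bound` above picks the smallest configured gear that can hold the requested batch, since `dynamic_batch_size_` is kept sorted. A standalone sketch of that selection (hypothetical helper, not part of the PR):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

bool SelectBatchGear(const std::vector<size_t>& gears, size_t requested,
                     size_t* selected) {
  auto it = std::lower_bound(gears.begin(), gears.end(), requested);
  if (it == gears.end()) return false;  // no gear large enough
  *selected = *it;                      // e.g. gears {1,2,4,8}, requested 3 -> 4
  return true;
}
```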
Result<void> AclNet::ReshapeDynamicImageSize(Span<TensorShape> input_shapes) {
uint64_t hw[2];
bool found = false;
for (int i = 0; i < input_dims_.size(); ++i) {
uint64_t tmp[2];
int ptr = 0;
for (int j = 0; j < input_dims_[i].dimCount; ++j) {
if (input_dims_[i].dims[j] == -1) {
if (ptr == 2) {
MMDEPLOY_ERROR("dynamic HW size out of bounds: {}", input_dims_[i]);
return Status(eInvalidArgument);
} else {
tmp[ptr++] = input_shapes[i][j];
}
}
}
if (ptr && ptr != 2) {
MMDEPLOY_ERROR("Partially determined dynamic HW size: {}", input_dims_[i]);
return Status(eInvalidArgument);
}
if (ptr == 2) {
if (found) {
if (hw[0] != tmp[0] || hw[1] != tmp[1]) {
MMDEPLOY_ERROR("Inconsistent dynamic HW size: ({}, {}) vs ({}, {})", hw[0], hw[1], tmp[0],
tmp[1]);
return Status(eInvalidArgument);
}
} else {
found = true;
hw[0] = tmp[0];
hw[1] = tmp[1];
}
}
}
if (!found) {
MMDEPLOY_ERROR("Unable to determine image size");
return Status(eInvalidArgument);
}
MMDEPLOY_INFO("dynamic HW size ({}, {})", hw[0], hw[1]);
OUTCOME_TRY(
_m(aclmdlSetDynamicHWSize(model_id_, input_dataset_, dynamic_tensor_index_, hw[0], hw[1])));
return success();
}
Result<void> AclNet::ReshapeDynamicDims(Span<TensorShape> input_shapes) {
std::vector<int> match(dynamic_input_dims_.size(), 1);
aclmdlIODims dims{};
for (int i = 0; i < input_shapes.size(); ++i) {
const auto& shape = input_shapes[i];
for (int j = 0; j < shape.size(); ++j) {
if (input_dims_[i].dims[j] == -1) {
for (int k = 0; k < dynamic_input_dims_.size(); ++k) {
// disable profile when dims mismatch, except for the first dim (batch size)
if (j == 0 && shape[j] < dynamic_input_dims_[k].dims[dims.dimCount]) {
// pass
} else if (shape[j] != dynamic_input_dims_[k].dims[dims.dimCount]) {
match[k] = 0;
}
}
} else {
if (input_dims_[i].dims[j] != shape[j]) {
return Status(eNotSupported);
}
}
dims.dims[dims.dimCount++] = shape[j];
}
}
int dims_index = std::find(match.begin(), match.end(), 1) - match.begin();
if (dims_index == match.size()) {
MMDEPLOY_ERROR("Shape not supported: {}", dims);
return Status(eNotSupported);
}
// TODO: memset padding memory to avoid potential extra computation
OUTCOME_TRY(_m(aclmdlSetInputDynamicDims(model_id_, input_dataset_, dynamic_tensor_index_,
&dynamic_input_dims_[dims_index])));
return success();
}
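
A hypothetical standalone form of the matching above: flatten the runtime dims, compare only positions that were `-1` in the model, and let batch positions round up to the gear value (the real code tracks the flattened cursor via `dims.dimCount`):

```cpp
#include <cstdint>
#include <vector>

// Returns the index of the first compatible gear, or -1 if none fits.
int MatchGear(const std::vector<std::vector<int64_t>>& gears,  // flattened profiles
              const std::vector<int64_t>& runtime,             // flattened shapes
              const std::vector<bool>& dynamic,                // positions that were -1
              const std::vector<bool>& is_batch) {             // batch positions
  for (int k = 0; k < static_cast<int>(gears.size()); ++k) {
    bool ok = true;
    for (size_t p = 0; p < runtime.size() && ok; ++p) {
      if (!dynamic[p]) continue;                    // static dims validated elsewhere
      ok = is_batch[p] ? runtime[p] <= gears[k][p]  // batch may pad up to the gear
                       : runtime[p] == gears[k][p]; // other dims must match exactly
    }
    if (ok) return k;
  }
  return -1;
}
```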
Result<void> AclNet::Forward() {
OUTCOME_TRY(cpu_stream_.Wait());
OUTCOME_TRY(_m(aclrtSetDevice(device_id_)));
for (int i = 0; i < input_tensor_.size(); ++i) {
auto buffer = aclmdlGetDatasetBuffer(input_dataset_, i);
auto buffer_size = aclGetDataBufferSizeV2(buffer);
auto buffer_data = aclGetDataBufferAddr(buffer);
auto host_ptr = input_tensor_[i].data();
OUTCOME_TRY(_m(aclrtMemcpy(buffer_data, buffer_size, host_ptr, input_tensor_[i].byte_size(),
ACL_MEMCPY_HOST_TO_DEVICE)));
}
OUTCOME_TRY(_m(aclmdlExecute(model_id_, input_dataset_, output_dataset_)));
for (int i = 0; i < output_tensor_.size(); ++i) {
auto buffer = aclmdlGetDatasetBuffer(output_dataset_, i);
auto buffer_data = aclGetDataBufferAddr(buffer);
auto host_ptr = output_tensor_[i].data();
OUTCOME_TRY(_m(aclrtMemcpy(host_ptr, output_tensor_[i].byte_size(), buffer_data,
output_tensor_[i].byte_size(), ACL_MEMCPY_DEVICE_TO_HOST)));
}
return success();
}
Result<void> AclNet::ForwardAsync(Event* event) { return Status(eNotSupported); }
class AclNetCreator : public Creator<Net> {
public:
const char* GetName() const override { return "ascend"; }
int GetVersion() const override { return 0; }
std::unique_ptr<Net> Create(const Value& args) override {
try {
auto p = std::make_unique<AclNet>();
if (auto r = p->Init(args)) {
return p;
} else {
MMDEPLOY_ERROR("error creating AclNet: {}", r.error().message().c_str());
return nullptr;
}
} catch (const std::exception& e) {
MMDEPLOY_ERROR("unhandled exception when creating AclNet: {}", e.what());
return nullptr;
}
}
};
REGISTER_MODULE(Net, AclNetCreator);
} // namespace mmdeploy


@ -0,0 +1,70 @@
// Copyright (c) OpenMMLab. All rights reserved.
#ifndef MMDEPLOY_SRC_NET_ACL_ACL_NET_H_
#define MMDEPLOY_SRC_NET_ACL_ACL_NET_H_
#include "acl/acl.h"
#include "mmdeploy/core/net.h"
#include "mmdeploy/core/status_code.h"
namespace mmdeploy {
class AclNet : public Net {
public:
~AclNet() override;
Result<void> Init(const Value& cfg) override;
Result<void> Deinit() override;
Result<Span<Tensor>> GetInputTensors() override;
Result<Span<Tensor>> GetOutputTensors() override;
Result<void> Reshape(Span<TensorShape> input_shapes) override;
Result<void> Forward() override;
Result<void> ForwardAsync(Event* event) override;
private:
enum InputShapeType { kStatic, kDynamicBatchSize, kDynamicImageSize, kDynamicDims };
Result<void> ReshapeStatic(Span<TensorShape> input_shapes);
Result<void> ReshapeDynamicBatchSize(Span<TensorShape> input_shapes);
Result<void> ReshapeDynamicImageSize(Span<TensorShape> input_shapes);
Result<void> ReshapeDynamicDims(Span<TensorShape> input_shapes);
struct Buffers {
aclDataBuffer* device_buffer;
Tensor host_tensor;
};
Result<Buffers> CreateBuffers(const aclmdlIODims& dims, aclDataType data_type);
Result<Buffers> CreateBuffersDynamicBatchSize(aclmdlIODims dims, aclDataType data_type);
Result<Buffers> CreateBuffersDynamicImageSize(int index, aclmdlIODims dims,
aclDataType data_type);
Result<Buffers> CreateBuffersDynamicDims(int index, int dim_count, const aclmdlIODims& dims,
aclDataType data_type);
Result<void> ConfigDynamicShapes();
Result<void> CreateInputBuffers();
Result<void> CreateOutputBuffers();
std::shared_ptr<void> acl_context_;
Stream cpu_stream_;
int32_t device_id_{0};
uint32_t model_id_{(uint32_t)-1};
aclmdlDesc* model_desc_{nullptr};
int dynamic_tensor_index_{-1};
InputShapeType input_shape_type_{kStatic};
std::vector<size_t> dynamic_batch_size_;
std::vector<aclmdlIODims> dynamic_input_dims_;
aclmdlDataset* input_dataset_{nullptr};
aclmdlDataset* output_dataset_{nullptr};
std::vector<aclmdlIODims> input_dims_;
std::vector<aclmdlIODims> output_dims_;
std::vector<aclDataType> input_data_type_;
std::vector<aclDataType> output_data_type_;
std::vector<Tensor> input_tensor_;
std::vector<Tensor> output_tensor_;
};
} // namespace mmdeploy
#endif // MMDEPLOY_SRC_NET_ACL_ACL_NET_H_


@ -35,6 +35,7 @@ PadImpl::PadImpl(const Value& args) : TransformImpl(args) {
}
arg_.pad_to_square = args.value("pad_to_square", false);
arg_.padding_mode = args.value("padding_mode", std::string("constant"));
arg_.orientation_agnostic = args.value("orientation_agnostic", false);
}
Result<Value> PadImpl::Process(const Value& input) {
@ -58,9 +59,19 @@ Result<Value> PadImpl::Process(const Value& input) {
output["pad_fixed_size"].push_back(max_size);
output["pad_fixed_size"].push_back(max_size);
} else if (arg_.size[0] != 0 && arg_.size[1] != 0) {
padding = {0, 0, arg_.size[1] - width, arg_.size[0] - height};
output["pad_fixed_size"].push_back(arg_.size[0]);
output["pad_fixed_size"].push_back(arg_.size[1]);
if (arg_.orientation_agnostic) {
auto size_min = min(arg_.size[0], arg_.size[1]);
auto size_max = max(arg_.size[0], arg_.size[1]);
auto pad_h = width < height ? size_max : size_min;
auto pad_w = width < height ? size_min : size_max;
padding = {0, 0, pad_w - width, pad_h - height};
output["pad_fixed_size"].push_back(pad_h);
output["pad_fixed_size"].push_back(pad_w);
} else {
padding = {0, 0, arg_.size[1] - width, arg_.size[0] - height};
output["pad_fixed_size"].push_back(arg_.size[0]);
output["pad_fixed_size"].push_back(arg_.size[1]);
}
} else if (arg_.size_divisor != 1) {
auto pad_h = (height + arg_.size_divisor - 1) / arg_.size_divisor * arg_.size_divisor;
auto pad_w = (width + arg_.size_divisor - 1) / arg_.size_divisor * arg_.size_divisor;
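
The orientation-agnostic branch above reorients the configured `(h, w)` pair so the longer pad side follows the longer image side. A standalone sketch mirroring that logic (hypothetical helper, not part of the PR):

```cpp
#include <algorithm>
#include <utility>

std::pair<int, int> OrientedPadSize(int cfg_h, int cfg_w, int height, int width) {
  const int size_min = std::min(cfg_h, cfg_w);
  const int size_max = std::max(cfg_h, cfg_w);
  const bool portrait = width < height;
  const int pad_h = portrait ? size_max : size_min;
  const int pad_w = portrait ? size_min : size_max;
  return {pad_h, pad_w};  // e.g. cfg (800, 1344) on a 1000x600 image -> (1344, 800)
}
```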


@ -29,6 +29,7 @@ class MMDEPLOY_API PadImpl : public TransformImpl {
int size_divisor;
float pad_val;
bool pad_to_square;
bool orientation_agnostic;
std::string padding_mode;
};
using ArgType = struct pad_arg_t;


@ -238,6 +238,17 @@ export LD_LIBRARY_PATH=$Torch_DIR/lib:$LD_LIBRARY_PATH
</code></pre>
</td>
</tr>
<tr>
<td>Ascend</td>
<td>CANN</td>
<td>
1. Install CANN following the <a href="https://www.hiascend.com/document/detail/en/CANNCommunityEdition/51RC1alphaX/softwareinstall/instg/atlasdeploy_03_0002.html">official guide</a>.<br>
2. Set up the environment:
<pre><code>
export ASCEND_TOOLKIT_HOME="/usr/local/Ascend/ascend-toolkit/latest"
</code></pre>
</td>
</tr>
</tbody>
</table>


@ -36,6 +36,7 @@ Users can directly test the speed through [model profiling](../02-how-to-run/pro
<th align="center" colspan="5">TensorRT(ms)</th>
<th align="center" colspan="2">PPLNN(ms)</th>
<th align="center" colspan="2">ncnn(ms)</th>
<th align="center" colspan="1">Ascend(ms)</th>
</tr>
</thead>
<tbody>
@ -48,6 +49,7 @@ Users can directly test the speed through [model profiling](../02-how-to-run/pro
<td align="center" colspan="1">T4</td>
<td align="center" colspan="1">SnapDragon888</td>
<td align="center" colspan="1">Adreno660</td>
<td align="center" colspan="1">Ascend310</td>
</tr>
<tr>
<td align="center" colspan="1">fp32</td>
@ -59,6 +61,7 @@ Users can directly test the speed through [model profiling](../02-how-to-run/pro
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnet/resnet50_b32x8_imagenet.py"> ResNet </a></td>
@ -72,6 +75,7 @@ Users can directly test the speed through [model profiling](../02-how-to-run/pro
<td align="center">1.30</td>
<td align="center">33.91</td>
<td align="center">25.93</td>
<td align="center">2.49</td>
</tr>
<tr>
<td align="center"> <a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext/resnext50_32x4d_b32x8_imagenet.py"> ResNeXt </a></td>
@ -85,6 +89,7 @@ Users can directly test the speed through [model profiling](../02-how-to-run/pro
<td align="center">1.36</td>
<td align="center">133.44</td>
<td align="center">69.38</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"> <a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/seresnet/seresnet50_b32x8_imagenet.py"> SE-ResNet </a></td>
@ -98,6 +103,7 @@ Users can directly test the speed through [model profiling](../02-how-to-run/pro
<td align="center">1.91</td>
<td align="center">107.84</td>
<td align="center">80.85</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py"> ShuffleNetV2 </a></td>
@ -111,6 +117,7 @@ Users can directly test the speed through [model profiling](../02-how-to-run/pro
<td align="center">4.69</td>
<td align="center">9.55</td>
<td align="center">10.66</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
@ -419,6 +426,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<th align="center">ONNX Runtime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">Ascend</th>
</tr>
</thead>
<tbody>
@ -432,6 +440,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnet/resnet18_b32x8_imagenet.py">ResNet-18</a></td>
@ -443,6 +452,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">69.86</td>
<td align="center">69.86</td>
<td align="center">69.86</td>
<td align="center">69.91</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -453,6 +463,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">89.33</td>
<td align="center">89.38</td>
<td align="center">89.34</td>
<td align="center">89.43</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext/resnext50_32x4d_b32x8_imagenet.py">ResNeXt-50</a></td>
@ -464,6 +475,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">-</td>
<td align="center">77.78</td>
<td align="center">77.89</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -474,6 +486,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">-</td>
<td align="center">93.64</td>
<td align="center">93.65</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext/resnext50_32x4d_b32x8_imagenet.py">SE-ResNet-50</a></td>
@ -485,6 +498,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">77.75</td>
<td align="center">77.63</td>
<td align="center">77.73</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -495,6 +509,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">93.83</td>
<td align="center">93.72</td>
<td align="center">93.84</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v1/shufflenet_v1_1x_b64x16_linearlr_bn_nowd_imagenet.py">ShuffleNetV1 1.0x</a></td>
@ -506,6 +521,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">68.13</td>
<td align="center">67.71</td>
<td align="center">68.11</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -516,6 +532,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">87.81</td>
<td align="center">87.58</td>
<td align="center">87.80</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py">ShuffleNetV2 1.0x</a></td>
@ -527,6 +544,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">69.54</td>
<td align="center">69.10</td>
<td align="center">69.54</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -537,6 +555,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">88.91</td>
<td align="center">88.58</td>
<td align="center">88.92</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/mobilenet_v2/mobilenet_v2_b32x8_imagenet.py">MobileNet V2</a></td>
@ -548,6 +567,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">71.87</td>
<td align="center">70.91</td>
<td align="center">71.84</td>
<td align="center">71.87</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -558,6 +578,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">90.40</td>
<td align="center">89.85</td>
<td align="center">90.41</td>
<td align="center">90.42</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-base-p16_ft-64xb64_in1k-384.py">Vision Transformer</a></td>
@ -569,6 +590,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">85.42</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">85.43</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -579,6 +601,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">97.76</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">97.77</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/blob/master/configs/swin_transformer/swin-tiny_16xb64_in1k.py">Swin Transformer</a></td>
@ -614,6 +637,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<th align="center">ONNXRuntime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">Ascend</th>
</tr>
</thead>
<tbody>
@ -629,6 +653,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/yolo/yolov3_d53_320_273e_coco.py">YOLOV3</a></td>
@ -642,6 +667,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">33.5</td>
<td align="center">33.5</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/ssd/ssd300_coco.py">SSD</a></td>
@ -655,6 +681,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">25.5</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/retinanet/retinanet_r50_fpn_1x_coco.py">RetinaNet</a></td>
@ -668,6 +695,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">36.4</td>
<td align="center">36.3</td>
<td align="center">36.5</td>
<td align="center">36.4</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py">FCOS</a></td>
@ -681,6 +709,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">36.5</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/fsaf/fsaf_r50_fpn_1x_coco.py">FSAF</a></td>
@ -694,6 +723,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">37.4</td>
<td align="center">37.2</td>
<td align="center">37.4</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox/yolox_s_8x8_300e_coco.py">YOLOX</a></td>
@ -707,6 +737,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">40.3</td>
<td align="center">29.3</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py">Faster R-CNN</a></td>
@ -720,6 +751,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">37.3</td>
<td align="center">37.1</td>
<td align="center">37.3</td>
<td align="center">37.2</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/atss/atss_r50_fpn_1x_coco.py">ATSS</a></td>
@ -733,6 +765,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">39.4</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn/cascade_rcnn_r50_caffe_fpn_1x_coco.py">Cascade R-CNN</a></td>
@ -746,6 +779,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">40.4</td>
<td align="center">-</td>
<td align="center">40.4</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/gfl/gfl_r50_fpn_1x_coco.py">GFL</a></td>
@ -759,6 +793,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">40.0</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/reppoints/reppoints_moment_r50_fpn_1x_coco.py">RepPoints</a></td>
@ -772,6 +807,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/detr/detr_r50_8x2_150e_coco.py">DETR</a></td>
@ -798,6 +834,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">38.1</td>
<td align="center">-</td>
<td align="center">38.0</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">mask AP</td>
@ -808,6 +845,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">33.7</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmdetection/blob/master/configs/swin/mask_rcnn_swin-t-p4-w7_fpn_1x_coco.py">Swin-Transformer</a></td>
@ -821,6 +859,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">37.7</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">mask AP</td>
@ -831,6 +870,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">35.4</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
@ -1216,6 +1256,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<th align="center">ONNXRuntime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">Ascend</th>
</tr>
</thead>
<tbody>
@ -1230,6 +1271,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fcn/fcn_r50-d8_512x1024_40k_cityscapes.py">FCN</a></td>
@ -1242,6 +1284,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">72.35</td>
<td align="center">74.19</td>
<td align="center">72.35</td>
<td align="center">72.35</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py">PSPNet</a></td>
@ -1254,6 +1297,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">78.24</td>
<td align="center">77.97</td>
<td align="center">78.09</td>
<td align="center">78.67</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3/deeplabv3_r50-d8_512x1024_40k_cityscapes.py">deeplabv3</a></td>
@ -1266,6 +1310,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">79.12</td>
<td align="center">78.96</td>
<td align="center">79.12</td>
<td align="center">79.06</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_40k_cityscapes.py">deeplabv3+</a></td>
@ -1278,6 +1323,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">79.60</td>
<td align="center">79.43</td>
<td align="center">79.60</td>
<td align="center">79.51</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastscnn/fast_scnn_lr0.12_8x4_160k_cityscapes.py">Fast-SCNN</a></td>
@ -1290,6 +1336,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">70.92</td>
<td align="center">66.00</td>
<td align="center">70.92</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/unet/fcn_unet_s5-d16_4x4_512x1024_160k_cityscapes.py">UNet</a></td>
@ -1302,6 +1349,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">69.10</td>
<td align="center">68.95</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ann/ann_r50-d8_512x1024_40k_cityscapes.py">ANN</a></td>
@ -1314,6 +1362,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">77.32</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/apcnet/apcnet_r50-d8_512x1024_40k_cityscapes.py">APCNet</a></td>
@ -1326,6 +1375,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">77.32</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv1/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes.py">BiSeNetV1</a></td>
@ -1338,6 +1388,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">74.43</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv2/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes.py">BiSeNetV2</a></td>
@ -1350,6 +1401,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">73.21</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/cgnet/cgnet_512x1024_60k_cityscapes.py">CGNet</a></td>
@ -1362,6 +1414,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">68.27</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/emanet/emanet_r50-d8_512x1024_80k_cityscapes.py">EMANet</a></td>
@ -1374,6 +1427,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">77.6</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/encnet/encnet_r50-d8_512x1024_40k_cityscapes.py">EncNet</a></td>
@ -1386,6 +1440,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">75.66</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/erfnet/erfnet_fcn_4x4_512x1024_160k_cityscapes.py">ERFNet</a></td>
@ -1398,6 +1453,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">71.07</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastfcn/fastfcn_r50-d32_jpu_aspp_512x1024_80k_cityscapes.py">FastFCN</a></td>
@ -1410,6 +1466,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">79.12</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/gcnet/gcnet_r50-d8_512x1024_40k_cityscapes.py">GCNet</a></td>
@ -1422,6 +1479,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">77.69</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/icnet/icnet_r18-d8_832x832_80k_cityscapes.py">ICNet</a></td>
@ -1434,6 +1492,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">76.36</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/isanet/isanet_r50-d8_512x1024_40k_cityscapes.py">ISANet</a></td>
@ -1446,6 +1505,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">78.49</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ocrnet/ocrnet_hr18s_512x1024_40k_cityscapes.py">OCRNet</a></td>
@ -1458,6 +1518,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">73.67</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/point_rend/pointrend_r50_512x1024_80k_cityscapes.py">PointRend</a></td>
@ -1470,6 +1531,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">76.42</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/sem_fpn/fpn_r50_512x1024_80k_cityscapes.py">Semantic FPN</a></td>
@ -1482,6 +1544,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">74.52</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/stdc/stdc1_in1k-pre_512x1024_80k_cityscapes.py">STDC</a></td>
@ -1494,6 +1557,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">75.10</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/stdc/stdc2_in1k-pre_512x1024_80k_cityscapes.py">STDC</a></td>
@ -1506,6 +1570,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">77.17</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/upernet/upernet_r50_512x1024_40k_cityscapes.py">UPerNet</a></td>
@ -1518,6 +1583,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">77.18</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/blob/master/configs/segmenter/segmenter_vit-s_linear_8x1_512x512_160k_ade20k.py">Segmenter</a></td>
@ -1530,6 +1596,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](../
<td align="center">43.34</td>
<td align="center">43.35</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
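The Ascend column above is filled in with the same evaluation flow used for the other backends. As a minimal sketch of that flow — assuming the usual `tools/test.py` entry point, with `${DEPLOY_CFG_PATH}`, `${MODEL_CFG_PATH}` and `${BACKEND_MODEL_FILES}` as placeholders for your own deploy config, model config and converted model — a table entry can be reproduced with:

```shell
# Evaluate a converted backend model against the codebase metric (mIoU here,
# matching the segmentation table above). All ${...} values are placeholders.
python tools/test.py \
    ${DEPLOY_CFG_PATH} \
    ${MODEL_CFG_PATH} \
    --model ${BACKEND_MODEL_FILES} \
    --metrics mIoU \
    --device cpu
```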


@ -2,82 +2,82 @@
The table below lists the models that are guaranteed to be exportable to other backends.
| Model | Codebase | TorchScript | OnnxRuntime | TensorRT | ncnn | PPLNN | OpenVINO | Model config |
| :-------------------------- | :--------------- | :---------: | :---------: | :------: | :--: | :---: | :------: | :---------------------------------------------------------------------------------------------: |
| RetinaNet | MMDetection | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/retinanet) |
| Faster R-CNN | MMDetection | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn) |
| YOLOv3 | MMDetection | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yolo) |
| YOLOX | MMDetection | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox) |
| FCOS | MMDetection | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/fcos) |
| FSAF | MMDetection | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/fsaf) |
| Mask R-CNN | MMDetection | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/mask_rcnn) |
| SSD[\*](#note) | MMDetection | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/ssd) |
| FoveaBox | MMDetection | Y | Y | N | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/foveabox) |
| ATSS | MMDetection | N | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/atss) |
| GFL | MMDetection | N | Y | Y | N | ? | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/gfl) |
| Cascade R-CNN | MMDetection | N | Y | Y | N | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn) |
| Cascade Mask R-CNN | MMDetection | N | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn) |
| Swin Transformer[\*](#note) | MMDetection | N | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/swin) |
| VFNet | MMDetection | N | N | N | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/vfnet) |
| RepPoints | MMDetection | N | N | Y | N | ? | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/reppoints) |
| DETR | MMDetection | N | Y | Y | N | ? | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/detr) |
| ResNet | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnet) |
| ResNeXt | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext) |
| SE-ResNet | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/seresnet) |
| MobileNetV2 | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/mobilenet_v2) |
| ShuffleNetV1 | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v1) |
| ShuffleNetV2 | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v2) |
| VisionTransformer | MMClassification | Y | Y | Y | Y | ? | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/vision_transformer) |
| SwinTransformer | MMClassification | Y | Y | Y | N | ? | N | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/swin_transformer) |
| FCN | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fcn) |
| PSPNet[\*static](#note) | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/pspnet) |
| DeepLabV3 | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3) |
| DeepLabV3+ | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3plus) |
| Fast-SCNN[\*static](#note) | MMSegmentation | Y | Y | Y | N | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastscnn) |
| UNet | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/unet) |
| ANN[\*](#note) | MMSegmentation | Y | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ann) |
| APCNet | MMSegmentation | Y | Y | Y | Y | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/apcnet) |
| BiSeNetV1 | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv1) |
| BiSeNetV2 | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv2) |
| CGNet | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/cgnet) |
| DMNet | MMSegmentation | ? | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/dmnet) |
| DNLNet | MMSegmentation | ? | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/dnlnet) |
| EMANet | MMSegmentation | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/emanet) |
| EncNet | MMSegmentation | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/encnet) |
| ERFNet | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/erfnet) |
| FastFCN | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastfcn) |
| GCNet | MMSegmentation | Y | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/gcnet) |
| ICNet[\*](#note) | MMSegmentation | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/icnet) |
| ISANet[\*static](#note) | MMSegmentation | N | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/isanet) |
| NonLocal Net | MMSegmentation | ? | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/nonlocal_net) |
| OCRNet | MMSegmentation | ? | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ocrnet) |
| PointRend | MMSegmentation | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/point_rend) |
| Semantic FPN | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/sem_fpn) |
| STDC | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/stdc) |
| UPerNet[\*](#note) | MMSegmentation | ? | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/upernet) |
| DANet | MMSegmentation | ? | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/danet) |
| Segmenter[\*static](#note) | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/segmenter) |
| SRCNN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srcnn) |
| ESRGAN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/esrgan) |
| SRGAN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srresnet_srgan) |
| SRResNet | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srresnet_srgan) |
| Real-ESRGAN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/real_esrgan) |
| EDSR | MMEditing | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/edsr) |
| RDN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/rdn) |
| DBNet | MMOCR | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/dbnet) |
| PANet | MMOCR | Y | Y | Y | Y | ? | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/panet) |
| DBNet | MMOCR | Y | Y | Y | Y | ? | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/psenet) |
| CRNN | MMOCR | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/crnn) |
| SAR | MMOCR | N | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/sar) |
| SATRN | MMOCR | Y | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/satrn) |
| HRNet | MMPose | N | Y | Y | Y | N | Y | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#hrnet-cvpr-2019) |
| MSPN | MMPose | N | Y | Y | Y | N | Y | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#mspn-arxiv-2019) |
| LiteHRNet | MMPose | N | Y | Y | N | N | Y | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#litehrnet-cvpr-2021) |
| PointPillars | MMDetection3d | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/pointpillars) |
| CenterPoint (pillar) | MMDetection3d | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/centerpoint) |
| RotatedRetinaNet | RotatedDetection | N | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/rotated_retinanet/README.md) |
| Oriented RCNN | RotatedDetection | N | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/oriented_rcnn/README.md) |
| Gliding Vertex | RotatedDetection | N | N | Y | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/gliding_vertex/README.md) |
| Model | Codebase | TorchScript | OnnxRuntime | TensorRT | ncnn | PPLNN | OpenVINO | Ascend | Model config |
| :-------------------------- | :--------------- | :---------: | :---------: | :------: | :--: | :---: | :------: | :----: | :---------------------------------------------------------------------------------------------: |
| RetinaNet | MMDetection | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/retinanet) |
| Faster R-CNN | MMDetection | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn) |
| YOLOv3 | MMDetection | Y | Y | Y | Y | N | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yolo) |
| YOLOX | MMDetection | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox) |
| FCOS | MMDetection | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/fcos) |
| FSAF | MMDetection | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/fsaf) |
| Mask R-CNN | MMDetection | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/mask_rcnn) |
| SSD[\*](#note) | MMDetection | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/ssd) |
| FoveaBox | MMDetection | Y | Y | N | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/foveabox) |
| ATSS | MMDetection | N | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/atss) |
| GFL | MMDetection | N | Y | Y | N | ? | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/gfl) |
| Cascade R-CNN | MMDetection | N | Y | Y | N | Y | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn) |
| Cascade Mask R-CNN | MMDetection | N | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn) |
| Swin Transformer[\*](#note) | MMDetection | N | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/swin) |
| VFNet | MMDetection | N | N | N | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/vfnet) |
| RepPoints | MMDetection | N | N | Y | N | ? | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/reppoints) |
| DETR | MMDetection | N | Y | Y | N | ? | N | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/detr) |
| ResNet | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnet) |
| ResNeXt | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext) |
| SE-ResNet | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/seresnet) |
| MobileNetV2 | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/mobilenet_v2) |
| ShuffleNetV1 | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v1) |
| ShuffleNetV2 | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v2) |
| VisionTransformer | MMClassification | Y | Y | Y | Y | ? | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/vision_transformer) |
| SwinTransformer | MMClassification | Y | Y | Y | N | ? | N | N | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/swin_transformer) |
| FCN | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fcn) |
| PSPNet[\*static](#note) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/pspnet) |
| DeepLabV3 | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3) |
| DeepLabV3+ | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3plus) |
| Fast-SCNN[\*static](#note) | MMSegmentation | Y | Y | Y | N | Y | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastscnn) |
| UNet | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/unet) |
| ANN[\*](#note) | MMSegmentation | Y | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ann) |
| APCNet | MMSegmentation | Y | Y | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/apcnet) |
| BiSeNetV1 | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv1) |
| BiSeNetV2 | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv2) |
| CGNet | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/cgnet) |
| DMNet | MMSegmentation | ? | Y | N | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/dmnet) |
| DNLNet | MMSegmentation | ? | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/dnlnet) |
| EMANet | MMSegmentation | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/emanet) |
| EncNet | MMSegmentation | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/encnet) |
| ERFNet | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/erfnet) |
| FastFCN | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastfcn) |
| GCNet | MMSegmentation | Y | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/gcnet) |
| ICNet[\*](#note) | MMSegmentation | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/icnet) |
| ISANet | MMSegmentation | ? | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/isanet) |
| NonLocal Net | MMSegmentation | ? | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/nonlocal_net) |
| OCRNet | MMSegmentation | ? | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ocrnet) |
| PointRend | MMSegmentation | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/point_rend) |
| Semantic FPN | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/sem_fpn) |
| STDC | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/stdc) |
| UPerNet[\*](#note) | MMSegmentation | ? | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/upernet) |
| DANet | MMSegmentation | ? | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/danet) |
| Segmenter | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/segmenter) |
| SRCNN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srcnn) |
| ESRGAN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/esrgan) |
| SRGAN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srresnet_srgan) |
| SRResNet | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srresnet_srgan) |
| Real-ESRGAN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/real_esrgan) |
| EDSR | MMEditing | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/edsr) |
| RDN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/rdn) |
| DBNet | MMOCR | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/dbnet) |
| PANet | MMOCR | Y | Y | Y | Y | ? | Y | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/panet) |
| PSENet | MMOCR | Y | Y | Y | Y | ? | Y | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/psenet) |
| CRNN | MMOCR | Y | Y | Y | Y | Y | N | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/crnn) |
| SAR[\*](#note) | MMOCR | N | Y | N | N | N | N | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/sar) |
| SATRN | MMOCR | Y | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/satrn) |
| HRNet | MMPose | N | Y | Y | Y | N | Y | N | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#hrnet-cvpr-2019) |
| MSPN | MMPose | N | Y | Y | Y | N | Y | N | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#mspn-arxiv-2019) |
| LiteHRNet | MMPose | N | Y | Y | N | N | Y | N | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#litehrnet-cvpr-2021) |
| PointPillars | MMDetection3d | ? | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/pointpillars) |
| CenterPoint (pillar) | MMDetection3d | ? | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/centerpoint) |
| RotatedRetinaNet | RotatedDetection | N | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/rotated_retinanet/README.md) |
| Oriented RCNN | RotatedDetection | N | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/oriented_rcnn/README.md) |
| Gliding Vertex | RotatedDetection | N | N | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/gliding_vertex/README.md) |
### Note
@ -85,4 +85,5 @@ The table below lists the models that are guaranteed to be exportable to other b
- static: This model only supports static export. Please use a `static` deploy config, such as $MMDEPLOY_DIR/configs/mmseg/segmentation_tensorrt_static-1024x2048.py.
- SSD: When converting an SSD model, you need to use a deploy config with a smaller min shape, such as 300x300-512x512 rather than 320x320-1344x1344, for example $MMDEPLOY_DIR/configs/mmdet/detection/detection_tensorrt_dynamic-300x300-512x512.py (see the conversion sketch after this list).
- YOLOX: YOLOX with ncnn only supports static shape.
- Swin Transformer: For TensorRT, only version 8.4+ is supported.
- SAR: The Chinese text recognition model is not supported, because the size of an ONNX protobuf file is limited.
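Putting the SSD note above into practice, here is a minimal conversion sketch. It assumes the standard `tools/deploy.py` entry point and uses the min-shape TensorRT config named in the note; `${MMDET_DIR}`, `${CHECKPOINT}` and `${IMAGE}` are placeholders for your MMDetection checkout, model weights, and a sample image used to verify the conversion:

```shell
# Convert SSD with the 300x300-512x512 dynamic-shape config from the note.
# All ${...} values are placeholders to be filled in for your setup.
python tools/deploy.py \
    configs/mmdet/detection/detection_tensorrt_dynamic-300x300-512x512.py \
    ${MMDET_DIR}/configs/ssd/ssd300_coco.py \
    ${CHECKPOINT} \
    ${IMAGE} \
    --work-dir work_dir/ssd \
    --device cuda:0
```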


@ -235,6 +235,17 @@ export LD_LIBRARY_PATH=$Torch_DIR/lib:$LD_LIBRARY_PATH
</code></pre>
</td>
</tr>
<tr>
<td>Ascend</td>
<td>CANN</td>
<td>
1. Install the CANN toolkit following the <a href="https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/60RC1alpha02/softwareinstall/instg/atlasdeploy_03_0002.html">official guide</a>.<br>
2. Configure the environment:
<pre><code>
export ASCEND_TOOLKIT_HOME="/usr/local/Ascend/ascend-toolkit/latest"
</code></pre>
</td>
</tr>
</tbody>
</table>
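To confirm the toolkit is picked up correctly, a quick sanity check can look like the following sketch. The install prefix and the location of `set_env.sh` are assumptions based on a default installation and may differ across CANN releases:

```shell
# Assumes the default install prefix used above; adjust to your setup.
export ASCEND_TOOLKIT_HOME="/usr/local/Ascend/ascend-toolkit/latest"
source /usr/local/Ascend/ascend-toolkit/set_env.sh
atc --help | head -n 5   # the ATC offline-model compiler should now be on PATH
```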


@ -33,6 +33,7 @@ GPU: ncnn, TensorRT, PPLNN
<th align="center" colspan="5">TensorRT(ms)</th>
<th align="center" colspan="2">PPLNN(ms)</th>
<th align="center" colspan="2">ncnn(ms)</th>
<th align="center" colspan="1">Ascend(ms)</th>
</tr>
</thead>
<tbody>
@ -45,6 +46,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center" colspan="1">T4</td>
<td align="center" colspan="1">SnapDragon888</td>
<td align="center" colspan="1">Adreno660</td>
<td align="center" colspan="1">Ascend310</td>
</tr>
<tr>
<td align="center" colspan="1">fp32</td>
@ -56,6 +58,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center" colspan="1">fp16</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp32</td>
<td align="center" colspan="1">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnet/resnet50_b32x8_imagenet.py"> ResNet </a></td>
@ -69,6 +72,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">1.30</td>
<td align="center">33.91</td>
<td align="center">25.93</td>
<td align="center">2.49</td>
</tr>
<tr>
<td align="center"> <a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext/resnext50_32x4d_b32x8_imagenet.py"> ResNeXt </a></td>
@ -82,6 +86,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">1.36</td>
<td align="center">133.44</td>
<td align="center">69.38</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"> <a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/seresnet/seresnet50_b32x8_imagenet.py"> SE-ResNet </a></td>
@ -95,6 +100,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">1.91</td>
<td align="center">107.84</td>
<td align="center">80.85</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py"> ShuffleNetV2 </a></td>
@ -108,6 +114,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">4.69</td>
<td align="center">9.55</td>
<td align="center">10.66</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
@ -416,6 +423,7 @@ GPU: ncnn, TensorRT, PPLNN
<th align="center">ONNX Runtime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">Ascend</th>
</tr>
</thead>
<tbody>
@ -429,6 +437,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnet/resnet18_b32x8_imagenet.py">ResNet-18</a></td>
@ -440,6 +449,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">69.86</td>
<td align="center">69.86</td>
<td align="center">69.86</td>
<td align="center">69.91</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -450,6 +460,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">89.33</td>
<td align="center">89.38</td>
<td align="center">89.34</td>
<td align="center">89.43</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext/resnext50_32x4d_b32x8_imagenet.py">ResNeXt-50</a></td>
@ -461,6 +472,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">-</td>
<td align="center">77.78</td>
<td align="center">77.89</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -471,6 +483,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">-</td>
<td align="center">93.64</td>
<td align="center">93.65</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext/resnext50_32x4d_b32x8_imagenet.py">SE-ResNet-50</a></td>
@ -482,6 +495,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">77.75</td>
<td align="center">77.63</td>
<td align="center">77.73</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -492,6 +506,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">93.83</td>
<td align="center">93.72</td>
<td align="center">93.84</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v1/shufflenet_v1_1x_b64x16_linearlr_bn_nowd_imagenet.py">ShuffleNetV1 1.0x</a></td>
@ -503,6 +518,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">68.13</td>
<td align="center">67.71</td>
<td align="center">68.11</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -513,6 +529,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">87.81</td>
<td align="center">87.58</td>
<td align="center">87.80</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py">ShuffleNetV2 1.0x</a></td>
@ -524,6 +541,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">69.54</td>
<td align="center">69.10</td>
<td align="center">69.54</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -534,6 +552,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">88.91</td>
<td align="center">88.58</td>
<td align="center">88.92</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/tree/master/configs/mobilenet_v2/mobilenet_v2_b32x8_imagenet.py">MobileNet V2</a></td>
@ -545,6 +564,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">71.87</td>
<td align="center">70.91</td>
<td align="center">71.84</td>
<td align="center">71.87</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -555,6 +575,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">90.40</td>
<td align="center">89.85</td>
<td align="center">90.41</td>
<td align="center">90.42</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmclassification/blob/master/configs/vision_transformer/vit-base-p16_ft-64xb64_in1k-384.py">Vision Transformer</a></td>
@ -566,6 +587,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">85.42</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">85.43</td>
</tr>
<tr>
<td align="center">top-5</td>
@ -576,6 +598,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">97.76</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">97.77</td>
</tr>
</tbody>
</table>
@ -590,6 +613,7 @@ GPU: ncnn, TensorRT, PPLNN
<th align="center">ONNXRuntime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">Ascend</th>
</tr>
</thead>
<tbody>
@ -605,6 +629,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/yolo/yolov3_d53_320_273e_coco.py">YOLOV3</a></td>
@ -618,6 +643,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">33.5</td>
<td align="center">33.5</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/ssd/ssd300_coco.py">SSD</a></td>
@ -631,6 +657,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">25.5</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/retinanet/retinanet_r50_fpn_1x_coco.py">RetinaNet</a></td>
@ -644,6 +671,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">36.4</td>
<td align="center">36.3</td>
<td align="center">36.5</td>
<td align="center">36.4</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py">FCOS</a></td>
@ -657,6 +685,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">36.5</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/fsaf/fsaf_r50_fpn_1x_coco.py">FSAF</a></td>
@ -670,6 +699,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">37.4</td>
<td align="center">37.2</td>
<td align="center">37.4</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox/yolox_s_8x8_300e_coco.py">YOLOX</a></td>
@ -683,6 +713,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">40.3</td>
<td align="center">29.3</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py">Faster R-CNN</a></td>
@ -696,6 +727,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">37.3</td>
<td align="center">37.1</td>
<td align="center">37.3</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/atss/atss_r50_fpn_1x_coco.py">ATSS</a></td>
@ -709,6 +741,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">39.4</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn/cascade_rcnn_r50_caffe_fpn_1x_coco.py">Cascade R-CNN</a></td>
@ -722,6 +755,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">40.4</td>
<td align="center">-</td>
<td align="center">40.4</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/gfl/gfl_r50_fpn_1x_coco.py">GFL</a></td>
@ -735,6 +769,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">40.0</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/reppoints/reppoints_moment_r50_fpn_1x_coco.py">RepPoints</a></td>
@ -748,6 +783,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmdetection/tree/master/configs/detr/detr_r50_8x2_150e_coco.py">DETR</a></td>
@ -774,6 +810,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">38.1</td>
<td align="center">-</td>
<td align="center">38.0</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">mask AP</td>
@ -784,6 +821,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">33.7</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center" rowspan="2"><a href="https://github.com/open-mmlab/mmdetection/blob/master/configs/swin/mask_rcnn_swin-t-p4-w7_fpn_1x_coco.py">Swin-Transformer</a></td>
@ -797,6 +835,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">37.7</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center">mask AP</td>
@ -807,6 +846,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">35.4</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>
@ -1192,6 +1232,7 @@ GPU: ncnn, TensorRT, PPLNN
<th align="center">ONNXRuntime</th>
<th align="center" colspan="3">TensorRT</th>
<th align="center">PPLNN</th>
<th align="center">Ascend</th>
</tr>
</thead>
<tbody>
@ -1206,6 +1247,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">fp16</td>
<td align="center">int8</td>
<td align="center">fp16</td>
<td align="center">fp32</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fcn/fcn_r50-d8_512x1024_40k_cityscapes.py">FCN</a></td>
@ -1218,6 +1260,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">72.35</td>
<td align="center">74.19</td>
<td align="center">72.35</td>
<td align="center">72.35</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py">PSPNet</a></td>
@ -1230,6 +1273,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">78.24</td>
<td align="center">77.97</td>
<td align="center">78.09</td>
<td align="center">78.67</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3/deeplabv3_r50-d8_512x1024_40k_cityscapes.py">deeplabv3</a></td>
@ -1242,6 +1286,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">79.12</td>
<td align="center">78.96</td>
<td align="center">79.12</td>
<td align="center">79.06</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_40k_cityscapes.py">deeplabv3+</a></td>
@ -1254,6 +1299,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">79.60</td>
<td align="center">79.43</td>
<td align="center">79.60</td>
<td align="center">79.51</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastscnn/fast_scnn_lr0.12_8x4_160k_cityscapes.py">Fast-SCNN</a></td>
@ -1266,6 +1312,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">70.92</td>
<td align="center">66.00</td>
<td align="center">70.92</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/unet/fcn_unet_s5-d16_4x4_512x1024_160k_cityscapes.py">UNet</a></td>
@ -1278,6 +1325,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">69.10</td>
<td align="center">68.95</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ann/ann_r50-d8_512x1024_40k_cityscapes.py">ANN</a></td>
@ -1290,6 +1338,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">77.32</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/apcnet/apcnet_r50-d8_512x1024_40k_cityscapes.py">APCNet</a></td>
@ -1302,6 +1351,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">77.32</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv1/bisenetv1_r18-d32_4x4_1024x1024_160k_cityscapes.py">BiSeNetV1</a></td>
@ -1314,6 +1364,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">74.43</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv2/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes.py">BiSeNetV2</a></td>
@ -1326,6 +1377,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">73.21</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/cgnet/cgnet_512x1024_60k_cityscapes.py">CGNet</a></td>
@ -1338,6 +1390,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">68.27</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/emanet/emanet_r50-d8_512x1024_80k_cityscapes.py">EMANet</a></td>
@ -1350,6 +1403,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">77.6</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/encnet/encnet_r50-d8_512x1024_40k_cityscapes.py">EncNet</a></td>
@ -1362,6 +1416,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">75.66</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/erfnet/erfnet_fcn_4x4_512x1024_160k_cityscapes.py">ERFNet</a></td>
@ -1374,6 +1429,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">71.07</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastfcn/fastfcn_r50-d32_jpu_aspp_512x1024_80k_cityscapes.py">FastFCN</a></td>
@ -1386,6 +1442,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">79.12</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/gcnet/gcnet_r50-d8_512x1024_40k_cityscapes.py">GCNet</a></td>
@ -1398,6 +1455,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">77.69</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/icnet/icnet_r18-d8_832x832_80k_cityscapes.py">ICNet</a></td>
@ -1410,6 +1468,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">76.36</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/isanet/isanet_r50-d8_512x1024_40k_cityscapes.py">ISANet</a></td>
@ -1422,6 +1481,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">78.49</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ocrnet/ocrnet_hr18s_512x1024_40k_cityscapes.py">OCRNet</a></td>
@ -1434,6 +1494,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">73.67</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/point_rend/pointrend_r50_512x1024_80k_cityscapes.py">PointRend</a></td>
@ -1446,6 +1507,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">76.42</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/sem_fpn/fpn_r50_512x1024_80k_cityscapes.py">Semantic FPN</a></td>
@ -1458,6 +1520,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">74.52</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/stdc/stdc1_in1k-pre_512x1024_80k_cityscapes.py">STDC</a></td>
@ -1470,6 +1533,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">75.10</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/stdc/stdc2_in1k-pre_512x1024_80k_cityscapes.py">STDC</a></td>
@ -1482,6 +1546,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">77.17</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/tree/master/configs/upernet/upernet_r50_512x1024_40k_cityscapes.py">UPerNet</a></td>
@ -1494,6 +1559,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">77.18</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/open-mmlab/mmsegmentation/blob/master/configs/segmenter/segmenter_vit-s_linear_8x1_512x512_160k_ade20k.py">Segmenter</a></td>
@ -1506,6 +1572,7 @@ GPU: ncnn, TensorRT, PPLNN
<td align="center">43.34</td>
<td align="center">43.35</td>
<td align="center">-</td>
<td align="center">-</td>
</tr>
</tbody>
</table>


@ -2,79 +2,82 @@
Model-backend combinations verified by self-testing:
| Model | Codebase | TorchScript | OnnxRuntime | TensorRT | ncnn | PPLNN | OpenVINO | Model config |
| :-------------------------- | :--------------- | :---------: | :---------: | :------: | :--: | :---: | :------: | :---------------------------------------------------------------------------------------------: |
| RetinaNet | MMDetection | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/retinanet) |
| Faster R-CNN | MMDetection | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn) |
| YOLOv3 | MMDetection | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yolo) |
| YOLOX | MMDetection | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox) |
| FCOS | MMDetection | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/fcos) |
| FSAF | MMDetection | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/fsaf) |
| Mask R-CNN | MMDetection | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/mask_rcnn) |
| SSD[\*](#note) | MMDetection | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/ssd) |
| FoveaBox | MMDetection | Y | Y | N | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/foveabox) |
| ATSS | MMDetection | N | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/atss) |
| GFL | MMDetection | N | Y | Y | N | ? | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/gfl) |
| Cascade R-CNN | MMDetection | N | Y | Y | N | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn) |
| Cascade Mask R-CNN | MMDetection | N | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn) |
| Swin Transformer[\*](#note) | MMDetection | N | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/swin) |
| VFNet | MMDetection | N | N | N | N | N | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/vfnet) |
| RepPoints | MMDetection | N | N | Y | N | ? | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/reppoints) |
| DETR | MMDetection | N | Y | Y | N | ? | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/detr) |
| ResNet | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnet) |
| ResNeXt | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext) |
| SE-ResNet | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/seresnet) |
| MobileNetV2 | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/mobilenet_v2) |
| ShuffleNetV1 | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v1) |
| ShuffleNetV2 | MMClassification | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v2) |
| VisionTransformer | MMClassification | Y | Y | Y | Y | ? | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/vision_transformer) |
| SwinTransformer | MMClassification | Y | Y | Y | N | ? | N | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/swin_transformer) |
| FCN | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fcn) |
| PSPNet[\*static](#note) | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/pspnet) |
| DeepLabV3 | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3) |
| DeepLabV3+ | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3plus) |
| Fast-SCNN[\*static](#note) | MMSegmentation | Y | Y | Y | N | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastscnn) |
| UNet | MMSegmentation | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/unet) |
| ANN[\*](#note) | MMSegmentation | Y | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ann) |
| APCNet | MMSegmentation | Y | Y | Y | Y | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/apcnet) |
| BiSeNetV1 | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv1) |
| BiSeNetV2 | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv2) |
| CGNet | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/cgnet) |
| DMNet | MMSegmentation | ? | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/dmnet) |
| DNLNet | MMSegmentation | ? | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/dnlnet) |
| EMANet | MMSegmentation | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/emanet) |
| EncNet | MMSegmentation | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/encnet) |
| ERFNet | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/erfnet) |
| FastFCN | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastfcn) |
| GCNet | MMSegmentation | Y | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/gcnet) |
| ICNet[\*](#note) | MMSegmentation | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/icnet) |
| ISANet[\*static](#note) | MMSegmentation | N | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/isanet) |
| NonLocal Net | MMSegmentation | ? | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/nonlocal_net) |
| OCRNet | MMSegmentation | ? | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ocrnet) |
| PointRend | MMSegmentation | Y | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/point_rend) |
| Semantic FPN | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/sem_fpn) |
| STDC | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/stdc) |
| UPerNet[\*](#note) | MMSegmentation | ? | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/upernet) |
| DANet | MMSegmentation | ? | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/danet) |
| Segmenter[\*static](#note) | MMSegmentation | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/segmenter) |
| SRCNN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srcnn) |
| ESRGAN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/esrgan) |
| SRGAN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srresnet_srgan) |
| SRResNet | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srresnet_srgan) |
| Real-ESRGAN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/real_esrgan) |
| EDSR | MMEditing | Y | Y | Y | Y | N | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/edsr) |
| RDN | MMEditing | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/rdn) |
| DBNet | MMOCR | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/dbnet) |
| CRNN | MMOCR | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/crnn) |
| SAR | MMOCR | N | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/sar) |
| HRNet | MMPose | N | Y | Y | Y | N | Y | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#hrnet-cvpr-2019) |
| MSPN | MMPose | N | Y | Y | Y | N | Y | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#mspn-arxiv-2019) |
| LiteHRNet | MMPose | N | Y | Y | N | N | Y | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#litehrnet-cvpr-2021) |
| PointPillars | MMDetection3d | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/pointpillars) |
| CenterPoint (pillar) | MMDetection3d | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/centerpoint) |
| RotatedRetinaNet | RotatedDetection | N | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/rotated_retinanet/README.md) |
| Oriented RCNN | RotatedDetection | N | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/oriented_rcnn/README.md) |
| Gliding Vertex | RotatedDetection | N | N | Y | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/gliding_vertex/README.md) |
| Model | Codebase | TorchScript | OnnxRuntime | TensorRT | ncnn | PPLNN | OpenVINO | Ascend | Model config |
| :-------------------------- | :--------------- | :---------: | :---------: | :------: | :--: | :---: | :------: | :----: | :---------------------------------------------------------------------------------------------: |
| RetinaNet | MMDetection | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/retinanet) |
| Faster R-CNN | MMDetection | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn) |
| YOLOv3 | MMDetection | Y | Y | Y | Y | N | Y | Y | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yolo) |
| YOLOX | MMDetection | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox) |
| FCOS | MMDetection | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/fcos) |
| FSAF | MMDetection | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/fsaf) |
| Mask R-CNN | MMDetection | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/mask_rcnn) |
| SSD[\*](#note) | MMDetection | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/ssd) |
| FoveaBox | MMDetection | Y | Y | N | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/foveabox) |
| ATSS | MMDetection | N | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/atss) |
| GFL | MMDetection | N | Y | Y | N | ? | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/gfl) |
| Cascade R-CNN | MMDetection | N | Y | Y | N | Y | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn) |
| Cascade Mask R-CNN | MMDetection | N | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn) |
| Swin Transformer[\*](#note) | MMDetection | N | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/swin) |
| VFNet | MMDetection | N | N | N | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/vfnet) |
| RepPoints | MMDetection | N | N | Y | N | ? | Y | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/reppoints) |
| DETR | MMDetection | N | Y | Y | N | ? | N | N | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/detr) |
| ResNet | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnet) |
| ResNeXt | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext) |
| SE-ResNet | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/seresnet) |
| MobileNetV2 | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/mobilenet_v2) |
| ShuffleNetV1 | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v1) |
| ShuffleNetV2 | MMClassification | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v2) |
| VisionTransformer | MMClassification | Y | Y | Y | Y | ? | Y | Y | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/vision_transformer) |
| SwinTransformer | MMClassification | Y | Y | Y | N | ? | N | ? | [config](https://github.com/open-mmlab/mmclassification/tree/master/configs/swin_transformer) |
| FCN | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fcn) |
| PSPNet[\*static](#note) | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/pspnet) |
| DeepLabV3 | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3) |
| DeepLabV3+ | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/deeplabv3plus) |
| Fast-SCNN[\*static](#note) | MMSegmentation | Y | Y | Y | N | Y | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastscnn) |
| UNet | MMSegmentation | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/unet) |
| ANN[\*](#note) | MMSegmentation | Y | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ann) |
| APCNet | MMSegmentation | Y | Y | Y | Y | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/apcnet) |
| BiSeNetV1 | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv1) |
| BiSeNetV2 | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/bisenetv2) |
| CGNet | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/cgnet) |
| DMNet | MMSegmentation | ? | Y | N | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/dmnet) |
| DNLNet | MMSegmentation | ? | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/dnlnet) |
| EMANet | MMSegmentation | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/emanet) |
| EncNet | MMSegmentation | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/encnet) |
| ERFNet | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/erfnet) |
| FastFCN | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/fastfcn) |
| GCNet | MMSegmentation | Y | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/gcnet) |
| ICNet[\*](#note) | MMSegmentation | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/icnet) |
| ISANet | MMSegmentation | ? | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/isanet) |
| NonLocal Net | MMSegmentation | ? | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/nonlocal_net) |
| OCRNet | MMSegmentation | ? | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/ocrnet) |
| PointRend | MMSegmentation | Y | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/point_rend) |
| Semantic FPN | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/sem_fpn) |
| STDC | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/stdc) |
| UPerNet[\*](#note) | MMSegmentation | ? | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/upernet) |
| DANet | MMSegmentation | ? | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/danet) |
| Segmenter | MMSegmentation | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmsegmentation/tree/master/configs/segmenter) |
| SRCNN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srcnn) |
| ESRGAN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/esrgan) |
| SRGAN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srresnet_srgan) |
| SRResNet | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/srresnet_srgan) |
| Real-ESRGAN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/real_esrgan) |
| EDSR | MMEditing | Y | Y | Y | Y | N | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/edsr) |
| RDN | MMEditing | Y | Y | Y | Y | Y | Y | N | [config](https://github.com/open-mmlab/mmediting/tree/master/configs/restorers/rdn) |
| DBNet | MMOCR | Y | Y | Y | Y | Y | Y | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/dbnet) |
| PANet | MMOCR | Y | Y | Y | Y | ? | Y | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/panet) |
| PSENet | MMOCR | Y | Y | Y | Y | ? | Y | Y | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textdet/psenet) |
| CRNN | MMOCR | Y | Y | Y | Y | Y | N | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/crnn) |
| SAR | MMOCR | N | Y | N | N | N | N | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/sar) |
| SATRN | MMOCR | Y | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmocr/tree/main/configs/textrecog/satrn) |
| HRNet | MMPose | N | Y | Y | Y | N | Y | N | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#hrnet-cvpr-2019) |
| MSPN | MMPose | N | Y | Y | Y | N | Y | N | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#mspn-arxiv-2019) |
| LiteHRNet | MMPose | N | Y | Y | N | N | Y | N | [config](https://mmpose.readthedocs.io/en/latest/papers/backbones.html#litehrnet-cvpr-2021) |
| PointPillars | MMDetection3d | ? | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/pointpillars) |
| CenterPoint (pillar) | MMDetection3d | ? | Y | Y | N | N | Y | N | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/centerpoint) |
| RotatedRetinaNet | RotatedDetection | N | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/rotated_retinanet/README.md) |
| Oriented RCNN | RotatedDetection | N | Y | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/oriented_rcnn/README.md) |
| Gliding Vertex | RotatedDetection | N | N | Y | N | N | N | N | [config](https://github.com/open-mmlab/mmrotate/blob/main/configs/gliding_vertex/README.md) |
## Note
@ -82,4 +85,5 @@
- static: This model only supports static export. Please use a `static` deploy config, such as $MMDEPLOY_DIR/configs/mmseg/segmentation_tensorrt_static-1024x2048.py.
- SSD: When converting an SSD model, use a deploy config with the smaller shape range, e.g. 300x300-512x512 rather than 320x320-1344x1344, such as $MMDEPLOY_DIR/configs/mmdet/detection/detection_tensorrt_dynamic-300x300-512x512.py.
- YOLOX: YOLOX with ncnn only supports static shape.
- Swin Transformer: For TensorRT, only version 8.4+ is supported.
- SAR: The Chinese text recognition model is not supported, as the exported ONNX would exceed the protobuf size limit.
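As a sketch of how the new Ascend column is enabled, the backend is selected via `backend_config` in a deploy config; the keys below mirror the `model_inputs` accepted by the converter shown later in this commit (the input name, shapes, and batch gears are illustrative):

```python
backend_config = dict(
    type='ascend',
    model_inputs=[
        dict(
            # -1 marks the dynamic axis; pair it with exactly one of
            # dynamic_batch_size / dynamic_image_size / dynamic_dims
            input_shapes=dict(input=[-1, 3, 224, 224]),
            dynamic_batch_size=[1, 2, 4])
    ])
```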

View File

@ -0,0 +1,11 @@
# Copyright (c) OpenMMLab. All rights reserved.
from mmdeploy.backend.ascend import is_available
__all__ = ['is_available']
if is_available():
from mmdeploy.backend.ascend.onnx2ascend import from_onnx as _from_onnx
from ..core import PIPELINE_MANAGER
from_onnx = PIPELINE_MANAGER.register_pipeline()(_from_onnx)
__all__ += ['from_onnx']

View File

@ -0,0 +1,20 @@
# Copyright (c) OpenMMLab. All rights reserved.
import importlib
from .utils import update_sdk_pipeline
def is_available():
"""Check whether acl is installed.
Returns:
bool: True if acl package is installed.
"""
return importlib.util.find_spec('acl') is not None
__all__ = ['update_sdk_pipeline']
if is_available():
from .wrapper import AscendWrapper, Error
__all__ += ['AscendWrapper', 'Error']

View File

@ -0,0 +1,81 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import tempfile
from subprocess import call
from typing import Dict, Sequence, Union
import onnx
from mmdeploy.utils import get_root_logger
def make_shape_string(name, dims):
return f'{name}:{",".join(map(str, dims))}'
def _concat(dims: Sequence) -> str:
return ';'.join([','.join(map(str, x)) for x in dims])
def from_onnx(onnx_model: Union[onnx.ModelProto, str], work_dir: str,
model_inputs: Dict):
"""Convert ONNX to Ascend model.
Example:
>>> from mmdeploy.apis.ascend import from_onnx
>>> onnx_path = 'work_dir/end2end.onnx'
>>> model_inputs = mmcv.Config(
>>> dict(input_shapes=dict(input=[1, 3, 224, 224])))
>>> from_onnx(onnx_path, work_dir, model_inputs)
Args:
onnx_model (onnx.ModelProto|str): The ONNX model or the path to it.
work_dir (str): Directory to load the ONNX model from and to save the Ascend model to.
model_inputs (Dict): The input args to the atc tool.
"""
logger = get_root_logger()
if not isinstance(onnx_model, str):
onnx_path = tempfile.NamedTemporaryFile(suffix='.onnx').name
onnx.save(onnx_model, onnx_path)
else:
onnx_path = onnx_model
onnx_model = onnx.load(onnx_path)
for n in onnx_model.graph.node:
if n.domain != '':
n.domain = ''
# keep only the first opset import; popping by a fixed index while the
# list shrinks would skip entries or raise IndexError
while len(onnx_model.opset_import) > 1:
onnx_model.opset_import.pop(1)
onnx.save(onnx_model, onnx_path)
output_path = osp.join(work_dir, osp.splitext(osp.split(onnx_path)[1])[0])
input_shapes = []
for name, dims in model_inputs['input_shapes'].items():
input_shapes.append(make_shape_string(name, dims))
input_shapes = ';'.join(input_shapes)
input_format = 'ND' if 'dynamic_dims' in model_inputs else 'NCHW'
args = [
f'--model={onnx_path}', '--framework=5', f'--output={output_path}',
'--soc_version=Ascend310', f'--input_format={input_format}',
f'--input_shape={input_shapes}'
]
if 'dynamic_batch_size' in model_inputs:
dynamic_batch_size = ','.join(
map(str, model_inputs['dynamic_batch_size']))
args.append(f'--dynamic_batch_size={dynamic_batch_size}')
elif 'dynamic_image_size' in model_inputs:
dynamic_image_size = _concat(model_inputs['dynamic_image_size'])
args.append(f'--dynamic_image_size={dynamic_image_size}')
elif 'dynamic_dims' in model_inputs:
dynamic_dims = _concat(model_inputs['dynamic_dims'])
args.append(f'--dynamic_dims={dynamic_dims}')
logger.info(' '.join(('atc', *args)))
ret_code = call(['atc', *args])
assert ret_code == 0
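# Illustrative sketch (not part of the commit): how `model_inputs` maps to
# atc flags, assuming an input named `input`:
#   input_shapes=dict(input=[-1, 3, 224, 224])  -> --input_shape=input:-1,3,224,224
#   dynamic_batch_size=[1, 2, 4]                -> --dynamic_batch_size=1,2,4
#   dynamic_image_size=[(224, 224), (448, 448)] -> --dynamic_image_size=224,224;448,448
#   dynamic_dims additionally switches --input_format from NCHW to ND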

View File

@ -0,0 +1,48 @@
# Copyright (c) OpenMMLab. All rights reserved.
import math
import os.path as osp
from mmdeploy.utils import get_root_logger
def update_sdk_pipeline(work_dir: str):
"""Update pipeline.json for Ascend.
Args:
work_dir (str): The work directory to load/save the pipeline.json.
"""
logger = get_root_logger()
def _try_ori_agnostic_pad(transforms):
trans_resize = None
trans_pad = None
for trans in transforms:
if trans['type'] == 'Resize' and trans.get('keep_ratio', False):
trans_resize = trans
elif trans['type'] == 'Pad' and trans.get('size_divisor',
None) is not None:
trans_pad = trans
if trans_resize is not None and trans_pad is not None:
logger.info('update Pad transform.')
size = trans_resize['size']
divisor = trans_pad['size_divisor']
size = tuple(int(math.ceil(s / divisor) * divisor) for s in size)
trans_pad['size'] = size
trans_pad['orientation_agnostic'] = True
trans_pad.pop('size_divisor')
pipeline_path = osp.join(work_dir, 'pipeline.json')
if osp.exists(pipeline_path):
import mmcv
pipeline = mmcv.load(pipeline_path)
tasks = pipeline['pipeline'].get('tasks', [])
for task in tasks:
if task.get('module', '') == 'Transform':
transforms = task['transforms']
_try_ori_agnostic_pad(transforms)
mmcv.dump(pipeline, pipeline_path, sort_keys=False, indent=4)
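# Illustrative example of the rewrite above (values are hypothetical): a
# keep_ratio Resize of size (800, 1344) followed by
#   dict(type='Pad', size_divisor=32)
# becomes
#   dict(type='Pad', size=(800, 1344), orientation_agnostic=True)
# where each side of `size` is rounded up to a multiple of the divisor.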

View File

@ -0,0 +1,593 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os
from contextlib import contextmanager
from typing import Dict, List, NamedTuple, Sequence
import acl
import numpy as np
import torch
from mmdeploy.utils import Backend
from mmdeploy.utils.timer import TimeCounter
from ..base import BACKEND_WRAPPER, BaseWrapper
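# maps CANN aclDataType enum values (ACL_FLOAT=0, ACL_INT32=3, ACL_INT64=9)
# to torch dtypes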
_from_acl_data_type = {0: torch.float32, 3: torch.int32, 9: torch.int64}
ACL_MEMCPY_HOST_TO_HOST = 0
ACL_MEMCPY_HOST_TO_DEVICE = 1
ACL_MEMCPY_DEVICE_TO_HOST = 2
ACL_MEMCPY_DEVICE_TO_DEVICE = 3
class Error(Exception):
"""Acl Exception."""
pass
def _check(code: int, msg: str):
"""check the error code.
Args:
code (int): The error code.
msg (str): Error message.
"""
if code != 0:
raise Error(msg, code)
class DataBuffer:
"""The acl data buffer.
Args:
size (int): Buffer size.
"""
def __init__(self, size: int):
data, ret = acl.rt.malloc(size, 0)
_check(ret, 'acl.rt.malloc')
self.data = data
self.size = size
self.handle = acl.create_data_buffer(data, size)
def destroy(self):
if self.handle is not None:
acl.destroy_data_buffer(self.handle)
acl.rt.free(self.data)
self.handle = None
def __del__(self):
self.destroy()
class Dataset:
"""The acl dataset."""
def __init__(self):
self.handle = acl.mdl.create_dataset()
self.buffers = []
def destroy(self):
if self.handle is not None:
for buffer in self.buffers:
buffer.destroy()
acl.mdl.destroy_dataset(self.handle)
self.handle = None
def __del__(self):
self.destroy()
def add_buffer(self, buffer: DataBuffer):
"""Add data buffer into the dataset.
Args:
buffer (DataBuffer): The DataBuffer instance.
"""
self.buffers.append(buffer)
_, ret = acl.mdl.add_dataset_buffer(self.handle, buffer.handle)
_check(ret, 'acl.mdl.add_dataset_buffer')
class Binding(NamedTuple):
index: int
name: str
dims: List[int]
data_type: np.dtype
size: int
class ModelDesc:
"""The model description wrapper.
Args:
model_id (int): The id of the model, created by acl tools.
"""
def __init__(self, model_id):
self._desc = acl.mdl.create_desc()
ret = acl.mdl.get_desc(self._desc, model_id)
_check(ret, 'acl.mdl.get_desc')
self.inputs = []
self.dynamic_tensor = None
num_inputs = acl.mdl.get_num_inputs(self._desc)
for index in range(num_inputs):
dims = self._get_input_dims(index)
data_type = acl.mdl.get_input_data_type(self._desc, index)
data_type = _from_acl_data_type[data_type]
size = acl.mdl.get_input_size_by_index(self._desc, index)
binding = Binding(index, dims['name'], dims['dims'], data_type,
size)
if dims['name'] == 'ascend_mbatch_shape_data':
self.dynamic_tensor = binding
else:
self.inputs.append(binding)
self.outputs = []
num_outputs = acl.mdl.get_num_outputs(self._desc)
for index in range(num_outputs):
dims = self._get_output_dims(index)
data_type = acl.mdl.get_output_data_type(self._desc, index)
data_type = _from_acl_data_type[data_type]
size = acl.mdl.get_output_size_by_index(self._desc, index)
self.outputs.append(
Binding(index, dims['name'], dims['dims'], data_type, size))
def destroy(self):
if self._desc is not None:
acl.mdl.destroy_desc(self._desc)
self._desc = None
def __del__(self):
self.destroy()
def _get_input_dims(self, index: int):
"""Get the dimension of the input by index.
Args:
index (int): The index of the input.
"""
dims, ret = acl.mdl.get_input_dims(self._desc, index)
_check(ret, 'acl.mdl.get_input_dims')
return dims
def _get_output_dims(self, index: int):
"""Get the dimension of the output by index.
Args:
index (int): The index of the output.
"""
dims, ret = acl.mdl.get_output_dims(self._desc, index)
_check(ret, 'acl.mdl.get_output_dims')
dims['name'] = dims['name'].split(':')[-1]
return dims
def _get_current_output_dims(self, index: int):
"""Get the dimension of current output implementation.
Args:
index (int): The index of the output.
"""
dims, ret = acl.mdl.get_cur_output_dims(self._desc, index)
_check(ret, 'acl.mdl.get_cur_output_dims')
return dims
def get_current_output_dims(self):
"""Get the current dimensions of all outputs."""
dimses = []
for output in self.outputs:
dims = self._get_current_output_dims(output.index)
dimses.append(dims['dims'])
return dimses
def _get_input_index(self, name: str) -> int:
"""Get input index by name.
Args:
name (str): The name of the input.
Returns:
(int): The input index.
"""
index, ret = acl.mdl.get_input_index_by_name(self._desc, name)
return index if ret == 0 else -1
def get_dynamic_batch(self) -> Sequence:
"""Get dynamic batch size list.
Returns:
(Sequence): The dynamic batch list.
"""
batch, ret = acl.mdl.get_dynamic_batch(self._desc)
_check(ret, 'acl.mdl.get_dynamic_batch')
batch = batch['batch']
return sorted(batch)
def get_dynamic_hw(self) -> Sequence:
"""Get dynamic height and width size list.
Returns:
(Sequence): The dynamic height and width
"""
hw_info, ret = acl.mdl.get_dynamic_hw(self._desc, -1)
_check(ret, 'acl.mdl.get_dynamic_hw')
return hw_info['hw']
def get_input_dynamic_dims(self) -> Sequence:
"""Get dynamic dims.
Returns:
(Sequence): The dynamic dims
"""
count, ret = acl.mdl.get_input_dynamic_gear_count(self._desc, -1)
_check(ret, 'acl.mdl.get_input_dynamic_gear_count')
dims, ret = acl.mdl.get_input_dynamic_dims(self._desc, -1, count)
_check(ret, 'acl.mdl.get_input_dynamic_dims')
return dims
class Context:
ref_count = 0
owned_acl = False
def __init__(self):
if not _is_torch_npu_available:
self._active = True
if Context.ref_count == 0:
ret = acl.init()
if ret == 0:
Context.owned_acl = True
elif ret == 100002: # ACL_ERROR_REPEAT_INITIALIZE
pass
else:
_check(ret, 'acl.init')
Context.ref_count += 1
else:
self._active = False
def __del__(self):
self.destroy()
def destroy(self):
if not self._active:
return
Context.ref_count -= 1
if Context.ref_count == 0 and Context.owned_acl:
ret = acl.finalize()
if ret == 0:
Context.owned_acl = False
elif ret == 100037: # ACL_ERROR_REPEAT_FINALIZE
pass
else:
_check(ret, 'acl.finalize')
self._active = False
_is_torch_npu_available = False
if os.environ.get('MMDEPLOY_USE_TORCH_NPU'):
try:
import torch_npu
_is_torch_npu_available = True
except Exception:
print('import torch_npu failed, torch_npu is disabled')
class Device:
def __init__(self, device: str):
if _is_torch_npu_available:
self._torch_device = torch.device(device)
self.index = self._torch_device.index
# force torch_npu to initialize
with torch_npu.npu.device(self.index):
pass
else:
self._torch_device = torch.device('cpu')
name_idx = device.split(':')
self.index = 0 if len(name_idx) == 1 else int(name_idx[-1])
@contextmanager
def __call__(self):
# torch_npu.npu.device() leads to segfault when index > 0
_check(acl.rt.set_device(self.index), 'acl.rt.set_device')
try:
yield
finally:
pass
@BACKEND_WRAPPER.register_module(Backend.ASCEND.value)
class AscendWrapper(BaseWrapper):
"""Ascend wrapper class for inference.
Args:
model (str): Path of the model file.
Examples:
>>> from mmdeploy.backend.ascend import AscendWrapper
>>> import torch
>>>
>>> model_file = 'model.om'
>>> model = AscendWrapper(model_file)
>>> inputs = dict(input=torch.randn(1, 3, 224, 224))
>>> outputs = model(inputs)
"""
def __init__(self, model: str, device: str = 'npu'):
self._context = Context()
self._device = Device(device)
with self._device():
self._model_id, ret = acl.mdl.load_from_file(model)
_check(ret, 'acl.mdl.load_from_file')
self._model_desc = ModelDesc(self._model_id)
self._config_dynamic_shapes()
self._create_input_buffers()
self._create_output_buffers()
output_names = [output.name for output in self._model_desc.outputs]
super().__init__(output_names)
def destroy(self):
if self._model_id is None:
return
with self._device():
self._input.destroy()
self._output.destroy()
self._model_desc.destroy()
acl.mdl.unload(self._model_id)
self._model_id = None
self._context.destroy()
def __del__(self):
self.destroy()
def forward(self, inputs: Dict[str,
torch.Tensor]) -> Dict[str, torch.Tensor]:
"""Run forward inference.
Args:
inputs (Dict[str, torch.Tensor]): Key-value pairs of model inputs.
Returns:
Dict[str, torch.Tensor]: Key-value pairs of model outputs.
"""
with self._device():
input_shapes = [
inputs[x.name].shape for x in self._model_desc.inputs
]
output_shapes = self._reshape(input_shapes)
self._synchronize_torch_stream()
torch_device = self._device._torch_device
for binding in self._model_desc.inputs:
tensor = inputs[binding.name].to(
torch_device, dtype=binding.data_type).contiguous()
self._copy_tensor_to_buffer(tensor,
self._input.buffers[binding.index])
outputs = {}
for binding in self._model_desc.outputs:
shape = output_shapes[binding.index]
tensor = torch.empty(
shape, dtype=binding.data_type, device=torch_device)
if torch_device.type == 'npu':
ret = acl.update_data_buffer(
self._output.buffers[binding.index].handle,
tensor.data_ptr(),
tensor.element_size() * tensor.numel())
_check(ret, 'acl.update_data_buffer')
outputs[binding.name] = tensor
self.__ascend_execute()
for binding in self._model_desc.outputs:
self._copy_buffer_to_tensor(
self._output.buffers[binding.index], outputs[binding.name])
return outputs
def _copy_tensor_to_buffer(self, tensor: torch.Tensor, buffer: DataBuffer):
if tensor.device.type == 'cpu':
kind = ACL_MEMCPY_HOST_TO_DEVICE
ret = acl.rt.memcpy(buffer.data, buffer.size, tensor.data_ptr(),
tensor.element_size() * tensor.numel(), kind)
_check(ret, 'acl.rt.memcpy')
else:
ret = acl.update_data_buffer(
buffer.handle, tensor.data_ptr(),
tensor.element_size() * tensor.numel())
_check(ret, 'acl.update_data_buffer')
def _copy_buffer_to_tensor(self, buffer: DataBuffer, tensor: torch.Tensor):
if tensor.device.type == 'cpu':
kind = ACL_MEMCPY_DEVICE_TO_HOST
size = tensor.element_size() * tensor.numel()
ret = acl.rt.memcpy(tensor.data_ptr(), size, buffer.data, size,
kind)
_check(ret, 'acl.rt.memcpy')
def _verify_dims(self, src: Sequence[int], ref: Sequence[int]):
"""Check if src match ref."""
if len(src) != len(ref):
raise RuntimeError(f'Shape mismatch {src} vs {ref}')
for src_dim, ref_dim in zip(src, ref):
if ref_dim != -1 and src_dim != ref_dim:
raise RuntimeError(f'Shape mismatch {src} vs {ref}')
def _reshape(self, input_shapes: Sequence[Sequence[int]]):
"""Reshape the inputs.
Args:
input_shapes (Sequence[Sequence[int]]): The shapes used to
do reshape
"""
if len(input_shapes) != len(self._model_desc.inputs):
raise RuntimeError('#inputs mismatch')
for src, ref in zip(input_shapes, self._model_desc.inputs):
self._verify_dims(src, ref.dims)
self._reshape_fn(input_shapes)
dimses = self._model_desc.get_current_output_dims()
return dimses
def _reshape_static(self, input_shapes):
"""Do nothing.
Args:
input_shapes (Sequence[Sequence[int]]): Not used.
"""
pass
def _reshape_dynamic_batch_size(self,
input_shapes: Sequence[Sequence[int]]):
"""Reshape for dynamic batch size.
Args:
input_shapes (Sequence[Sequence[int]]): The shapes used to
do reshape
"""
batch_size = None
for src, ref in zip(input_shapes, self._model_desc.inputs):
if ref.dims[0] == -1:
if batch_size is None:
batch_size = src[0]
elif batch_size != src[0]:
raise RuntimeError(
f'Inconsistent batch size {batch_size} vs {src[0]}')
if batch_size is None:
raise RuntimeError('Can\'t determine batch size')
candidates = list(
filter(lambda x: x >= batch_size, self._dynamic_batch_size))
if not candidates:
raise RuntimeError(f'Batch size {batch_size} is not supported.'
f' ({self._dynamic_batch_size})')
ret = acl.mdl.set_dynamic_batch_size(
self._model_id, self._input.handle,
self._model_desc.dynamic_tensor.index, candidates[0])
_check(ret, 'acl.mdl.set_dynamic_batch_size')
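# e.g. with sorted gears [1, 2, 4], a request for batch 3 selects
# candidates[0] == 4, the smallest gear that can hold the actual batch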
def _reshape_dynamic_image_size(self,
input_shapes: Sequence[Sequence[int]]):
"""Reshape for dynamic image size.
Args:
input_shapes (Sequence[Sequence[int]]): The shapes used to
do reshape
"""
size = None
for src, ref in zip(input_shapes, self._model_desc.inputs):
if -1 in ref.dims:
tmp_size = src[-2], src[-1]
if size is None:
size = tmp_size
elif size != tmp_size:
raise RuntimeError(
f'Inconsistent image size {size} vs {tmp_size}')
if size is None:
raise RuntimeError('Can\'t determine dynamic HW')
if list(size) not in self._dynamic_hw:
raise RuntimeError(
f'size {size} is not supported. ({self._dynamic_hw})')
height, width = size
ret = acl.mdl.set_dynamic_hw_size(
self._model_id, self._input.handle,
self._model_desc.dynamic_tensor.index, height, width)
_check(ret, 'acl.mdl.set_dynamic_hw_size')
def _reshape_dynamic_dims(self, input_shapes: Sequence[Sequence[int]]):
"""Reshape for dynamic dims.
Args:
input_shapes (Sequence[Sequence[int]]): The shapes used to
do reshape
"""
match = [True] * len(self._dynamic_dims)
ptr = 0
for src in input_shapes:
for axis, src_dim in enumerate(src):
for index, dims in enumerate(self._dynamic_dims):
ref_dim = dims['dims'][ptr]
# allow batch dimension to vary
if axis == 0 and src_dim < ref_dim:
pass
elif src_dim != ref_dim:
match[index] = False
ptr += 1
indices = [i for i, v in enumerate(match) if v]
if not indices:
raise RuntimeError('No matching profile found')
index = indices[0]
ret = acl.mdl.set_input_dynamic_dims(
self._model_id, self._input.handle,
self._model_desc.dynamic_tensor.index, self._dynamic_dims[index])
_check(ret, 'acl.mdl.set_input_dynamic_dims')
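# e.g. an input of shape (2, 3, 224, 224) against gears [1, 3, 224, 224]
# and [4, 3, 224, 224] matches only the latter: the batch axis may be
# smaller than the gear value, all other axes must match exactly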
def _config_dynamic_shapes(self):
"""Set the reshape function."""
if self._model_desc.dynamic_tensor is None:
self._reshape_fn = self._reshape_static
return
self._dynamic_batch_size = self._model_desc.get_dynamic_batch()
if self._dynamic_batch_size:
self._reshape_fn = self._reshape_dynamic_batch_size
return
self._dynamic_dims = self._model_desc.get_input_dynamic_dims()
if self._dynamic_dims:
self._reshape_fn = self._reshape_dynamic_dims
return
self._dynamic_hw = self._model_desc.get_dynamic_hw()
if self._dynamic_hw:
self._reshape_fn = self._reshape_dynamic_image_size
return
raise RuntimeError('Can\'t infer input shape type')
def _create_input_buffers(self):
"""Create buffers for inputs."""
self._input = Dataset()
for binding in self._model_desc.inputs:
self._input.add_buffer(DataBuffer(binding.size))
if self._model_desc.dynamic_tensor:
self._input.add_buffer(
DataBuffer(self._model_desc.dynamic_tensor.size))
def _create_output_buffers(self):
"""Create buffers for outputs."""
self._output = Dataset()
for binding in self._model_desc.outputs:
self._output.add_buffer(DataBuffer(binding.size))
def _synchronize_torch_stream(self):
if _is_torch_npu_available:
torch.npu.current_stream(self._device._torch_device).synchronize()
@TimeCounter.count_time('ascend')
def __ascend_execute(self):
"""Run inference on Ascend."""
ret = acl.mdl.execute(self._model_id, self._input.handle,
self._output.handle)
_check(ret, 'acl.mdl.execute')
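A minimal end-to-end sketch combining the converter and the wrapper (paths and shapes are illustrative; tensors are staged through CPU when torch_npu is absent):

import torch
from mmdeploy.apis.ascend import from_onnx
from mmdeploy.backend.ascend import AscendWrapper

from_onnx('work_dir/end2end.onnx', 'work_dir',
          dict(input_shapes=dict(input=[1, 3, 224, 224])))  # writes end2end.om
model = AscendWrapper('work_dir/end2end.om')
outputs = model(dict(input=torch.randn(1, 3, 224, 224)))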

View File

@ -130,6 +130,8 @@ def get_models(deploy_cfg: Union[str, mmcv.Config],
weights = replace_suffix(ir_name, '.bin')
if 'precision' in deploy_cfg['backend_config']:
precision = deploy_cfg['backend_config']['precision']
elif backend == Backend.ASCEND:
net = replace_suffix(ir_name, '.om')
elif backend == Backend.SNPE:
net = replace_suffix(ir_name, '.dlc')
elif backend in [Backend.ONNXRUNTIME, Backend.TORCHSCRIPT]:

View File

@ -106,6 +106,9 @@ class BaseBackendModel(torch.nn.Module, metaclass=ABCMeta):
model=backend_files[0],
input_names=input_names,
output_names=output_names)
elif backend == Backend.ASCEND:
from mmdeploy.backend.ascend import AscendWrapper
return AscendWrapper(model=backend_files[0], device=device)
elif backend == Backend.SNPE:
from mmdeploy.backend.snpe import SNPEWrapper
uri = None
@ -116,6 +119,10 @@ class BaseBackendModel(torch.nn.Module, metaclass=ABCMeta):
else:
raise NotImplementedError(f'Unknown backend type: {backend.value}')
def destroy(self):
if hasattr(self, 'wrapper') and hasattr(self.wrapper, 'destroy'):
self.wrapper.destroy()
@abstractmethod
def forward(self, *args, **kwargs):
"""The forward interface that must be implemented.

View File

@ -324,3 +324,90 @@ def multiclass_nms__torchscript(ctx,
scores, boxes, keeps, batch_size, keep_top_k=keep_top_k)
return dets, labels
class AscendBatchNMSOp(torch.autograd.Function):
@staticmethod
def forward(ctx, bboxes: torch.Tensor, scores: torch.Tensor,
score_threshold: float, iou_threshold: float,
max_size_per_class: int, max_total_size: int):
"""Dummy nms forward
Args:
boxes (torch.Tensor): boxes in shape (batch, N, C, 4).
scores (torch.Tensor): scores in shape (batch, N, C).
score_threshold (float): the score threshold.
iou_threshold (float): the iou threshold.
max_size_per_class (int): max size per class.
max_total_size (int): max total size.
Returns:
(torch.Tensor): boxes,(1, N, 4)
(torch.Tensor): scores,(1, N)
(torch.Tensor): classes,(1, N)
(torch.Tensor): num_dets,(1,)
"""
# Python implementation for onnx export
nmsed_boxes = bboxes[:, :max_total_size, 0, :]
nmsed_scores = scores[:, :max_total_size, 0]
nmsed_classes = torch.arange(max_total_size, dtype=torch.long)
nmsed_num = torch.Tensor([max_total_size])
return nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_num
@staticmethod
def symbolic(g, bboxes, scores, score_thr, iou_thr, max_size_p_class,
max_t_size):
nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_num = g.op(
'mmdeploy::BatchMultiClassNMS',
bboxes,
scores,
score_threshold_f=score_thr,
iou_threshold_f=iou_thr,
max_size_per_class_i=max_size_p_class,
max_total_size_i=max_t_size,
outputs=4)
return nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_num
@FUNCTION_REWRITER.register_rewriter(
func_name='mmdeploy.codebase.mmdet.core.post_processing._multiclass_nms',
backend='ascend')
def multiclass_nms__ascend(ctx,
boxes: Tensor,
scores: Tensor,
max_output_boxes_per_class: int = 1000,
iou_threshold: float = 0.5,
score_threshold: float = 0.05,
pre_top_k: int = -1,
keep_top_k: int = -1):
"""Wrapper for `multiclass_nms` with Ascend.
Args:
ctx (ContextCaller): The context with additional information.
boxes (Tensor): The bounding boxes of shape [N, num_boxes, 4].
scores (Tensor): The detection scores of shape
[N, num_boxes, num_classes].
max_output_boxes_per_class (int): Maximum number of output
boxes per class of nms. Defaults to 1000.
iou_threshold (float): IOU threshold of nms. Defaults to 0.5.
score_threshold (float): score threshold of nms.
Defaults to 0.05.
pre_top_k (int): Number of top K boxes to keep before nms.
Defaults to -1.
keep_top_k (int): Number of top K boxes to keep after nms.
Defaults to -1.
Returns:
tuple[Tensor, Tensor]: (dets, labels), `dets` of shape [N, num_det, 5]
and `labels` of shape [N, num_det].
"""
boxes = boxes if boxes.dim() == 4 else boxes.unsqueeze(2)
keep_top_k = max_output_boxes_per_class if keep_top_k < 0 else min(
max_output_boxes_per_class, keep_top_k)
nmsed_boxes, nmsed_scores, nmsed_classes, _ = AscendBatchNMSOp.apply(
boxes, scores, score_threshold, iou_threshold, keep_top_k, keep_top_k)
dets = torch.cat([nmsed_boxes, nmsed_scores.unsqueeze(2)], dim=-1)
return dets, nmsed_classes

View File

@ -108,7 +108,8 @@ def yolov3_head__get_bboxes(ctx,
batch_inds = torch.arange(
batch_size, device=device).unsqueeze(-1).long()
# Avoid onnx2tensorrt issue in https://github.com/NVIDIA/TensorRT/issues/1134 # noqa: E501
transformed_inds = (bbox_pred.shape[1] * batch_inds + topk_inds)
transformed_inds = (
bbox_pred.shape[1] * batch_inds + topk_inds.long())
bbox_pred = bbox_pred.reshape(-1, 4)[transformed_inds, :].reshape(
batch_size, -1, 4)
cls_pred = cls_pred.reshape(

View File

@ -100,6 +100,94 @@ def single_roi_extractor__forward__tensorrt(ctx,
finest_scale, featmap_strides, aligned)
class AscendRoiExtractor(Function):
"""Create AscendRoiExtractor op.
This class is used to create an AscendRoiExtractor op in ONNX for the Ascend
backend.
"""
@staticmethod
def symbolic(g, *args):
"""Symbolic function for creating onnx op."""
aligned = args[-1]
featmap_strides = [1 / stride for stride in args[-2]]
finest_scale = args[-3]
roi_scale_factor = args[-4]
sampling_ratio = args[-5]
pool_mode = args[-6]
output_size = args[-7]
inputs = args[:len(featmap_strides)]
rois = args[len(featmap_strides)]
return g.op(
'mmdeploy::RoiExtractor',
*inputs,
rois,
pooled_height_i=output_size[1],
pooled_width_i=output_size[0],
pool_mode_s=pool_mode,
sample_num_i=sampling_ratio,
roi_scale_factor_f=roi_scale_factor,
finest_scale_i=finest_scale,
spatial_scale_f=featmap_strides,
aligned_i=aligned,
outputs=1)
@staticmethod
def forward(ctx, *args):
"""Run forward."""
# aligned = args[-1]
featmap_strides = args[-2]
# finest_scale = args[-3]
# roi_scale_factor = args[-4]
# sampling_ratio = args[-5]
output_size = args[-7]
inputs = args[:len(featmap_strides)]
rois = args[len(featmap_strides)]
num_proposals = rois.shape[0]
channel = inputs[0].shape[1]
return rois.new_zeros(
(num_proposals, channel, output_size[1], output_size[0]))
@FUNCTION_REWRITER.register_rewriter(
'mmdet.models.roi_heads.roi_extractors.'
'single_level_roi_extractor.SingleRoIExtractor.forward',
backend='ascend')
def single_roi_extractor__forward__ascend(ctx,
self,
feats,
rois,
roi_scale_factor=None):
"""Rewrite `forward` of `SingleRoIExtractor` for Ascend backend.
This function uses RoiExtractor op for Ascend deployment.
"""
featmap_strides = self.featmap_strides
finest_scale = self.finest_scale
for roi_layer in self.roi_layers:
assert isinstance(
roi_layer,
RoIAlign), f'{type(roi_layer)} is not supported in Ascend.'
roi_layer = self.roi_layers[0]
out_size = roi_layer.output_size
sampling_ratio = roi_layer.sampling_ratio
pool_mode = roi_layer.pool_mode
aligned = roi_layer.aligned
if roi_scale_factor is None:
roi_scale_factor = 1.0
featmap_strides = [float(s) for s in featmap_strides]
return AscendRoiExtractor.apply(*feats, rois, out_size, pool_mode,
sampling_ratio, roi_scale_factor,
finest_scale, featmap_strides, aligned)
@FUNCTION_REWRITER.register_rewriter(
func_name='mmdet.models.roi_heads.SingleRoIExtractor.forward')
@mark('roi_extractor', inputs=['feats', 'rois'], outputs=['bbox_feats'])

View File

@ -44,7 +44,17 @@ def roi_align_default(ctx, g, input: Tensor, rois: Tensor,
backend = get_backend(ctx.cfg)
if backend == Backend.PPLNN:
domain = 'mmcv'
elif backend == Backend.ONNXRUNTIME:
return g.op(
f'{domain}::MMCVRoiAlign',
input,
rois,
output_height_i=output_size[0],
output_width_i=output_size[1],
spatial_scale_f=spatial_scale,
sampling_ratio_i=sampling_ratio,
mode_s=pool_mode,
aligned_i=aligned)
else:
from torch.onnx.symbolic_opset9 import _cast_Long
from torch.onnx.symbolic_opset11 import add, select, squeeze
batch_indices = _cast_Long(
@ -96,15 +106,3 @@ def roi_align_default(ctx, g, input: Tensor, rois: Tensor,
sampling_ratio_i=sampling_ratio,
mode_s=pool_mode,
aligned_i=aligned)
else:
domain = 'mmdeploy'
return g.op(
f'{domain}::MMCVRoiAlign',
input,
rois,
output_height_i=output_size[0],
output_width_i=output_size[1],
spatial_scale_f=spatial_scale,
sampling_ratio_i=sampling_ratio,
mode_s=pool_mode,
aligned_i=aligned)

View File

@ -13,6 +13,7 @@ from .masked_fill import masked_fill__onnxruntime
from .normalize import normalize__ncnn
from .repeat import tensor__repeat__tensorrt
from .size import tensor__size__ncnn
from .tensor_getitem import tensor__getitem__ascend
from .tensor_setitem import tensor__setitem__default
from .topk import topk__dynamic, topk__tensorrt
from .triu import triu__default
@ -23,6 +24,7 @@ __all__ = [
'tensor__size__ncnn', 'topk__dynamic', 'topk__tensorrt', 'chunk__ncnn',
'triu__default', 'atan2__default', 'normalize__ncnn', 'expand__ncnn',
'chunk__torchscript', 'masked_fill__onnxruntime',
'tensor__setitem__default', 'adaptive_avg_pool2d__default',
'adaptive_avg_pool2d__ncnn', 'multi_head_attention_forward'
'tensor__setitem__default', 'tensor__getitem__ascend',
'adaptive_avg_pool2d__default', 'adaptive_avg_pool2d__ncnn',
'multi_head_attention_forward'
]

View File

@ -22,3 +22,20 @@ def tensor__size__ncnn(ctx, self, *args):
ret = [int(r) for r in ret]
ret = tuple(ret)
return ret
@FUNCTION_REWRITER.register_rewriter(
func_name='torch.Tensor.size', backend='ascend')
def tensor__size__ascend(ctx, self, *args):
"""Rewrite `size` for ascens backend.
Support negative index.
"""
if len(args) != 0:
index = args[0]
if index < 0:
index = self.dim() + index
args = (index, )
return ctx.origin_func(self, *args)
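# e.g. for a 4-D tensor, `x.size(-1)` is rewritten to `x.size(3)` so the
# exported graph never sees a negative axis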

View File

@ -0,0 +1,41 @@
# Copyright (c) OpenMMLab. All rights reserved.
from typing import Iterable
import torch
from mmdeploy.core import FUNCTION_REWRITER
@FUNCTION_REWRITER.register_rewriter(
func_name='torch.Tensor.__getitem__', backend='ascend')
def tensor__getitem__ascend(ctx, self, key) -> torch.Tensor:
"""Rewrite `getitem` for ascend backend.
Ascend does not support negative select
"""
if not isinstance(key, (tuple, list)):
if isinstance(key, int) and key < 0:
key = self.dim() + key
return ctx.origin_func(self, key)
def _num_slice_types(slices):
num_slice = 0
for s in slices:
if isinstance(s, slice) or isinstance(s, int) or isinstance(
s, Iterable):
num_slice += 1
return num_slice
shape = self.shape
new_key = list(key)
num_ellipsis = len(shape) - _num_slice_types(new_key)
dim_count = 0
for i, k in enumerate(new_key):
if isinstance(k, int):
if k < 0:
new_key[i] = shape[dim_count] + k
if k == Ellipsis:
dim_count = dim_count + num_ellipsis
elif k is not None:
dim_count += 1
return ctx.origin_func(self, new_key)
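# e.g. for `x` of shape (1, 2, 3), `x[..., -1]` becomes `x[..., 2]`:
# num_ellipsis resolves which axis the trailing negative index refers to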

View File

@ -59,6 +59,7 @@ class Backend(AdvancedEnum):
OPENVINO = 'openvino'
SDK = 'sdk'
TORCHSCRIPT = 'torchscript'
ASCEND = 'ascend'
DEFAULT = 'default'

View File

@ -46,6 +46,8 @@ def backend_checker(backend: Backend, require_plugin: bool = False):
from mmdeploy.apis.ncnn import is_custom_ops_available
elif backend == Backend.OPENVINO:
from mmdeploy.apis.openvino import is_available
elif backend == Backend.ASCEND:
from mmdeploy.apis.ascend import is_available
else:
warnings.warn('The backend checker is not available')
return
@ -96,6 +98,8 @@ def check_backend(backend: Backend, require_plugin: bool = False):
from mmdeploy.apis.openvino import is_available
elif backend == Backend.TORCHSCRIPT:
from mmdeploy.backend.torchscript import ops_available as is_available
elif backend == Backend.ASCEND:
from mmdeploy.backend.ascend import is_available
else:
warnings.warn('The backend checker is not available')
return
@ -537,6 +541,20 @@ def get_backend_outputs(ir_file_path: str,
backend_files = [ir_file_path]
device = 'cpu'
backend_feats = [v for _, v in model_inputs.items()]
elif backend == Backend.ASCEND:
# Ascend model conversion
import mmdeploy.apis.ascend as ascend_apis
from mmdeploy.utils import get_model_inputs
if not ascend_apis.is_available():
return None
work_dir = osp.split(ir_file_path)[0]
# convert model
convert_args = get_model_inputs(deploy_cfg)
ascend_apis.from_onnx(ir_file_path, work_dir, convert_args[0])
om_file_name = osp.splitext(osp.split(ir_file_path)[1])[0]
backend_files = [osp.join(work_dir, om_file_name + '.om')]
backend_feats = flatten_model_inputs
device = 'cpu'
else:
raise NotImplementedError(
f'Unimplemented backend type: {backend.value}')

View File

@ -15,3 +15,6 @@ known_third_party = h5py,m2r,mmcls,mmcv,mmdeploy_python,mmdet,mmedit,mmocr,mmseg
no_lines_before = STDLIB,LOCALFOLDER
default_section = THIRDPARTY
skip = service/snpe/client/inference_pb2.py,service/snpe/client/inference_pb2_grpc.py
[codespell]
ignore-words=.codespell_ignore.txt

View File

@ -0,0 +1,71 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import tempfile
import mmcv
import pytest
import torch
import torch.nn as nn
from mmdeploy.utils import Backend
from mmdeploy.utils.test import backend_checker
onnx_file = tempfile.NamedTemporaryFile(suffix='.onnx').name
test_img = torch.rand([1, 3, 8, 8])
@pytest.mark.skip(reason='This a not test class but a utility class.')
class TestModel(nn.Module):
def __init__(self):
super().__init__()
def forward(self, x):
return x * 0.5
test_model = TestModel().eval()
def generate_onnx_file(model):
with torch.no_grad():
dynamic_axes = {
'input': {
0: 'batch',
2: 'width',
3: 'height'
},
'output': {
0: 'batch'
}
}
torch.onnx.export(
model,
test_img,
onnx_file,
output_names=['output'],
input_names=['input'],
keep_initializers_as_inputs=True,
do_constant_folding=True,
verbose=False,
opset_version=11,
dynamic_axes=dynamic_axes)
assert osp.exists(onnx_file)
@backend_checker(Backend.ASCEND)
def test_onnx2ascend():
from mmdeploy.apis.ascend import from_onnx
model = test_model
generate_onnx_file(model)
work_dir, _ = osp.split(onnx_file)
file_name = osp.splitext(osp.split(onnx_file)[1])[0]
om_path = osp.join(work_dir, file_name + '.om')
model_inputs = mmcv.Config(
dict(
dynamic_batch_size=[1, 2, 4],
input_shapes=dict(input=[-1, 3, 224, 224])))
from_onnx(onnx_file, work_dir, model_inputs)
assert osp.exists(work_dir)
assert osp.exists(om_path)

View File

@ -103,6 +103,18 @@ def onnx2backend(backend, onnx_file):
work_dir = backend_dir
from_onnx(onnx_file, work_dir, input_info, output_names)
return backend_file
elif backend == Backend.ASCEND:
import mmcv
from mmdeploy.apis.ascend import from_onnx
backend_dir = tempfile.TemporaryDirectory().name
work_dir = backend_dir
file_name = osp.splitext(osp.split(onnx_file)[1])[0]
backend_file = osp.join(work_dir, file_name + '.om')
model_inputs = mmcv.Config(
dict(input_shapes=dict(input=test_img.shape)))
from_onnx(onnx_file, work_dir, model_inputs)
return backend_file
def create_wrapper(backend, model_files):
@ -133,6 +145,10 @@ def create_wrapper(backend, model_files):
torchscript_model = TorchscriptWrapper(
model_files, input_names=input_names, output_names=output_names)
return torchscript_model
elif backend == Backend.ASCEND:
from mmdeploy.backend.ascend import AscendWrapper
ascend_model = AscendWrapper(model_files)
return ascend_model
else:
raise NotImplementedError(f'Unknown backend type: {backend.value}')
@ -163,13 +179,16 @@ def run_wrapper(backend, wrapper, input):
elif backend == Backend.TORCHSCRIPT:
results = wrapper({'input': input})['output']
return results
elif backend == Backend.ASCEND:
results = wrapper({'input': input})['output']
return results
else:
raise NotImplementedError(f'Unknown backend type: {backend.value}')
ALL_BACKEND = [
Backend.TENSORRT, Backend.ONNXRUNTIME, Backend.PPLNN, Backend.NCNN,
Backend.OPENVINO, Backend.TORCHSCRIPT
Backend.OPENVINO, Backend.TORCHSCRIPT, Backend.ASCEND
]

View File

@ -343,3 +343,49 @@ def test__anchorgenerator__single_level_grid_priors():
find_trt_grid_priors = True
assert find_trt_grid_priors
@backend_checker(Backend.ASCEND)
def test_multiclass_nms__ascend():
from mmdeploy.codebase.mmdet.core import multiclass_nms
deploy_cfg = mmcv.Config(
dict(
onnx_config=dict(
input_names=['boxes', 'scores'],
output_names=['dets', 'labels'],
input_shape=None),
backend_config=dict(
type='ascend',
model_inputs=[
dict(input_shapes=dict(boxes=[1, 5, 4], scores=[1, 5, 8]))
]),
codebase_config=dict(
type='mmdet',
task='ObjectDetection',
post_processing=dict(
score_threshold=0.05,
iou_threshold=0.5,
max_output_boxes_per_class=20,
pre_top_k=-1,
keep_top_k=10,
background_label_id=-1,
))))
boxes = torch.rand(1, 5, 4)
scores = torch.rand(1, 5, 8)
max_output_boxes_per_class = 20
keep_top_k = 10
wrapped_func = WrapFunction(
multiclass_nms,
max_output_boxes_per_class=max_output_boxes_per_class,
keep_top_k=keep_top_k)
rewrite_outputs, _ = get_rewrite_outputs(
wrapped_func,
model_inputs={
'boxes': boxes,
'scores': scores
},
deploy_cfg=deploy_cfg)
assert rewrite_outputs is not None, 'Got unexpected rewrite '\
'outputs: {}'.format(rewrite_outputs)

View File

@ -638,6 +638,73 @@ def test_single_roi_extractor(backend_type: Backend):
model_output, backend_output, rtol=1e-03, atol=1e-05)
def test_single_roi_extractor__ascend():
check_backend(Backend.ASCEND)
# create wrap function
from mmdeploy.utils.test import WrapFunction
single_roi_extractor = get_single_roi_extractor()
out_channels = single_roi_extractor.out_channels
def single_roi_extractor_func(feat0, feat1, feat2, feat3, rois):
return single_roi_extractor([feat0, feat1, feat2, feat3], rois)
single_roi_extractor_wrapper = WrapFunction(single_roi_extractor_func)
# generate data
seed_everything(1234)
feats = [
torch.rand((1, out_channels, 200, 336)),
torch.rand((1, out_channels, 100, 168)),
torch.rand((1, out_channels, 50, 84)),
torch.rand((1, out_channels, 25, 42)),
]
seed_everything(5678)
rois = torch.tensor([[0.0000, 587.8285, 52.1405, 886.2484, 341.5644]])
# create config
input_names = ['feat0', 'feat1', 'feat2', 'feat3', 'rois']
output_names = ['roi_feat']
model_inputs = dict(zip(input_names, feats + [rois]))
deploy_cfg = mmcv.Config(
dict(
backend_config=dict(
type=Backend.ASCEND.value,
model_inputs=[
dict(
input_shapes=dict(
feat0=feats[0].shape,
feat1=feats[1].shape,
feat2=feats[2].shape,
feat3=feats[3].shape,
rois=rois.shape))
]),
onnx_config=dict(
input_names=input_names,
output_names=output_names,
input_shape=None),
codebase_config=dict(
type='mmdet',
task='ObjectDetection',
)))
# get torch output
model_outputs = get_model_outputs(single_roi_extractor_wrapper, 'forward',
model_inputs)
# get backend output
backend_outputs, _ = get_rewrite_outputs(
wrapped_model=single_roi_extractor_wrapper,
model_inputs=model_inputs,
deploy_cfg=deploy_cfg)
if isinstance(backend_outputs, dict):
backend_outputs = backend_outputs.values()
for model_output, backend_output in zip(model_outputs[0], backend_outputs):
model_output = model_output.squeeze().cpu().numpy()
backend_output = backend_output.squeeze()
assert model_output.shape == backend_output.shape
def get_cascade_roi_head(is_instance_seg=False):
"""CascadeRoIHead Config."""
num_stages = 3

View File

@ -184,6 +184,32 @@ def test_size_of_tensor_static():
'outputs: {}'.format(rewrite_outputs)
@backend_checker(Backend.ASCEND)
def test_size__ascend():
def model_func(input):
x = torch.Tensor.size(input, -1)
return torch.tensor(x)
input = torch.zeros([1, 2, 3, 4])
deploy_cfg_ascend = mmcv.Config(
dict(
onnx_config=dict(input_shape=None),
backend_config=dict(
type='ascend',
model_inputs=[dict(input_shapes=dict(input=input.shape))]),
codebase_config=dict(type='mmdet', task='ObjectDetection')))
wrapped_func = WrapFunction(model_func)
rewrite_outputs, _ = get_rewrite_outputs(
wrapped_func,
model_inputs={'input': input},
deploy_cfg=deploy_cfg_ascend,
run_with_backend=True)
assert rewrite_outputs is not None, 'Got unexpected rewrite '\
'outputs: {}'.format(rewrite_outputs)
class TestTopk:
input = torch.rand(1, 5, 5, 5)
@ -286,6 +312,32 @@ def test_normalize_ncnn(input, dim):
assert osp.exists(bin_path)
@backend_checker(Backend.ASCEND)
def test_getitem__ascend():
input = torch.rand(1, 2, 3)
def tensor_getitem(x):
return x[..., -1]
# create wrapped model
wrapped_func = WrapFunction(tensor_getitem)
import tempfile
import onnx
from mmdeploy.core import RewriterContext
onnx_file = tempfile.NamedTemporaryFile(suffix='onnx').name
# convert model
with RewriterContext(
cfg={}, backend=Backend.ASCEND.value, opset=11), torch.no_grad():
torch.onnx.export(wrapped_func, input, onnx_file, opset_version=11)
onnx_model = onnx.load(onnx_file)
nodes = onnx_model.graph.node
assert nodes is not None
@backend_checker(Backend.ONNXRUNTIME)
@pytest.mark.parametrize(
'input',

View File

@ -44,6 +44,9 @@ def check_backend():
import mmdeploy.apis.snpe as snpe_apis
logger.info(f'snpe_is_available: {snpe_apis.is_available()}')
import mmdeploy.apis.ascend as ascend_apis
logger.info(f'ascend_is_available: {ascend_apis.is_available()}')
def check_codebase():
codebase_versions = get_codebase_version()

View File

@ -204,7 +204,7 @@ def main():
from mmdeploy.apis.tensorrt import onnx2tensorrt
PIPELINE_MANAGER.enable_multiprocess(True, [onnx2tensorrt])
PIPELINE_MANAGER.set_log_level(logging.INFO, [onnx2tensorrt])
PIPELINE_MANAGER.set_log_level(log_level, [onnx2tensorrt])
backend_files = []
for model_id, model_param, onnx_path in zip(
@ -331,7 +331,7 @@ def main():
from mmdeploy.apis.pplnn import from_onnx
pplnn_pipeline_funcs = [from_onnx]
PIPELINE_MANAGER.set_log_level(logging.INFO, pplnn_pipeline_funcs)
PIPELINE_MANAGER.set_log_level(log_level, pplnn_pipeline_funcs)
pplnn_files = []
for onnx_path in ir_files:
@ -351,13 +351,32 @@ def main():
pplnn_files += [onnx_path, algo_file]
backend_files = pplnn_files
elif backend == Backend.ASCEND:
from mmdeploy.apis.ascend import from_onnx
ascend_pipeline_funcs = [from_onnx]
PIPELINE_MANAGER.set_log_level(log_level, ascend_pipeline_funcs)
model_inputs = get_model_inputs(deploy_cfg)
om_files = []
for model_id, onnx_path in enumerate(ir_files):
om_path = osp.splitext(onnx_path)[0] + '.om'
from_onnx(onnx_path, args.work_dir, model_inputs[model_id])
om_files.append(om_path)
backend_files = om_files
if args.dump_info:
from mmdeploy.backend.ascend import update_sdk_pipeline
update_sdk_pipeline(args.work_dir)
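# at this point work_dir holds one .om next to each .onnx input, and,
# when args.dump_info is set, pipeline.json has been patched for Ascend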
if args.test_img is None:
args.test_img = args.img
headless = False
# check headless or not for all platforms.
import tkinter
try:
import tkinter
tkinter.Tk()
except Exception:
headless = True

View File

@ -119,6 +119,7 @@ def main():
is_device_cpu = (args.device == 'cpu')
device_id = None if is_device_cpu else parse_device_id(args.device)
destroy_model = model.destroy
model = MMDataParallel(model, device_ids=[device_id])
# The whole dataset test wrapped a MMDataParallel class outside the module.
# As mmcls.apis.test.py single_gpu_test defined, the MMDataParallel needs
@ -142,6 +143,8 @@ def main():
task_processor.evaluate_outputs(model_cfg, outputs, dataset, args.metrics,
args.out, args.metric_options,
args.format_only, args.log2file)
# only effective when the backend requires explicit clean-up (e.g. Ascend)
destroy_model()
if __name__ == '__main__':