# Quick Start
Most ML models require some preprocessing of the input data and postprocessing of the raw output to obtain structured results. The MMDeploy SDK provides these pre-processing and post-processing steps out of the box, so once you convert and deploy a model you can run the whole pipeline through the SDK.
## Model Conversion
You can refer to convert model for more details.
After model conversion with `--dump-info`, the structure of the model directory (here for a TensorRT model) is as follows. If you convert to another backend, the structure will be slightly different. The two images are kept for quick validation of the conversion.
```
├── deploy.json
├── detail.json
├── pipeline.json
├── end2end.onnx
├── end2end.engine
├── output_pytorch.jpg
└── output_tensorrt.jpg
```
The files related to the SDK are:

- deploy.json    // model information
- pipeline.json  // inference pipeline information
- end2end.engine // model file for TensorRT; other backends produce a different file
The SDK can read the model directory directly, or you can pack the related files into a zip archive for easier distribution or encryption. To read a zip file, the SDK has to be built with `-DMMDEPLOY_ZIP_MODEL=ON`.
## SDK Inference
Generally speaking, there are three steps to run inference with a model:
- Create a pipeline
- Load the data
- Model inference
We use the classifier as an example to show these three steps.
### Create a pipeline
**Load model from disk**
std::string model_path = "/data/resnet"; // or "/data/resnet.zip" if build with `-DMMDEPLOY_ZIP_MODEL=ON`
mmdeploy_model_t model;
mmdeploy_model_create_by_path(model_path, &model);
mmdeploy_classifier_t classifier{};
mmdeploy_classifier_create(model, "cpu", 0, &classifier);
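Both creation calls return a status code, which is worth checking before moving on. A minimal sketch, assuming the `MMDEPLOY_SUCCESS` value from the SDK's status enum and `fprintf` from `<cstdio>`:

```cpp
// Check the status code returned by the C API instead of ignoring it.
int status = mmdeploy_classifier_create(model, "cpu", 0, &classifier);
if (status != MMDEPLOY_SUCCESS) {
  fprintf(stderr, "failed to create classifier, code: %d\n", status);
  return 1;  // or handle the failure in a way that suits your application
}
```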
**Load model from memory**
std::string model_path = "/data/resnet.zip"
std::ifstream ifs(model_path, std::ios::binary); // /path/to/zipmodel
ifs.seekg(0, std::ios::end);
auto size = ifs.tellg();
ifs.seekg(0, std::ios::beg);
std::string str(size, '\0'); // binary data, should decrypt if it's encrypted
ifs.read(str.data(), size);
mmdeploy_model_t model;
mmdeploy_model_create(str.data(), size, &model);
mmdeploy_classifier_t classifier{};
mmdeploy_classifier_create(model, "cpu", 0, &classifier);
### Load the data
```cpp
#include <opencv2/opencv.hpp>
cv::Mat img = cv::imread(image_path);
```
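The SDK's C API consumes `mmdeploy_mat_t` rather than `cv::Mat`, so wrap the decoded image before calling the classifier. A minimal sketch, assuming a 3-channel BGR `uint8` image (the default layout returned by `cv::imread`):

```cpp
// Describe the OpenCV image with the SDK's mmdeploy_mat_t; no pixel data is copied,
// so `img` must stay alive while `mat` is in use.
mmdeploy_mat_t mat{
    img.data, img.rows, img.cols, 3, MMDEPLOY_PIXEL_FORMAT_BGR, MMDEPLOY_DATA_TYPE_UINT8};
```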
### Model inference
```cpp
mmdeploy_classification_t* res{};
int* res_count{};
// `mat` is the mmdeploy_mat_t built from `img` in the previous step; 1 is the number of images
mmdeploy_classifier_apply(classifier, &mat, 1, &res, &res_count);
```
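After the call, `res` points to the classification results and `res_count[0]` holds the number of results for the (single) input image. Below is a minimal sketch of consuming and releasing them, assuming the classifier's release/destroy helpers (`mmdeploy_classifier_release_result`, `mmdeploy_classifier_destroy`, `mmdeploy_model_destroy`) and `fprintf` from `<cstdio>`:

```cpp
// Print each prediction, then free everything the SDK allocated.
for (int i = 0; i < res_count[0]; ++i) {
  fprintf(stderr, "label: %d, score: %.4f\n", res[i].label_id, res[i].score);
}
mmdeploy_classifier_release_result(res, res_count, 1);  // 1 == number of input images
mmdeploy_classifier_destroy(classifier);
mmdeploy_model_destroy(model);
```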