mmdeploy

Frequently Asked Questions

TensorRT

"WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected."

FP16 mode requires a device with full-rate FP16 support. Without it, as the warning says, performance will be worse.

"error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]"

When building an ICudaEngine from an INetworkDefinition that has dynamically resizable inputs, users need to specify at least one optimization profile. The profile can be set in the deploy config:

backend_config = dict(
    common_config=dict(max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 320, 320],
                    opt_shape=[1, 3, 800, 1344],
                    max_shape=[1, 3, 1344, 1344])))
    ])

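Every dimension of the input tensor shape must lie between the corresponding values of min_shape and max_shape. The error above is raised when a dimension falls outside that range.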
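To catch an out-of-range input before it reaches TensorRT, the bounds can be checked by hand. The following is a minimal sketch, not part of mmdeploy: the helper name `shape_in_profile` is hypothetical, and the bounds are copied from the config above.

```python
def shape_in_profile(shape, min_shape, max_shape):
    """Return True if every dimension lies within [min, max] of the profile.

    Hypothetical helper for illustration; it is not an mmdeploy or TensorRT API.
    """
    # A shape with a different rank can never match the profile.
    if not (len(shape) == len(min_shape) == len(max_shape)):
        return False
    return all(lo <= d <= hi for d, lo, hi in zip(shape, min_shape, max_shape))


# Bounds taken from the deploy config shown above.
profile = dict(min_shape=[1, 3, 320, 320], max_shape=[1, 3, 1344, 1344])

print(shape_in_profile([1, 3, 800, 1344], **profile))  # True
print(shape_in_profile([1, 3, 224, 224], **profile))   # False: 224 < 320
print(shape_in_profile([2, 3, 800, 800], **profile))   # False: batch 2 > 1
```

If the check fails for inputs you expect to serve, widen min_shape or max_shape in the deploy config and rebuild the engine.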

"error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS"

TensorRT 7.2.1 switched from cuBLAS to cuBLASLt, which is the default choice for SM version >= 7.0. You may need CUDA 10.2 Patch 1 (released Aug 26, 2020) to resolve some cuBLASLt issues. If you do not want to upgrade, another option is to use the new TacticSource API to disable cuBLASLt tactics.
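The TacticSource route can be sketched with the TensorRT Python API as below. This assumes a TensorRT version that exposes `trt.TacticSource` and that you build the engine yourself with an `IBuilderConfig`. Both function names are hypothetical, and whether your mmdeploy version lets you reach the builder config this way is an assumption to verify.

```python
def clear_tactic_bit(tactic_sources: int, source_index: int) -> int:
    """Return the tactic-source bitmask with one source's bit cleared."""
    return tactic_sources & ~(1 << source_index)


def disable_cublaslt(config):
    """Disable cuBLASLt tactics on a tensorrt.IBuilderConfig.

    Hypothetical helper; assumes `config` came from builder.create_builder_config().
    """
    # Imported lazily so clear_tactic_bit stays usable without TensorRT installed.
    import tensorrt as trt

    mask = config.get_tactic_sources()
    mask = clear_tactic_bit(mask, int(trt.TacticSource.CUBLAS_LT))
    config.set_tactic_sources(mask)
```

Call `disable_cublaslt(config)` before building the engine. The other tactic sources stay enabled because only one bit of the mask is cleared.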
