## Frequently Asked Questions

### TensorRT

- "WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected."

  FP16 mode requires a device with full-rate FP16 support. On hardware without it, TensorRT prints this warning and performance will be worse.

- "error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]"

  When building an `ICudaEngine` from an `INetworkDefinition` that has dynamically resizable inputs, users need to specify at least one optimization profile. The profile can be set in the deploy config:

  ```python
  backend_config = dict(
    common_config=dict(max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 320, 320],
                    opt_shape=[1, 3, 800, 1344],
                    max_shape=[1, 3, 1344, 1344])))
    ])
  ```

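  The input tensor shape must stay between `min_shape` and `max_shape` in every dimension. The error above is raised when a dimension of the input falls below the corresponding `min_shape` value.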
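
  To catch an out-of-range input before it reaches TensorRT, a pre-check like the following minimal sketch can help. The helper name `shape_in_profile` is hypothetical and is not part of MMDeploy or TensorRT; the profile values are copied from the config above.

  ```python
  from typing import Sequence


  def shape_in_profile(shape: Sequence[int],
                       min_shape: Sequence[int],
                       max_shape: Sequence[int]) -> bool:
      """Return True if every dimension of `shape` lies in [min, max]."""
      # A rank mismatch can never satisfy the profile.
      if not (len(shape) == len(min_shape) == len(max_shape)):
          return False
      return all(lo <= dim <= hi
                 for dim, lo, hi in zip(shape, min_shape, max_shape))


  # Hypothetical helper; the bounds mirror the deploy config above.
  min_shape = [1, 3, 320, 320]
  max_shape = [1, 3, 1344, 1344]

  print(shape_in_profile([1, 3, 800, 1344], min_shape, max_shape))  # True
  print(shape_in_profile([1, 3, 224, 224], min_shape, max_shape))   # False
  ```

  The second call fails because 224 is below the minimum height and width of 320, which is the situation the error message describes.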

- "error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS"

  TensorRT 7.2.1 switched to cuBLASLt (previously it used cuBLAS). cuBLASLt is the default choice for SM version >= 7.0. You may need CUDA 10.2 Patch 1 (released Aug 26, 2020) to resolve some cuBLASLt issues. If you don't want to upgrade, another option is to use the new TacticSource API to disable cuBLASLt tactics.
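
For the cuBLASLt assertion above, the TacticSource option could look roughly like the sketch below. It assumes TensorRT 7.2 or later and that you have access to the `tensorrt.IBuilderConfig` used to build the engine; where that object comes from depends on your build script and is not covered here. The function names `without_tactic_source` and `disable_cublas_lt` are hypothetical helpers, not MMDeploy or TensorRT APIs.

```python
def without_tactic_source(mask: int, source_index: int) -> int:
    """Clear the bit of one tactic source in a tactic-source bitmask."""
    return mask & ~(1 << source_index)


def disable_cublas_lt(config):
    """Turn off cuBLASLt tactics on a tensorrt.IBuilderConfig.

    Assumption: `config` was created with builder.create_builder_config().
    """
    # Imported lazily so the bitmask helper above works without TensorRT.
    import tensorrt as trt

    mask = config.get_tactic_sources()
    config.set_tactic_sources(
        without_tactic_source(mask, int(trt.TacticSource.CUBLAS_LT)))
    return config


# The bitmask helper on its own: clearing bit 1 of 0b111 leaves 0b101.
print(bin(without_tactic_source(0b111, 1)))  # 0b101
```

Call `disable_cublas_lt(config)` before the engine is built; the remaining tactic sources stay enabled.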