Apple MPS -> CPU NMS fallback strategy (#9600)

Until more ops are fully supported on MPS, this update allows seamless MPS inference by moving predictions to the CPU before NMS. The MPS-to-CPU transfer adds overhead, so NMS times are slower.

Partially resolves https://github.com/ultralytics/yolov5/issues/9596

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2025-06-03 14:49:29 +08:00 · 2022-09-26 14:13:03 +02:00
commit c4c0ee8fc3
parent bd9c0c42ae
1 changed file with 2 additions and 0 deletions
--- a/utils/general.py
+++ b/utils/general.py
@@ -843,6 +843,8 @@ def non_max_suppression(
    if isinstance(prediction, (list, tuple)):  # YOLOv5 model in validation model, output = (inference_out, loss_out)
        prediction = prediction[0]  # select only inference output

+    if 'mps' in prediction.device.type:  # MPS not fully supported yet, convert tensors to CPU before NMS
+        prediction = prediction.cpu()
    bs = prediction.shape[0]  # batch size
    nc = prediction.shape[2] - nm - 5  # number of classes
    xc = prediction[..., 4] > conf_thres  # candidates
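
The two added lines can be tried in isolation, without Apple hardware or a PyTorch install, using a minimal sketch of the same device check. `FakeDevice`, `FakeTensor`, and `move_off_mps` are hypothetical names introduced only for this illustration and are not part of the repository. The stand-in classes mimic just the `.device.type` attribute and `.cpu()` method that the patch relies on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeDevice:
    """Hypothetical stand-in for torch.device; only exposes .type."""
    type: str


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor with just .device and .cpu()."""
    device: FakeDevice

    def cpu(self):
        # Mimics the effect of torch.Tensor.cpu(): the result lives on the CPU.
        return FakeTensor(FakeDevice("cpu"))


def move_off_mps(prediction):
    """Return `prediction` on the CPU if it is on an MPS device, else unchanged."""
    if "mps" in prediction.device.type:  # same check as the patched line
        return prediction.cpu()
    return prediction


on_mps = move_off_mps(FakeTensor(FakeDevice("mps")))
on_cuda = move_off_mps(FakeTensor(FakeDevice("cuda")))
print(on_mps.device.type, on_cuda.device.type)  # prints "cpu cuda"
```

Only tensors on an MPS device are moved. Predictions already on CPU or CUDA pass through untouched, so the transfer cost mentioned in the commit message applies only to MPS runs.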