Apple MPS -> CPU NMS fallback strategy (#9600)

Until more ops are fully supported this update will allow for seamless MPS inference (but slower MPS to CPU transfer before NMS, so slower NMS times).

Partially resolves https://github.com/ultralytics/yolov5/issues/9596

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
pull/9610/head
Glenn Jocher 2022-09-26 14:13:03 +02:00 committed by GitHub
parent bd9c0c42ae
commit c4c0ee8fc3
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 2 additions and 0 deletions

View File

@ -843,6 +843,8 @@ def non_max_suppression(
if isinstance(prediction, (list, tuple)): # YOLOv5 model in validation model, output = (inference_out, loss_out)
prediction = prediction[0] # select only inference output
if 'mps' in prediction.device.type: # MPS not fully supported yet, convert tensors to CPU before NMS
prediction = prediction.cpu()
bs = prediction.shape[0] # batch size
nc = prediction.shape[2] - nm - 5 # number of classes
xc = prediction[..., 4] > conf_thres # candidates