# YOLO11 with SAHI for Video Inference

Slicing Aided Hyper Inference (SAHI) is a powerful technique designed to optimize object detection algorithms, particularly for large-scale and high-resolution imagery. It works by partitioning images or video frames into manageable slices, performing detection on each slice using models like Ultralytics YOLO11, and then intelligently merging the results. This approach significantly improves detection accuracy for small objects and maintains performance on high-resolution inputs.

This tutorial guides you through running Ultralytics YOLO11 inference on video files using the SAHI library for enhanced detection capabilities. For a detailed guide on using SAHI with Ultralytics models, see the SAHI Tiled Inference guide.
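
To make the slicing step concrete, here is a minimal sketch using the `sahi` slicing API directly (the image file name and slice parameters are illustrative, not taken from this example's script):

```python
from sahi.slicing import slice_image

# Partition a large image into overlapping 512x512 tiles
# ("large_image.jpg" is a hypothetical high-resolution input)
result = slice_image(
    image="large_image.jpg",
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,  # 20% overlap so objects on tile borders
    overlap_width_ratio=0.2,   # still appear whole in at least one tile
)
print(f"Produced {len(result.images)} tiles")
```

Detection then runs on each tile, and SAHI merges the per-tile predictions back into full-image coordinates.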

## 📋 Table of Contents

- ⚙️ Step 1: Install Required Libraries
- 🚀 Step 2: Run Inference with SAHI using Ultralytics YOLO11
- 🛠️ Usage Options
- Contribute

## ⚙️ Step 1: Install Required Libraries

First, clone the Ultralytics repository to access the example script. Then, install the necessary Python packages, including sahi and ultralytics, using pip. Finally, navigate into the example directory.

```bash
# Clone the ultralytics repository
git clone https://github.com/ultralytics/ultralytics

# Install dependencies
# Ensure you have Python 3.8 or later installed
pip install -U sahi ultralytics opencv-python

# Change directory to the example folder
cd ultralytics/examples/YOLOv8-SAHI-Inference-Video
```
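
To confirm that both packages imported correctly, you can run an optional sanity check:

```bash
# Optional: print the installed package versions
python -c "import sahi, ultralytics; print(sahi.__version__, ultralytics.__version__)"
```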

## 🚀 Step 2: Run Inference with SAHI using Ultralytics YOLO11

Once the setup is complete, you can run inference on your video file. The provided script `yolov8_sahi.py` leverages SAHI for tiled inference with a specified YOLO11 model.

Execute the script using the command line, specifying the path to your video file. You can also choose different YOLO11 model weights.

```bash
# Run inference and save the output video with bounding boxes
python yolov8_sahi.py --source "path/to/your/video.mp4" --save-img

# Run inference using a specific YOLO11 model (e.g., yolo11n.pt) and save results
python yolov8_sahi.py --source "path/to/your/video.mp4" --save-img --weights "yolo11n.pt"

# Run inference with smaller slices for potentially better small object detection
python yolov8_sahi.py --source "path/to/your/video.mp4" --save-img --slice-height 512 --slice-width 512
```

This script processes the video frame by frame, applying SAHI's slicing and inference logic before stitching the detections back onto the original frame dimensions. The annotated output (frames with detections drawn) is saved to the `runs/detect/predict` directory. Learn more about prediction with Ultralytics models in the Predict mode documentation.
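
For reference, the core of such a frame-by-frame loop could look like the following minimal sketch. It uses the public `sahi` API rather than the exact code in `yolov8_sahi.py`; the model path, video path, and slice sizes are illustrative, and older `sahi` releases use `model_type="yolov8"` instead of `"ultralytics"`:

```python
import cv2
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Load a YOLO11 checkpoint through SAHI's detection-model wrapper
model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics", model_path="yolo11n.pt", confidence_threshold=0.3
)

cap = cv2.VideoCapture("path/to/your/video.mp4")
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Slice the frame, run detection on each tile, and merge the results
    result = get_sliced_prediction(
        cv2.cvtColor(frame, cv2.COLOR_BGR2RGB),
        model,
        slice_height=512,
        slice_width=512,
        overlap_height_ratio=0.2,
        overlap_width_ratio=0.2,
    )
    # Draw the merged detections back onto the original frame
    for pred in result.object_prediction_list:
        x1, y1, x2, y2 = map(int, pred.bbox.to_xyxy())
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("YOLO11 + SAHI", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```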

## 🛠️ Usage Options

The `yolov8_sahi.py` script accepts several command-line arguments to customize the inference process:

- `--source`: Required. Path to the input video file (e.g., `"../path/to/video.mp4"`).
- `--weights`: Optional. Path to the YOLO11 model weights file (e.g., `"yolo11n.pt"`, `"yolo11s.pt"`). Defaults to `"yolo11n.pt"`. You can download various models or use your custom-trained ones. See Ultralytics YOLO models for more options.
- `--save-img`: Optional. Flag to save the output video with detection results. Saved to `runs/detect/predict`.
- `--slice-height`: Optional. Height of each image slice for SAHI. Defaults to `1024`.
- `--slice-width`: Optional. Width of each image slice for SAHI. Defaults to `1024`.

Experiment with these options, especially slice dimensions, to optimize detection performance for your specific video processing task and target object sizes. Using appropriate datasets for training can also significantly impact performance.
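
As a rough illustration of the cost side of that tradeoff: the number of tiles (and therefore model calls) per frame grows quickly as slices shrink. A back-of-the-envelope estimate, assuming SAHI's default 20% overlap (this is an approximation, not SAHI's exact internal count):

```python
import math

def approx_tile_count(frame_w: int, frame_h: int, slice_size: int, overlap: float = 0.2) -> int:
    """Rough estimate of tiles per frame: tiles advance by slice_size * (1 - overlap)."""
    step = int(slice_size * (1 - overlap))
    return math.ceil(frame_w / step) * math.ceil(frame_h / step)

# For a 1920x1080 frame, smaller slices mean many more model calls per frame
for size in (1024, 512, 256):
    print(f"{size}px slices -> ~{approx_tile_count(1920, 1080, size)} tiles")
```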

## Contribute

Contributions to enhance this example or add new features are welcome! If you encounter issues or have suggestions, please open an issue or submit a pull request in the Ultralytics GitHub repository. Check out our contribution guide for more details.