One-click Deployment of YOLOv11
YOLO11: The latest version of the YOLO series, a versatile choice for computer vision tasks

1. Tutorial Introduction
YOLOv11 is the latest iteration of the real-time object detector developed by the Ultralytics team, announced at the YOLO Vision 2024 (YV24) event on September 30, 2024. Compared with previous YOLO versions, it brings significant improvements in accuracy, speed, and efficiency through changes to both the architecture and the training methods, and it is intended to simplify the development process and serve as a cornerstone for subsequent integrations. YOLOv11 sets a new benchmark in speed and accuracy, and its innovative model architecture puts complex object detection tasks within easy reach.
In addition, installing YOLOv11 is relatively simple: developers can download the latest source code from its GitHub page and follow the guide to run command-line tests for model prediction. In this tutorial, the model and its environment are already installed, so you can simply clone the tutorial, open the API address, and run model inference directly for image detection, segmentation, pose estimation, tracking, and classification.
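For readers working outside the preconfigured container, a minimal local sketch might look like the following. It assumes the standard Ultralytics Python package; "yolo11n.pt" is the official nano detection checkpoint and "bus.jpg" is a placeholder image path.

```python
# For a local setup, the Ultralytics package is typically installed with:
#   pip install ultralytics
from ultralytics import YOLO

# "yolo11n.pt" is the official nano detection checkpoint; it is downloaded on first use
model = YOLO("yolo11n.pt")

# Run prediction on a placeholder image and save the annotated result (runs/detect/predict/)
model.predict("bus.jpg", save=True)

# Roughly equivalent command-line test:
#   yolo predict model=yolo11n.pt source=bus.jpg
```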
The main improvements of YOLOv11 include:
- Enhanced feature extraction: Improved backbone and neck architecture for more accurate object detection.
- Optimized processing speed: New architecture design and training methods enable faster processing speed.
- Higher accuracy with fewer parameters: On the COCO dataset, YOLOv11m achieves higher average precision (mAP) than YOLOv8m while using fewer parameters.
- Strong environmental adaptability: YOLOv11 can be deployed in a variety of environments, including edge devices, cloud platforms, and systems that support NVIDIA GPUs.
- Support for a wide range of tasks: YOLOv11 supports a variety of computer vision tasks such as object detection, instance segmentation, image classification, pose estimation, and oriented object detection (OBB).
YOLO Development History
YOLO (You Only Look Once) is a popular object detection and image segmentation model developed by Joseph Redmon and Ali Farhadi at the University of Washington. YOLO was launched in 2015 and quickly gained popularity for its high speed and accuracy.
- YOLOv2, released in 2016, improved upon the original model by incorporating batch normalization, anchor boxes, and dimension clustering.
- YOLOv3, launched in 2018, further enhanced the performance of the model using a more efficient backbone network, multi-anchor, and spatial pyramid pooling.
- YOLOv4 was released in 2020, introducing innovations such as Mosaic data augmentation, a new anchor-free detection head, and a new loss function.
- YOLOv5 further improved the model's performance and added new features such as hyperparameter optimization, integrated experiment tracking, and automatic export to common export formats.
- YOLOv6 was open-sourced by Meituan in 2022 and is currently used in many of the company's autonomous delivery robots.
- YOLOv7 added support for additional tasks such as pose estimation on the COCO keypoint dataset.
- YOLOv8 was released by Ultralytics in 2023. YOLOv8 introduced new features and improvements to enhance performance, flexibility, and efficiency, supporting a full range of visual AI tasks.
- YOLOv9 introduced innovative methods such as Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN).
- YOLOv10 was created by researchers at Tsinghua University using the Ultralytics Python package. This version achieved advancements in real-time object detection by introducing an end-to-end head that removes the need for non-maximum suppression (NMS).
- YOLOv11 🚀 NEW: Ultralytics' latest YOLO model delivers state-of-the-art (SOTA) performance across multiple tasks, including detection, segmentation, pose estimation, tracking, and classification, and can be leveraged in a wide range of AI applications and domains.
2. Operation Steps
After starting the container, click the API address to enter the Web interface

This tutorial contains 5 functions:
- Object Detection
- Instance Segmentation
- Image Classification
- Pose Estimation
- Oriented Object Detection (OBB)
1. Object Detection
The output of an object detector is a set of bounding boxes that enclose the objects in the image, along with a class label and a confidence score for each box. Object detection is a good choice if you need to identify objects of interest in a scene but don't need to know their exact location or shape.


Figure 1: Object detection
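As a brief sketch of reading these outputs with the Ultralytics Python API ("yolo11n.pt" is the official detection checkpoint; the image path is a placeholder):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")           # pretrained YOLOv11 nano detection model
results = model("street.jpg")        # placeholder image path

# Each detection carries box corners, a class label, and a confidence score
for box in results[0].boxes:
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    print(model.names[int(box.cls)], round(float(box.conf), 2), (x1, y1, x2, y2))
```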
2. Instance Segmentation
The output of an instance segmentation model is a set of masks or outlines that delineate each object in the image, along with a class label and confidence score for each object. Instance segmentation is very useful when you need to know not only where objects are in the image, but also their specific shapes.


Figure 2: Instance segmentation
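A corresponding sketch with the Ultralytics Python API; "yolo11n-seg.pt" is the official nano segmentation checkpoint and the image path is a placeholder:

```python
from ultralytics import YOLO

model = YOLO("yolo11n-seg.pt")       # pretrained YOLOv11 nano segmentation model
results = model("street.jpg")        # placeholder image path

masks = results[0].masks             # None if no objects were detected
if masks is not None:
    print(masks.data.shape)          # (num_objects, H, W) binary masks
    print(len(masks.xy))             # one polygon outline per object, in pixel coordinates
```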
3. Image Classification
The output of an image classifier is a single class label and a confidence score. Image classification is useful when you only need to know which class an image belongs to, without knowing the location or exact shape of the objects in that class.


Figure 3: Image classification
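A minimal sketch, assuming the official "yolo11n-cls.pt" classification checkpoint (the image path is a placeholder):

```python
from ultralytics import YOLO

model = YOLO("yolo11n-cls.pt")       # pretrained YOLOv11 nano classification model
results = model("cat.jpg")           # placeholder image path

probs = results[0].probs
print(model.names[probs.top1], float(probs.top1conf))  # predicted class and its confidence
```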
4. Pose Estimation

Pose estimation is a task that involves identifying the locations of specific points (often called keypoints) in an image. Keypoints can represent parts of an object, such as joints, landmarks, or other salient features. The locations of keypoints are usually represented by a set of 2D [x, y] or 3D [x, y, visible] coordinates.
The output of a pose estimation model is a set of points representing the keypoints of objects in the image, typically with a confidence score for each point. Pose estimation is a good choice when you need to identify specific parts of objects in a scene and their positions relative to each other.


Figure 4: Pose estimation
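A short sketch using the official "yolo11n-pose.pt" checkpoint (the image path is a placeholder):

```python
from ultralytics import YOLO

model = YOLO("yolo11n-pose.pt")      # pretrained YOLOv11 nano pose model
results = model("people.jpg")        # placeholder image path

kpts = results[0].keypoints
print(kpts.xy.shape)                 # (num_people, num_keypoints, 2) pixel coordinates
print(kpts.conf)                     # per-keypoint confidence scores
```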
5. Oriented Object Detection
Oriented object detection goes a step further than object detection by introducing an additional angle to more accurately locate objects in an image.
The output of an oriented object detector is a set of rotated bounding boxes that tightly enclose the objects in the image, along with a class label and confidence score for each box. Oriented object detection is a good choice when objects appear at arbitrary angles and an ordinary axis-aligned box would not enclose them tightly.


Figure 5: Oriented object detection
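A final sketch, assuming the official "yolo11n-obb.pt" oriented-detection checkpoint (the image path is a placeholder):

```python
from ultralytics import YOLO

model = YOLO("yolo11n-obb.pt")       # pretrained YOLOv11 nano oriented-detection model
results = model("aerial.jpg")        # placeholder image path

obb = results[0].obb                 # rotated boxes instead of axis-aligned ones
if obb is not None:
    print(obb.xywhr.shape)           # (num_objects, 5): center x, center y, width, height, rotation
    print(obb.cls, obb.conf)         # class indices and confidence scores
```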
Exchange and discussion
🖌️ If you come across a high-quality project, feel free to leave us a message to recommend it! We have also set up a tutorial exchange group; you are welcome to scan the QR code below and note [SD Tutorial] to join the group, discuss technical issues, and share results ↓
