HyperAI

One-click Deployment of YOLOv12

🚀 YOLOv12: A versatile choice for computer vision tasks, with speed and accuracy both at their peak! 🚀


This tutorial uses an RTX 4090 as the computing resource.

1. Tutorial Introduction 📖

YOLOv12 was released in 2025 by researchers from the University at Buffalo and the University of Chinese Academy of Sciences in the paper "YOLOv12: Attention-Centric Real-Time Object Detectors".

For a long time, improving the network architecture of the YOLO framework has been a core topic in computer vision. Although attention mechanisms offer superior modeling capability, CNN-based designs have remained mainstream because attention-based models could not match them in speed. The launch of YOLOv12 changes this: it is comparable to CNN-based frameworks in speed while fully exploiting the performance advantages of the attention mechanism, making it a new benchmark for real-time object detection.

YOLOv12’s Breakthrough Performance

  • YOLOv12-N achieves 40.6% mAP with an inference latency of 1.64 ms on a T4 GPU, outperforming YOLOv10-N / YOLOv11-N by 2.1% / 1.2% mAP.
  • YOLOv12-S beats RT-DETR-R18 / RT-DETRv2-R18 while running 42% faster, using only 36% of the computation and 45% of the parameters.

📜 YOLO development history and related tutorials

YOLO (You Only Look Once) has been a leader in object detection and image segmentation since its debut in 2015. The evolution of the YOLO series is as follows:

  • YOLOv2 (2016): Introduced batch normalization, anchor boxes, and dimension clustering.
  • YOLOv3 (2018): Used a more efficient backbone network, multiple anchors, and spatial pyramid pooling.
  • YOLOv4 (2020): Introduced Mosaic data augmentation, an anchor-free detection head, and a new loss function. → Tutorial: DeepSOCIAL crowd distance monitoring based on YOLOv4 and SORT multi-target tracking
  • YOLOv5: Added hyperparameter optimization, experiment tracking, and automatic export capabilities. → Tutorial: YOLOv5_deepsort real-time multi-target tracking model
  • YOLOv6 (2022): Open-sourced by Meituan; widely used in its autonomous delivery robots.
  • YOLOv7: Added pose estimation support for the COCO keypoint dataset.
  • YOLOv8 (2023): Released by Ultralytics, supporting the full range of vision AI tasks.
  • YOLOv9: Introduced Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN).
  • YOLOv10: From Tsinghua University; introduced an end-to-end head and eliminated the need for non-maximum suppression (NMS). → Tutorial: YOLOv10 real-time end-to-end object detection
  • YOLOv11: Ultralytics' latest model, supporting detection, segmentation, pose estimation, tracking, and classification. → Tutorial: One-click deployment of YOLOv11
  • YOLOv12 🚀 NEW: Peak speed and accuracy, combined with the performance advantages of the attention mechanism!

2. Operation Steps 🛠️

1. After starting the container, click the API address to enter the Web interface

  The output of an object detector is a set of bounding boxes that enclose the objects in the image, along with a class label and a confidence score for each bounding box. Object detection is a good choice if you need to identify objects of interest in a scene but don’t need to know their exact location or shape.
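If you want to inspect these outputs programmatically rather than through the Web interface, the sketch below shows one way to do it with the Ultralytics Python API, which has integrated YOLO12. This is a minimal sketch, not the tutorial's own code; the weights name yolo12n.pt and the image path example.jpg are illustrative assumptions:

```python
# Minimal sketch: run YOLOv12 inference and read the detections.
# Assumes the Ultralytics package (pip install ultralytics); the weights
# name "yolo12n.pt" and the image path are illustrative assumptions.
from ultralytics import YOLO

model = YOLO("yolo12n.pt")      # nano variant; other sizes exist (s/m/l/x)
results = model("example.jpg")  # returns one Results object per input image

for r in results:
    for box in r.boxes:                        # one entry per detected object
        label = model.names[int(box.cls[0])]   # class label
        conf = float(box.conf[0])              # confidence score in [0, 1]
        x1, y1, x2, y2 = box.xyxy[0].tolist()  # bounding box corners (pixels)
        print(f"{label}: {conf:.2f} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```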

The Web interface provides the following two functions:

  • Image detection
  • Video detection

2. Image detection

The input is an image and the output is an image with labels.

Figure 1 Image detection
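For reference, image detection reduces to a single predict call plus drawing the annotations back onto the input image. A minimal sketch under the same assumptions as above (Ultralytics API, illustrative file names):

```python
import cv2
from ultralytics import YOLO

model = YOLO("yolo12n.pt")      # assumed weights name, as above
result = model("input.jpg")[0]  # one input image -> one Results object
annotated = result.plot()       # BGR numpy array with boxes and labels drawn
cv2.imwrite("output.jpg", annotated)
```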

3. Video detection

The input is a video and the output is a video with labels.

Figure 2 Video detection
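Video detection is the same idea applied frame by frame: decode a frame, run the detector, draw the annotations, and re-encode. A sketch using OpenCV, under the same assumptions (Ultralytics API, illustrative paths):

```python
import cv2
from ultralytics import YOLO

model = YOLO("yolo12n.pt")  # assumed weights name, as above
cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
writer = cv2.VideoWriter("output.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

while True:
    ok, frame = cap.read()
    if not ok:                               # end of stream
        break
    result = model(frame, verbose=False)[0]  # detect on this frame
    writer.write(result.plot())              # write the annotated frame

cap.release()
writer.release()
```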

🤝 Discussion

🖌️ If you come across a high-quality project, please leave us a message to recommend it! We have also set up a tutorial exchange group; you are welcome to scan the QR code and note [SD Tutorial] to join the group, discuss technical issues, and share results↓


YOLOv12 is not only a technological leap, but also a revolution in the field of computer vision! Come and experience it! 🚀