HyperAI

Online Tutorial | Important Innovation in the YOLO Series! Tsinghua Team Releases YOLOE, Which Directly Detects and Segments Objects in Open Scenes in Real Time


YOLO (You Only Look Once) has been one of the most influential real-time object detection models in computer vision since its first release in 2015. Built on a one-stage, end-to-end detection architecture, it has gone through more than ten versions in 10 years. Because it processes images in real time with high accuracy and high frame rates, it is widely used in fields such as autonomous driving, medical image analysis, and robotic vision.

However, although traditional YOLO models use convolutional neural networks to achieve high-performance real-time detection, they rely on predefined object categories and lack flexibility in open, real-world scenarios.

To address this problem, a Tsinghua University team has proposed YOLOE, an open object detection and segmentation model built on YOLO that supports three modes: text prompts, visual prompts, and prompt-free operation. This multimodal capability lets it follow language instructions, take visual references from images, and even discover new objects on its own, truly achieving "seeing everything in real time."

The Tutorials section of HyperAI's official website now offers a one-click deployment tutorial, "YOLOE: See Everything in Real Time". If you are interested, come and try it!

Tutorial Link:

https://go.hyper.ai/U2PXt

For the complete YOLO series tutorial, see: Online Tutorials | The YOLO series has been updated with 11 versions in 10 years, and the latest model reaches SOTA on multiple object detection tasks

Demo Run

1. Log in to hyper.ai. On the Tutorials page, select "YOLOE: See Everything in Real Time" and click "Run this Tutorial Online".

2. After the page redirects, click "Clone" in the upper right corner to clone the tutorial into your own container.

3. Select "NVIDIA RTX 4090" as the compute resource and "PyTorch" as the image. The OpenBayes platform has introduced new billing options: choose "pay as you go" or a daily/weekly/monthly plan according to your needs, then click "Continue". New users who register through the invitation link below receive 4 hours of RTX 4090 time and 5 hours of CPU time for free!

HyperAI exclusive invitation link (copy and open in browser):

https://go.openbayes.com/9S6Dr

4. Wait for resources to be allocated. The first clone takes about 2 minutes. When the status changes to "Running", click the arrow next to "API Address" to open the demo page. Because the model is large, the WebUI takes about 3 minutes to appear; until then, the page shows "Bad Gateway". Note that users must complete real-name authentication before they can use the API address access function.
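If you would rather not refresh the page by hand while the model loads, the wait in step 4 can be scripted. The following is only a minimal sketch using the Python standard library: it assumes the API address returns a non-200 status (such as 502 Bad Gateway) until the WebUI is up and answers 200 afterwards. The function names, the polling interval, and the 5-minute limit are illustrative choices, not part of the tutorial.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for `url`, or 0 if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses such as 502 arrive here while the model is loading.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return 0


def wait_until_ready(url, fetch=fetch_status, interval=10.0,
                     max_wait=300.0, sleep=time.sleep):
    """Poll `url` until it answers 200.

    Returns True once the page is ready, or False if it is still not
    ready after roughly `max_wait` seconds of waiting.
    """
    waited = 0.0
    while True:
        if fetch(url) == 200:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval
```

Calling `wait_until_ready(api_address)` with the address copied from the "API Address" field would then block until the demo page responds or the time limit passes. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a real network connection.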

Results

The first mode is text prompt detection: YOLOE supports detection and segmentation for any text category. In the figure below, the text input is "tiger, bus, person". The detection result, shown on the right, clearly identifies the tiger, the sightseeing bus, and the tourists in the picture. Even tourists whose heads are occluded or who are standing in the dark are detected.
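For readers who want to try text prompts outside the WebUI, the sketch below shows one way a comma-separated prompt such as "tiger, bus, person" could be turned into a class list and passed to a model. It assumes a model object that exposes `set_classes`, `get_text_pe`, and `predict`, as the Ultralytics `YOLOE` wrapper does; the helper names are hypothetical, and the demo's own code may work differently.

```python
def parse_text_prompt(prompt: str) -> list[str]:
    """Split a comma-separated prompt into unique, trimmed class names."""
    names = []
    for part in prompt.split(","):
        name = part.strip()
        # Skip empty fragments and repeated names, keeping first-seen order.
        if name and name not in names:
            names.append(name)
    return names


def detect_with_text_prompt(model, image, prompt: str):
    """Run detection for the classes named in `prompt`.

    `model` is assumed to provide set_classes, get_text_pe and predict
    (the interface of the Ultralytics YOLOE wrapper); this is an
    illustrative helper, not the tutorial's implementation.
    """
    names = parse_text_prompt(prompt)
    if not names:
        raise ValueError("prompt contains no class names")
    # Register the class names together with their text embeddings.
    model.set_classes(names, model.get_text_pe(names))
    return model.predict(image)
```

With the `ultralytics` package installed, a call might look like `detect_with_text_prompt(YOLOE("yoloe-11l-seg.pt"), "zoo.jpg", "tiger, bus, person")`, where both the weight file name and the image path are placeholders.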

The second mode is visual prompts. After the target is specified with boxes, points, hand-drawn shapes, or reference images, similar objects can be accurately identified, as shown in the following figure:

Finally, there is fully automatic prompt-free detection, which identifies the objects in a scene without any prompt, as shown in the following figure:

The above is the tutorial recommended by HyperAI this time. Come and try it out for yourself!

Tutorial Link:

https://go.hyper.ai/U2PXt