HyperAI
Object Detection on COCO minival
Metrics
AP50
AP75
APL
APM
APS
box AP
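For readers unfamiliar with the metric names, the following is a minimal sketch of the standard COCO conventions behind them, not the leaderboard's own evaluation code. The helper names (`box_iou`, `size_bucket`, `is_match`) are hypothetical and exist only for illustration; the real evaluation also ranks detections by score and integrates a precision-recall curve, which is omitted here.

```python
# Illustrative sketch of COCO metric conventions (helper names are hypothetical).

def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# box AP averages AP over ten IoU thresholds: 0.50, 0.55, ..., 0.95.
# AP50 and AP75 are AP at the single thresholds 0.50 and 0.75.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]

def size_bucket(area):
    """Object-size bucket behind APS / APM / APL (area in square pixels).

    Simplified: boundary handling here is an assumption of this sketch.
    """
    if area < 32 ** 2:
        return "small"   # counted in APS
    if area < 96 ** 2:
        return "medium"  # counted in APM
    return "large"       # counted in APL

def is_match(pred, gt, iou_threshold):
    """A prediction can match a ground-truth box if IoU reaches the threshold."""
    return box_iou(pred, gt) >= iou_threshold

pred = (10, 10, 110, 110)
gt = (20, 20, 120, 120)
iou = box_iou(pred, gt)  # 8100 / 11900, roughly 0.68
```

With these two boxes the prediction counts as a match at the AP50 threshold but not at the AP75 threshold, which is why AP75 is always the stricter of the two numbers in the table.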
Results
Performance results of different models on this benchmark
| Model Name | AP50 | AP75 | APL | APM | APS | box AP | Paper Title |
|---|---|---|---|---|---|---|---|
| Co-DETR | - | - | - | - | - | 65.9 | DETRs with Collaborative Hybrid Assignments Training |
| M3I Pre-training (InternImage-H) | - | - | - | - | - | 65.0 | Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information |
| InternImage-H | - | - | - | - | - | 65.0 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions |
| Co-DETR (Swin-L) | - | - | - | - | - | 64.7 | DETRs with Collaborative Hybrid Assignments Training |
| Focal-Stable-DINO (Focal-Huge, no TTA) | 81.5 | 71.4 | 78.5 | 68.5 | 50.4 | 64.6 | A Strong and Reproducible Object Detector with Only Public Datasets |
| EVA | 82.1 | 70.8 | 78.5 | 68.4 | 49.4 | 64.5 | EVA: Exploring the Limits of Masked Visual Representation Learning at Scale |
| ViT-CoMer | - | - | - | - | - | 64.3 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions |
| FocalNet-H (DINO) | - | - | - | - | - | 64.2 | Focal Modulation Networks |
| InternImage-XL | - | - | - | - | - | 64.2 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions |
| CP-DETR-L Swin-L (fine-tuned separately on COCO) | - | - | - | - | - | 64.1 | CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection |
| RevCol-H (DINO) | - | - | - | - | - | 63.8 | Reversible Column Networks |
| DINO (Swin-L) | - | - | - | - | - | 63.2 | DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection |
| Grounding DINO | - | - | - | - | - | 63.0 | Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection |
| SwinV2-G (HTC++) | - | - | - | - | - | 62.5 | Swin Transformer V2: Scaling Up Capacity and Resolution |
| GLEE-Pro | - | - | - | - | - | 62.0 | General Object Foundation Model for Images and Videos at Scale |
| Florence-CoSwin-H | - | - | - | - | - | 62 | Florence: A New Foundation Model for Computer Vision |
| ViTDet, ViT-H Cascade (multiscale) | - | - | - | - | - | 61.3 | Exploring Plain Vision Transformer Backbones for Object Detection |
| GLIP (Swin-L, multi-scale) | - | - | - | - | - | 60.8 | Grounded Language-Image Pre-training |
| Soft Teacher + Swin-L (HTC++, multi-scale) | - | - | - | - | - | 60.7 | End-to-End Semi-Supervised Object Detection with Soft Teacher |
| UNINEXT-H | 77.5 | 66.7 | 75.3 | 64.8 | 45.1 | 60.6 | Universal Instance Perception as Object Discovery and Retrieval |
Showing the top 20 of 219 rows, sorted by box AP.