Object Detection On Coco Minival
Evaluation metrics
AP50
AP75
APL
APM
APS
box AP
Evaluation results
Performance of each model on this benchmark
| Model | AP50 | AP75 | APL | APM | APS | box AP | Paper Title |
|---|---|---|---|---|---|---|---|
| Co-DETR | - | - | - | - | - | 65.9 | DETRs with Collaborative Hybrid Assignments Training |
| M3I Pre-training (InternImage-H) | - | - | - | - | - | 65.0 | Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information |
| InternImage-H | - | - | - | - | - | 65.0 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions |
| Co-DETR (Swin-L) | - | - | - | - | - | 64.7 | DETRs with Collaborative Hybrid Assignments Training |
| Focal-Stable-DINO (Focal-Huge, no TTA) | 81.5 | 71.4 | 78.5 | 68.5 | 50.4 | 64.6 | A Strong and Reproducible Object Detector with Only Public Datasets |
| EVA | 82.1 | 70.8 | 78.5 | 68.4 | 49.4 | 64.5 | EVA: Exploring the Limits of Masked Visual Representation Learning at Scale |
| ViT-CoMer | - | - | - | - | - | 64.3 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions |
| FocalNet-H (DINO) | - | - | - | - | - | 64.2 | Focal Modulation Networks |
| InternImage-XL | - | - | - | - | - | 64.2 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions |
| CP-DETR-L Swin-L(Fine tuning separately in COCO) | - | - | - | - | - | 64.1 | CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection |
| RevCol-H(DINO) | - | - | - | - | - | 63.8 | Reversible Column Networks |
| DINO (Swin-L) | - | - | - | - | - | 63.2 | DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection |
| Grounding DINO | - | - | - | - | - | 63.0 | Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection |
| SwinV2-G (HTC++) | - | - | - | - | - | 62.5 | Swin Transformer V2: Scaling Up Capacity and Resolution |
| GLEE-Pro | - | - | - | - | - | 62.0 | General Object Foundation Model for Images and Videos at Scale |
| Florence-CoSwin-H | - | - | - | - | - | 62 | Florence: A New Foundation Model for Computer Vision |
| ViTDet, ViT-H Cascade (multiscale) | - | - | - | - | - | 61.3 | Exploring Plain Vision Transformer Backbones for Object Detection |
| GLIP (Swin-L, multi-scale) | - | - | - | - | - | 60.8 | Grounded Language-Image Pre-training |
| Soft Teacher + Swin-L (HTC++, multi-scale) | - | - | - | - | - | 60.7 | End-to-End Semi-Supervised Object Detection with Soft Teacher |
| UNINEXT-H | 77.5 | 66.7 | 75.3 | 64.8 | 45.1 | 60.6 | Universal Instance Perception as Object Discovery and Retrieval |