Open Vocabulary Object Detection on LVIS v1.0
Metrics
AP novel-LVIS base training
Results
Performance results of various models on this benchmark
| Model Name | AP novel-LVIS base training | Paper Title |
| --- | --- | --- |
| LaMI-DETR | 43.4 | LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction |
| DITO | 40.4 | Region-centric Image-Language Pretraining for Open-Vocabulary Detection |
| OV-DQUO(ViT-L/14) | 39.3 | OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision |
| CoDet (EVA02-L) | 37.0 | CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection |
| CLIPSelf | 34.9 | CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction |
| OVMR | 34.4 | OVMR: Open-Vocabulary Recognition with Multi-Modal References |
| DE-ViT | 34.3 | Detect Everything with Few Examples |
| CFM-ViT | 33.9 | Contrastive Feature Masking Open-Vocabulary Vision Transformer |
| CLIM (RN50x64) | 32.3 | CLIM: Contrastive Language-Image Mosaic for Region Representation |
| RO-ViT | 32.1 | Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers |
| Prova (Swin-Base) | 31.5 | Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection |
| RTGen | 30.2 | RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection |
| OV-DQUO(ViT-B/16) | 29.7 | OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision |
| ViLD-ensemble w/ ALIGN (Eb7-FPN) | 26.3 | Open-vocabulary Object Detection via Vision and Language Knowledge Distillation |
| OWL-ViT (CLIP-L/14) | 25.6 | Simple Open-Vocabulary Object Detection with Vision Transformers |
| POMP | 25.2 | Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition |
| BARON | 22.6 | Aligning Bag of Regions for Open-Vocabulary Object Detection |
| MEDet | 22.4 | Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization |
| Region-CLIP (RN50x4-C4) | 22.0 | RegionCLIP: Region-based Language-Image Pretraining |
| RALF | 21.9 | Retrieval-Augmented Open-Vocabulary Object Detection |