Open-Vocabulary Object Detection on LVIS v1.0

Evaluation Metric

AP novel-LVIS base training

Results

Performance of each model on this benchmark

| Model | AP novel-LVIS base training | Paper Title | Repository |
| --- | --- | --- | --- |
| LaMI-DETR | 43.4 | LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | - |
| RO-ViT | 32.1 | Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers | - |
| OADP | 21.7 | Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection | - |
| X-Paste | 21.4 | X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion | - |
| Detic | 17.8 | Detecting Twenty-thousand Classes using Image-level Supervision | - |
| OVMR | 34.4 | OVMR: Open-Vocabulary Recognition with Multi-Modal References | - |
| POMP | 25.2 | Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition | - |
| OWL-ViT (CLIP-L/14) | 25.6 | Simple Open-Vocabulary Object Detection with Vision Transformers | - |
| Region-CLIP (RN50-C4) | 17.1 | RegionCLIP: Region-based Language-Image Pretraining | - |
| Object-Centric-OVD | 21.1 | Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection | - |
| ViLD (R50-FPN) | 16.1 | Open-vocabulary Object Detection via Vision and Language Knowledge Distillation | - |
| CoDet (EVA02-L) | 37.0 | CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection | - |
| CLIPSelf | 34.9 | CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | - |
| DITO | 40.4 | Region-centric Image-Language Pretraining for Open-Vocabulary Detection | - |
| CLIM (RN50x64) | 32.3 | CLIM: Contrastive Language-Image Mosaic for Region Representation | - |
| ViLD-ensemble (R50-FPN) | 16.6 | Open-vocabulary Object Detection via Vision and Language Knowledge Distillation | - |
| OV-DQUO (ViT-L/14) | 39.3 | OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision | - |
| ViLD-ensemble (R152-FPN) | 18.7 | Open-vocabulary Object Detection via Vision and Language Knowledge Distillation | - |
| MEDet | 22.4 | Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization | - |
| OV-DQUO (ViT-B/16) | 29.7 | OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision | - |
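
The entries above are not listed in score order. As a minimal sketch, the listed AP novel scores can be ranked from highest to lowest as follows; the `results` dictionary simply copies the values shown on this page, and `rank` is a hypothetical helper name introduced only for illustration.

```python
# AP novel (LVIS base training) scores copied from the listing above.
results = {
    "LaMI-DETR": 43.4,
    "RO-ViT": 32.1,
    "OADP": 21.7,
    "X-Paste": 21.4,
    "Detic": 17.8,
    "OVMR": 34.4,
    "POMP": 25.2,
    "OWL-ViT (CLIP-L/14)": 25.6,
    "Region-CLIP (RN50-C4)": 17.1,
    "Object-Centric-OVD": 21.1,
    "ViLD (R50-FPN)": 16.1,
    "CoDet (EVA02-L)": 37.0,
    "CLIPSelf": 34.9,
    "DITO": 40.4,
    "CLIM (RN50x64)": 32.3,
    "ViLD-ensemble (R50-FPN)": 16.6,
    "OV-DQUO (ViT-L/14)": 39.3,
    "ViLD-ensemble (R152-FPN)": 18.7,
    "MEDet": 22.4,
    "OV-DQUO (ViT-B/16)": 29.7,
}


def rank(scores):
    """Hypothetical helper: return (model, score) pairs, highest score first."""
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


ranked = rank(results)
for position, (model, score) in enumerate(ranked, start=1):
    print(f"{position:2d}. {model:<26} {score:5.1f}")
```

Only the rows shown on this page are included, so the ranking covers these entries and nothing else.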