HyperAI超神经

Object Detection on COCO 2017

Evaluation Metrics

mAP

Evaluation Results

Performance of each model on this benchmark

| Model Name | mAP | Paper Title | Repository |
| --- | --- | --- | --- |
| UniRepLKNet-S++ | 54.3 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | |
| BiFormer-B (IN1k pretrain, MaskRCNN 12ep) | 48.6 | BiFormer: Vision Transformer with Bi-Level Routing Attention | |
| DyHead (SAP) | - | Stochastic Subsampling With Average Pooling | - |
| Lpixel | - | Paint Transformer: Feed Forward Neural Painting with Stroke Prediction | |
| MaxViT-T | - | MaxViT: Multi-Axis Vision Transformer | |
| YOLO-Drone | 35.45 | YOLO-Drone: Airborne real-time detection of dense small objects from high-altitude perspective | - |
| UniRepLKNet-T | 51.7 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | |
| UniRepLKNet-B++ | 54.8 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | |
| DAT-T++ | - | DAT++: Spatially Dynamic Vision Transformer with Deformable Attention | |
| DeBiFormer-B (IN1k pretrain, MaskRCNN 12ep) | 48.5 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | - |
| MaxViT-S | - | MaxViT: Multi-Axis Vision Transformer | |
| MixMIM-B | 52.2 | MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers | |
| Faster R-CNN (ideal number of groups) | - | On the Ideal Number of Groups for Isometric Gradient Propagation | - |
| BiFormer-S (IN1k pretrain, MaskRCNN 12ep) | 47.8 | BiFormer: Vision Transformer with Bi-Level Routing Attention | |
| UniRepLKNet-S | 53 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | |
| DeBiFormer-S (IN1k pretrain, MaskRCNN 12ep) | 47.5 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | - |
| DAT-S++ | - | DAT++: Spatially Dynamic Vision Transformer with Deformable Attention | |
| retinanet | - | Benchmark for Generic Product Detection: A Low Data Baseline for Dense Object Detection | |
| UniRepLKNet-XL++ | 56.4 | UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | |
| DeBiFormer-B (IN1k pretrain, Retina) | 47.1 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | - |