HyperAI

Object Detection on COCO minival

Metrics

AP50
AP75
APL
APM
APS
box AP
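
For readers unfamiliar with these column names, the sketch below illustrates the standard COCO conventions behind them. AP50 and AP75 count a detection as correct at an IoU of at least 0.50 or 0.75. Box AP averages AP over ten IoU thresholds from 0.50 to 0.95. APS, APM, and APL restrict evaluation to small, medium, and large objects, split at areas of 32² and 96² pixels. The function names are hypothetical and the handling of exact boundary values is a simplification. This is not the official evaluation code, which also performs score-ranked matching and precision-recall integration.

```python
# Minimal sketch of the COCO conventions behind the metric names above.
# Assumption: boxes use the COCO [x, y, width, height] pixel format.
# Function names are illustrative, not part of any official API.

# box AP averages AP over these ten IoU thresholds (0.50, 0.55, ..., 0.95).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def box_iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def size_bucket(area):
    """Object size class used by APS / APM / APL (boundary handling simplified)."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


def is_match(detection, ground_truth, iou_threshold):
    """True if the detection overlaps the ground truth enough at this threshold."""
    return box_iou(detection, ground_truth) >= iou_threshold


if __name__ == "__main__":
    gt = [0, 0, 100, 100]
    det = [0, 0, 100, 60]  # covers 60% of the ground-truth box
    print(box_iou(det, gt))
    print(is_match(det, gt, 0.50), is_match(det, gt, 0.75))
    print(size_bucket(gt[2] * gt[3]))
```

In this example the detection would count toward AP50 but not AP75, which is why AP75 is always the stricter of the two columns.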

Results

Performance results of various models on this benchmark

| Model name | AP50 | AP75 | APL | APM | APS | box AP | Paper title |
|---|---|---|---|---|---|---|---|
| ResNeSt-200 (multi-scale) | 71.00 | 57.07 | 66.29 | 56.36 | 36.80 | 52.47 | ResNeSt: Split-Attention Networks |
| ExtremeNet (Hourglass-104, single-scale) | 55.1 | 43.7 | 56.1 | 44.0 | 21.6 | 40.3 | Bottom-up Object Detection by Grouping Extreme and Center Points |
| Mask R-CNN (ResNeXt-101-FPN) | 59.5 | 38.9 | - | - | - | 36.7 | Mask R-CNN |
| FPN+ | 61.3 | 43.3 | 52.6 | 43.3 | 22.9 | 39.8 | Feature Pyramid Networks for Object Detection |
| Hiera-L | - | - | - | - | - | 55 | Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles |
| MOAT-2 (IN-22K pretraining, single-scale) | - | - | - | - | - | 58.5 | MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models |
| Focal-Stable-DINO (Focal-Huge, no TTA) | 81.5 | 71.4 | 78.5 | 68.5 | 50.4 | 64.6 | A Strong and Reproducible Object Detector with Only Public Datasets |
| BoTNet 152 (Mask R-CNN, single scale, 72 epochs) | 71 | 54.2 | - | - | - | 49.5 | Bottleneck Transformers for Visual Recognition |
| ResNeSt-200-DCN (single-scale) | 69.53 | 55.40 | 65.83 | 54.66 | 32.67 | 50.91 | ResNeSt: Split-Attention Networks |
| Cascade R-CNN (ResNet-101-FPN+, cascade) | 61.6 | 46.6 | 57.4 | 46.2 | 23.8 | 42.7 | Cascade R-CNN: Delving into High Quality Object Detection |
| Mask R-CNN (ResNet-101 + 1 NL) | 63.1 | 44.5 | - | - | - | 40.8 | Non-local Neural Networks |
| DETR-DC5 (ResNet-101) | 64.7 | 47.7 | 62.3 | 49.5 | 23.7 | 44.9 | End-to-End Object Detection with Transformers |
| XCiT-S24/8 | - | - | - | - | - | 48.1 | XCiT: Cross-Covariance Image Transformers |
| DETR-ResNet50 with iRPE-K (150 epochs) | - | - | - | - | - | 40.8 | Rethinking and Improving Relative Position Encoding for Vision Transformer |
| SwinV2-G (HTC++) | - | - | - | - | - | 62.5 | Swin Transformer V2: Scaling Up Capacity and Resolution |
| Mask R-CNN (ResNet-101, DCNv2) | - | - | - | - | - | 43.1 | Deformable ConvNets v2: More Deformable, Better Results |
| GCnet (ResNet-50-FPN, GRoIE) | 62.4 | 44 | 52.5 | 44.4 | 24.2 | 40.3 | GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond |
| RPDet (ResNeXt-101-DCN, multi-scale) | - | - | - | - | - | 46.8 | RepPoints: Point Set Representation for Object Detection |
| FoveaBox+aLRP Loss (ResNet-50, 500 scale) | 58.8 | 41.5 | - | - | - | 39.7 | A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection |
| Faster R-CNN+aLRP Loss (ResNet-50, 500 scale) | 60.7 | 43.3 | - | - | - | 40.7 | A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection |