HyperAI超神経

Object Detection on COCO minival

Evaluation Metrics

AP50
AP75
APL
APM
APS
box AP

Evaluation Results

Performance results of each model on this benchmark.

| Model | AP50 | AP75 | APL | APM | APS | box AP | Paper Title |
|---|---|---|---|---|---|---|---|
| ResNeSt-200 (multi-scale) | 71.00 | 57.07 | 66.29 | 56.36 | 36.80 | 52.47 | ResNeSt: Split-Attention Networks |
| ExtremeNet (Hourglass-104, single-scale) | 55.1 | 43.7 | 56.1 | 44.0 | 21.6 | 40.3 | Bottom-up Object Detection by Grouping Extreme and Center Points |
| Mask R-CNN (ResNeXt-101-FPN) | 59.5 | 38.9 | - | - | - | 36.7 | Mask R-CNN |
| FPN+ | 61.3 | 43.3 | 52.6 | 43.3 | 22.9 | 39.8 | Feature Pyramid Networks for Object Detection |
| Hiera-L | - | - | - | - | - | 55 | Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles |
| MOAT-2 (IN-22K pretraining, single-scale) | - | - | - | - | - | 58.5 | MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models |
| Focal-Stable-DINO (Focal-Huge, no TTA) | 81.5 | 71.4 | 78.5 | 68.5 | 50.4 | 64.6 | A Strong and Reproducible Object Detector with Only Public Datasets |
| BoTNet 152 (Mask R-CNN, single scale, 72 epochs) | 71 | 54.2 | - | - | - | 49.5 | Bottleneck Transformers for Visual Recognition |
| ResNeSt-200-DCN (single-scale) | 69.53 | 55.40 | 65.83 | 54.66 | 32.67 | 50.91 | ResNeSt: Split-Attention Networks |
| Cascade R-CNN (ResNet-101-FPN+, cascade) | 61.6 | 46.6 | 57.4 | 46.2 | 23.8 | 42.7 | Cascade R-CNN: Delving into High Quality Object Detection |
| Mask R-CNN (ResNet-101 + 1 NL) | 63.1 | 44.5 | - | - | - | 40.8 | Non-local Neural Networks |
| DETR-DC5 (ResNet-101) | 64.7 | 47.7 | 62.3 | 49.5 | 23.7 | 44.9 | End-to-End Object Detection with Transformers |
| XCiT-S24/8 | - | - | - | - | - | 48.1 | XCiT: Cross-Covariance Image Transformers |
| DETR-ResNet50 with iRPE-K (150 epochs) | - | - | - | - | - | 40.8 | Rethinking and Improving Relative Position Encoding for Vision Transformer |
| SwinV2-G (HTC++) | - | - | - | - | - | 62.5 | Swin Transformer V2: Scaling Up Capacity and Resolution |
| Mask R-CNN (ResNet-101, DCNv2) | - | - | - | - | - | 43.1 | Deformable ConvNets v2: More Deformable, Better Results |
| GCnet (ResNet-50-FPN, GRoIE) | 62.4 | 44 | 52.5 | 44.4 | 24.2 | 40.3 | GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond |
| RPDet (ResNeXt-101-DCN, multi-scale) | - | - | - | - | - | 46.8 | RepPoints: Point Set Representation for Object Detection |
| FoveaBox+aLRP Loss (ResNet-50, 500 scale) | 58.8 | 41.5 | - | - | - | 39.7 | A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection |
| Faster R-CNN+aLRP Loss (ResNet-50, 500 scale) | 60.7 | 43.3 | - | - | - | 40.7 | A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection |