Object Detection on COCO minival

Metrics

AP50

AP75

APL

APM

APS

box AP
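
The metrics above follow the standard COCO conventions: AP50 and AP75 count a detection as correct when its intersection-over-union (IoU) with a ground-truth box reaches 0.50 or 0.75, box AP averages over IoU thresholds from 0.50 to 0.95, and APS/APM/APL restrict evaluation to small, medium, and large objects. The sketch below illustrates only the IoU test and the size buckets, not the full AP computation. The function names `box_iou` and `size_bucket` are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` tuples in pixels, and box area stands in for the annotation area that the official evaluation uses.

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def size_bucket(box):
    """COCO-style size bucket, using box area as a simplification."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    if area < 32 ** 2:
        return "small"   # contributes to APS
    if area < 96 ** 2:
        return "medium"  # contributes to APM
    return "large"       # contributes to APL


gt = (0, 0, 100, 100)    # ground-truth box
pred = (0, 0, 100, 60)   # predicted box
iou = box_iou(gt, pred)
matches_ap50 = iou >= 0.50  # counts as a hit under the AP50 threshold
matches_ap75 = iou >= 0.75  # counts as a hit under the AP75 threshold
```

In this example the prediction overlaps the ground truth with an IoU of 0.6, so it would count as a true positive for AP50 but not for AP75, which is why AP75 values in the table are consistently lower than AP50.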

Results

Performance of the various models on this benchmark

Model Name	AP50	AP75	APL	APM	APS	box AP	Paper Title
ResNeSt-200 (multi-scale)	71.00	57.07	66.29	56.36	36.80	52.47	ResNeSt: Split-Attention Networks
ExtremeNet (Hourglass-104, single-scale)	55.1	43.7	56.1	44.0	21.6	40.3	Bottom-up Object Detection by Grouping Extreme and Center Points
Mask R-CNN (ResNeXt-101-FPN)	59.5	38.9	-	-	-	36.7	Mask R-CNN
FPN+	61.3	43.3	52.6	43.3	22.9	39.8	Feature Pyramid Networks for Object Detection
Hiera-L	-	-	-	-	-	55	Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
MOAT-2 (IN-22K pretraining, single-scale)	-	-	-	-	-	58.5	MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Focal-Stable-DINO (Focal-Huge, no TTA)	81.5	71.4	78.5	68.5	50.4	64.6	A Strong and Reproducible Object Detector with Only Public Datasets
BoTNet 152 (Mask R-CNN, single scale, 72 epochs)	71	54.2	-	-	-	49.5	Bottleneck Transformers for Visual Recognition
ResNeSt-200-DCN (single-scale)	69.53	55.40	65.83	54.66	32.67	50.91	ResNeSt: Split-Attention Networks
Cascade R-CNN (ResNet-101-FPN+, cascade)	61.6	46.6	57.4	46.2	23.8	42.7	Cascade R-CNN: Delving into High Quality Object Detection
Mask R-CNN (ResNet-101 + 1 NL)	63.1	44.5	-	-	-	40.8	Non-local Neural Networks
DETR-DC5 (ResNet-101)	64.7	47.7	62.3	49.5	23.7	44.9	End-to-End Object Detection with Transformers
XCiT-S24/8	-	-	-	-	-	48.1	XCiT: Cross-Covariance Image Transformers
DETR-ResNet50 with iRPE-K (150 epochs)	-	-	-	-	-	40.8	Rethinking and Improving Relative Position Encoding for Vision Transformer
SwinV2-G (HTC++)	-	-	-	-	-	62.5	Swin Transformer V2: Scaling Up Capacity and Resolution
Mask R-CNN (ResNet-101, DCNv2)	-	-	-	-	-	43.1	Deformable ConvNets v2: More Deformable, Better Results
GCNet (ResNet-50-FPN, GRoIE)	62.4	44	52.5	44.4	24.2	40.3	GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
RPDet (ResNeXt-101-DCN, multi-scale)	-	-	-	-	-	46.8	RepPoints: Point Set Representation for Object Detection
FoveaBox+aLRP Loss (ResNet-50, 500 scale)	58.8	41.5	-	-	-	39.7	A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection
Faster R-CNN+aLRP Loss (ResNet-50, 500 scale)	60.7	43.3	-	-	-	40.7	A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection
