Instance Segmentation on COCO

Evaluation Metrics

AP50

AP75

APL

APM

APS

mask AP
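
The metrics above follow the standard COCO conventions: mask AP averages precision over mask IoU thresholds from 0.50 to 0.95 in steps of 0.05, AP50 and AP75 fix the threshold at 0.50 and 0.75, and APS, APM, and APL restrict evaluation to small, medium, and large objects. The following is a minimal sketch of those building blocks, not the official evaluation code. The function names (`mask_iou`, `size_bucket`, `is_match`) are hypothetical, masks are assumed to be equal-sized nested lists of 0/1, and the handling of areas exactly on a bucket boundary is an assumption.

```python
# Illustrative sketch only; leaderboard numbers come from the official COCO tooling.

# IoU thresholds averaged by the headline mask AP: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def mask_iou(pred, gt):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    pairs = [(p, g) for p_row, g_row in zip(pred, gt) for p, g in zip(p_row, g_row)]
    inter = sum(p & g for p, g in pairs)
    union = sum(p | g for p, g in pairs)
    return inter / union if union else 0.0


def size_bucket(area):
    """Map a mask area in pixels to the COCO size bucket it is scored under."""
    if area < 32 ** 2:
        return "small"   # contributes to APS
    if area <= 96 ** 2:  # boundary handling here is an assumption
        return "medium"  # contributes to APM
    return "large"       # contributes to APL


def is_match(pred, gt, threshold):
    """A prediction can count as a true positive only if its IoU reaches the threshold."""
    return mask_iou(pred, gt) >= threshold


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    gt = [[1, 0], [0, 0]]
    print(mask_iou(pred, gt))        # intersection 1, union 2
    print(is_match(pred, gt, 0.50))  # counted under AP50
    print(is_match(pred, gt, 0.75))  # not counted under AP75
```

A prediction that overlaps its ground truth loosely can therefore raise AP50 without helping AP75, which is why the two columns in the table below can differ widely for the same model.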

Results

Performance of each model on this benchmark.

Model	AP50	AP75	APL	APM	APS	mask AP	Paper Title	Repository
CenterMask + VoVNetV2-99 (single-scale)	62.3	44.1	57.0	42.8	20.1	40.6	CenterMask : Real-Time Anchor-Free Instance Segmentation
Mask DINO (SwinL, multi-scale)	-	-	-	-	-	54.7	Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
ISDA (ours)	62	41.1	-	41.2	17	38.7	ISDA: Position-Aware Instance Segmentation with Deformable Attention
EmbedMask(R-101-FPN)	59.1	40.3	-	40.4	17.9	37.7	EmbedMask: Embedding Coupling for One-stage Instance Segmentation
PANet	-	-	-	-	-	42.0	Path Aggregation Network for Instance Segmentation
GLEE-Lite	-	-	-	-	-	48.3	General Object Foundation Model for Images and Videos at Scale
VirTex Mask R-CNN (ResNet-50-FPN)	58.4	39.7	-	-	-	36.9	VirTex: Learning Visual Representations from Textual Annotations
iBOT (ViT-B/16)	-	-	-	-	-	44.2	iBOT: Image BERT Pre-Training with Online Tokenizer
DiffusionInst-ResNet101	-	-	-	-	-	41.5	DiffusionInst: Diffusion Model for Instance Segmentation
Cascade Mask R-CNN (ResNeXt152, CBNet)	-	-	-	-	-	43.3	CBNet: A Novel Composite Backbone Network Architecture for Object Detection
ViT-Adapter-L (HTC++, BEiTv2 pretrain, multi-scale)	-	-	-	-	-	53.0	Vision Transformer Adapter for Dense Predictions
PolarMask (ResNet-101-FPN)	51.9	31	42.8	32.4	13.4	30.4	PolarMask: Single Shot Instance Segmentation with Polar Representation
Co-DETR	80.2	63.4	72.0	60.1	41.6	57.1	DETRs with Collaborative Hybrid Assignments Training
gSwin-S	-	-	-	-	-	45.03	gSwin: Gated MLP Vision Model with Hierarchical Structure of Shifted Window	-
MogaNet-B (Cascade Mask R-CNN)	-	-	-	-	-	46	MogaNet: Multi-order Gated Aggregation Network
DetectoRS (ResNeXt-101-32x4d, multi-scale)	71.1	51.6	59.6	49.5	30.3	47.1	DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution
VoVNetV1-57	-	-	-	-	-	40.8	An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection
GCNet (ResNeXt-101 + DCN + cascade + GC r16)	-	-	-	-	-	41.5	GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
SOLQ (ResNet50, single scale)	-	-	-	-	-	39.7	SOLQ: Segmenting Objects by Learning Queries
dBOT ViT-B (CLIP)	-	-	-	-	-	46.2	Exploring Target Representations for Masked Autoencoders
