HyperAI超神经

Instance Segmentation on COCO

Evaluation Metrics

AP50
AP75
APL
APM
APS
mask AP
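The page lists these metrics without defining them. Under the standard COCO convention, which this leaderboard presumably follows, AP50 and AP75 are average precision at mask IoU thresholds of 0.50 and 0.75, mask AP averages over thresholds 0.50 to 0.95 in steps of 0.05, and APS, APM and APL restrict evaluation to small, medium and large objects by mask area. The Python sketch below only illustrates those building blocks. The function names are hypothetical, the boundary handling is simplified, and it is not the official evaluation code.

```python
# Illustrative sketch of the pieces behind the COCO mask metrics.
# All names here are hypothetical helpers, not part of any official toolkit.

# COCO size buckets, in pixels of mask area (boundary handling simplified).
SMALL_MAX = 32 ** 2
MEDIUM_MAX = 96 ** 2

# IoU thresholds averaged by the headline mask AP: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def mask_iou(pred, gt):
    """IoU of two same-shaped binary masks given as nested lists of 0/1."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 0.0


def size_bucket(area):
    """Map a mask area to the bucket used by APS / APM / APL."""
    if area < SMALL_MAX:
        return "S"
    if area < MEDIUM_MAX:
        return "M"
    return "L"


def matched_thresholds(iou):
    """IoU thresholds at which a prediction with this IoU counts as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]
```

For example, a prediction whose mask IoU with the ground truth is 0.77 counts as a match for AP50 and AP75 but not at the 0.80 threshold or above, so it contributes to only part of the averaged mask AP.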

Evaluation Results

Performance of each model on this benchmark.

| Model | AP50 | AP75 | APL | APM | APS | mask AP | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|
| CenterMask + VoVNetV2-99 (single-scale) | 62.3 | 44.1 | 57.0 | 42.8 | 20.1 | 40.6 | CenterMask: Real-Time Anchor-Free Instance Segmentation | |
| Mask DINO (SwinL, multi-scale) | - | - | - | - | - | 54.7 | Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | |
| ISDA (ours) | 62 | 41.1 | - | 41.2 | 17 | 38.7 | ISDA: Position-Aware Instance Segmentation with Deformable Attention | |
| EmbedMask (R-101-FPN) | 59.1 | 40.3 | - | 40.4 | 17.9 | 37.7 | EmbedMask: Embedding Coupling for One-stage Instance Segmentation | |
| PANet | - | - | - | - | - | 42.0 | Path Aggregation Network for Instance Segmentation | |
| GLEE-Lite | - | - | - | - | - | 48.3 | General Object Foundation Model for Images and Videos at Scale | |
| VirTex Mask R-CNN (ResNet-50-FPN) | 58.4 | 39.7 | - | - | - | 36.9 | VirTex: Learning Visual Representations from Textual Annotations | |
| iBOT (ViT-B/16) | - | - | - | - | - | 44.2 | iBOT: Image BERT Pre-Training with Online Tokenizer | |
| DiffusionInst-ResNet101 | - | - | - | - | - | 41.5 | DiffusionInst: Diffusion Model for Instance Segmentation | |
| Cascade Mask R-CNN (ResNeXt152, CBNet) | - | - | - | - | - | 43.3 | CBNet: A Novel Composite Backbone Network Architecture for Object Detection | |
| ViT-Adapter-L (HTC++, BEiTv2 pretrain, multi-scale) | - | - | - | - | - | 53.0 | Vision Transformer Adapter for Dense Predictions | |
| PolarMask (ResNet-101-FPN) | 51.9 | 31 | 42.8 | 32.4 | 13.4 | 30.4 | PolarMask: Single Shot Instance Segmentation with Polar Representation | |
| Co-DETR | 80.2 | 63.4 | 72.0 | 60.1 | 41.6 | 57.1 | DETRs with Collaborative Hybrid Assignments Training | |
| gSwin-S | - | - | - | - | - | 45.03 | gSwin: Gated MLP Vision Model with Hierarchical Structure of Shifted Window | - |
| MogaNet-B (Cascade Mask R-CNN) | - | - | - | - | - | 46 | MogaNet: Multi-order Gated Aggregation Network | |
| DetectoRS (ResNeXt-101-32x4d, multi-scale) | 71.1 | 51.6 | 59.6 | 49.5 | 30.3 | 47.1 | DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | |
| VoVNetV1-57 | - | - | - | - | - | 40.8 | An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection | |
| GCNet (ResNeXt-101 + DCN + cascade + GC r16) | - | - | - | - | - | 41.5 | GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond | |
| SOLQ (ResNet50, single scale) | - | - | - | - | - | 39.7 | SOLQ: Segmenting Objects by Learning Queries | |
| dBOT ViT-B (CLIP) | - | - | - | - | - | 46.2 | Exploring Target Representations for Masked Autoencoders | |