HyperAI

Instance Segmentation on COCO

Metrics

AP50
AP75
APL
APM
APS
mask AP
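
As a rough illustration of how the metrics listed above relate to one another, the sketch below assumes the standard COCO evaluation conventions: AP50 and AP75 count a predicted mask as correct at IoU thresholds of 0.50 and 0.75, mask AP averages over thresholds from 0.50 to 0.95 in steps of 0.05, and APS, APM, and APL restrict evaluation to small, medium, and large objects by pixel area. The function names here are hypothetical helpers written for this example, not part of any evaluation toolkit, and the sketch stops short of computing a full precision-recall curve.

```python
# Assumed COCO conventions; helper names are illustrative only.

def mask_iou(pred, gt):
    """Intersection over union of two binary masks (equal-shaped nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    return inter / union if union else 0.0


# IoU thresholds averaged by mask AP: 0.50, 0.55, ..., 0.95.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def is_match(pred, gt, iou_threshold):
    """A prediction matches a ground-truth mask if their IoU reaches the threshold."""
    return mask_iou(pred, gt) >= iou_threshold


def size_bucket(area):
    """Object size bucket by pixel area, as used for APS / APM / APL."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


# Toy example: the prediction covers 3 of the 5 ground-truth pixels.
pred = [[1, 1, 1, 0, 0]]
gt = [[1, 1, 1, 1, 1]]
print(mask_iou(pred, gt))        # IoU of the toy masks
print(is_match(pred, gt, 0.50))  # counts toward AP50
print(is_match(pred, gt, 0.75))  # does not count toward AP75
```

The same prediction can therefore count at the looser AP50 threshold and fail at AP75, which is why AP75 values in the results are consistently lower than AP50 values for the same model.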

Results

Performance results of various models on this benchmark

| Model Name | AP50 | AP75 | APL | APM | APS | mask AP | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|
| CenterMask + VoVNetV2-99 (single-scale) | 62.3 | 44.1 | 57.0 | 42.8 | 20.1 | 40.6 | CenterMask : Real-Time Anchor-Free Instance Segmentation | - |
| Mask DINO (SwinL, multi-scale) | - | - | - | - | - | 54.7 | Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | - |
| ISDA (ours) | 62 | 41.1 | - | 41.2 | 17 | 38.7 | ISDA: Position-Aware Instance Segmentation with Deformable Attention | - |
| EmbedMask (R-101-FPN) | 59.1 | 40.3 | - | 40.4 | 17.9 | 37.7 | EmbedMask: Embedding Coupling for One-stage Instance Segmentation | - |
| PANet | - | - | - | - | - | 42.0 | Path Aggregation Network for Instance Segmentation | - |
| GLEE-Lite | - | - | - | - | - | 48.3 | General Object Foundation Model for Images and Videos at Scale | - |
| VirTex Mask R-CNN (ResNet-50-FPN) | 58.4 | 39.7 | - | - | - | 36.9 | VirTex: Learning Visual Representations from Textual Annotations | - |
| iBOT (ViT-B/16) | - | - | - | - | - | 44.2 | iBOT: Image BERT Pre-Training with Online Tokenizer | - |
| DiffusionInst-ResNet101 | - | - | - | - | - | 41.5 | DiffusionInst: Diffusion Model for Instance Segmentation | - |
| Cascade Mask R-CNN (ResNeXt152, CBNet) | - | - | - | - | - | 43.3 | CBNet: A Novel Composite Backbone Network Architecture for Object Detection | - |
| ViT-Adapter-L (HTC++, BEiTv2 pretrain, multi-scale) | - | - | - | - | - | 53.0 | Vision Transformer Adapter for Dense Predictions | - |
| PolarMask (ResNet-101-FPN) | 51.9 | 31 | 42.8 | 32.4 | 13.4 | 30.4 | PolarMask: Single Shot Instance Segmentation with Polar Representation | - |
| Co-DETR | 80.2 | 63.4 | 72.0 | 60.1 | 41.6 | 57.1 | DETRs with Collaborative Hybrid Assignments Training | - |
| gSwin-S | - | - | - | - | - | 45.03 | gSwin: Gated MLP Vision Model with Hierarchical Structure of Shifted Window | - |
| MogaNet-B (Cascade Mask R-CNN) | - | - | - | - | - | 46 | MogaNet: Multi-order Gated Aggregation Network | - |
| DetectoRS (ResNeXt-101-32x4d, multi-scale) | 71.1 | 51.6 | 59.6 | 49.5 | 30.3 | 47.1 | DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | - |
| VoVNetV1-57 | - | - | - | - | - | 40.8 | An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection | - |
| GCNet (ResNeXt-101 + DCN + cascade + GC r16) | - | - | - | - | - | 41.5 | GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond | - |
| SOLQ (ResNet50, single scale) | - | - | - | - | - | 39.7 | SOLQ: Segmenting Objects by Learning Queries | - |
| dBOT ViT-B (CLIP) | - | - | - | - | - | 46.2 | Exploring Target Representations for Masked Autoencoders | - |