HyperAI

Instance Segmentation on COCO

Metrics

AP50
AP75
APL
APM
APS
mask AP

Results

Performance results of various models on this benchmark

AP values are in percent; "-" marks a value that is not listed.

| Model | AP50 | AP75 | APL | APM | APS | mask AP | Paper Title |
|---|---|---|---|---|---|---|---|
| CenterMask + VoVNetV2-99 (single-scale) | 62.3 | 44.1 | 57.0 | 42.8 | 20.1 | 40.6 | CenterMask : Real-Time Anchor-Free Instance Segmentation |
| Mask DINO (SwinL, multi-scale) | - | - | - | - | - | 54.7 | Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation |
| ISDA (ours) | 62 | 41.1 | - | 41.2 | 17 | 38.7 | ISDA: Position-Aware Instance Segmentation with Deformable Attention |
| EmbedMask (R-101-FPN) | 59.1 | 40.3 | - | 40.4 | 17.9 | 37.7 | EmbedMask: Embedding Coupling for One-stage Instance Segmentation |
| PANet | - | - | - | - | - | 42.0 | Path Aggregation Network for Instance Segmentation |
| GLEE-Lite | - | - | - | - | - | 48.3 | General Object Foundation Model for Images and Videos at Scale |
| VirTex Mask R-CNN (ResNet-50-FPN) | 58.4 | 39.7 | - | - | - | 36.9 | VirTex: Learning Visual Representations from Textual Annotations |
| iBOT (ViT-B/16) | - | - | - | - | - | 44.2 | iBOT: Image BERT Pre-Training with Online Tokenizer |
| DiffusionInst-ResNet101 | - | - | - | - | - | 41.5 | DiffusionInst: Diffusion Model for Instance Segmentation |
| Cascade Mask R-CNN (ResNeXt152, CBNet) | - | - | - | - | - | 43.3 | CBNet: A Novel Composite Backbone Network Architecture for Object Detection |
| ViT-Adapter-L (HTC++, BEiTv2 pretrain, multi-scale) | - | - | - | - | - | 53.0 | Vision Transformer Adapter for Dense Predictions |
| PolarMask (ResNet-101-FPN) | 51.9 | 31 | 42.8 | 32.4 | 13.4 | 30.4 | PolarMask: Single Shot Instance Segmentation with Polar Representation |
| Co-DETR | 80.2 | 63.4 | 72.0 | 60.1 | 41.6 | 57.1 | DETRs with Collaborative Hybrid Assignments Training |
| gSwin-S | - | - | - | - | - | 45.03 | gSwin: Gated MLP Vision Model with Hierarchical Structure of Shifted Window |
| MogaNet-B (Cascade Mask R-CNN) | - | - | - | - | - | 46 | MogaNet: Multi-order Gated Aggregation Network |
| DetectoRS (ResNeXt-101-32x4d, multi-scale) | 71.1 | 51.6 | 59.6 | 49.5 | 30.3 | 47.1 | DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution |
| VoVNetV1-57 | - | - | - | - | - | 40.8 | An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection |
| GCNet (ResNeXt-101 + DCN + cascade + GC r16) | - | - | - | - | - | 41.5 | GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond |
| SOLQ (ResNet50, single scale) | - | - | - | - | - | 39.7 | SOLQ: Segmenting Objects by Learning Queries |
| dBOT ViT-B (CLIP) | - | - | - | - | - | 46.2 | Exploring Target Representations for Masked Autoencoders |