HyperAI

Panoptic Segmentation on COCO minival

Metrics

AP
PQ
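
PQ is panoptic quality. The sketch below assumes the standard definition: predicted and ground-truth segments match when their IoU exceeds 0.5, and per-class PQ is the sum of matched IoUs divided by TP + 0.5·FP + 0.5·FN. The function name and its inputs are illustrative and do not belong to any benchmark toolkit.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Per-class PQ from the IoUs of matched segment pairs and unmatched counts.

    matched_ious: IoU of each matched (prediction, ground truth) pair.
    num_false_positives: predicted segments left without a match.
    num_false_negatives: ground-truth segments left without a match.
    """
    # A pair only counts as a match (true positive) when its IoU exceeds 0.5.
    if any(iou <= 0.5 for iou in matched_ious):
        raise ValueError("matched pairs must have IoU > 0.5")

    true_positives = len(matched_ious)
    denominator = (
        true_positives
        + 0.5 * num_false_positives
        + 0.5 * num_false_negatives
    )
    if denominator == 0:
        # No segments of this class at all. Returning 0.0 is a choice made
        # for this sketch; such classes are usually left out of the average.
        return 0.0
    return sum(matched_ious) / denominator


# Two matches (IoU 0.8 and 0.6), one false positive, one false negative.
pq = panoptic_quality([0.8, 0.6], 1, 1)
print(round(100 * pq, 1))
```

The PQ values in the results table are on a 0 to 100 scale, which under this definition corresponds to the per-class value averaged over classes and multiplied by 100.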

Results

Performance results of various models on this benchmark

| Model Name | AP | PQ | Paper Title |
| --- | --- | --- | --- |
| FocalNet-L (Mask2Former (200 queries)) | 48.4 | 57.9 | Focal Modulation Networks |
| OpenSeeD (SwinL, single-scale) | 53.2 | 59.5 | A Simple Framework for Open-Vocabulary Segmentation and Detection |
| Visual Attention Network (VAN-B6 + Mask2Former) | - | 58.2 | Visual Attention Network |
| Panoptic FCN* (ResNet-50-FPN) | - | 44.3 | Fully Convolutional Networks for Panoptic Segmentation |
| MaskFormer (single-scale) | - | 52.7 | Per-Pixel Classification is Not All You Need for Semantic Segmentation |
| kMaX-DeepLab (single-scale, drop query with 256 queries) | - | 58.0 | kMaX-DeepLab: k-means Mask Transformer |
| DETR-R101 (ResNet-101) | 33 | 45.1 | End-to-End Object Detection with Transformers |
| Axial-DeepLab-L (multi-scale) | - | 43.9 | Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation |
| PanopticFPN++ | 39.7 | 44.1 | End-to-End Object Detection with Transformers |
| kMaX-DeepLab (single-scale, pseudo-labels) | - | 58.1 | kMaX-DeepLab: k-means Mask Transformer |
| Mask DINO (SwinL, single-scale) | 50.9 | 59.4 | Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation |
| OneFormer (InternImage-H, single-scale) | 52.0 | 60.0 | OneFormer: One Transformer to Rule Universal Image Segmentation |
| Axial-DeepLab-L (multi-scale) | - | - | Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation |
| kMaX-DeepLab (single-scale) | - | 57.9 | kMaX-DeepLab: k-means Mask Transformer |
| DiNAT-L (single-scale, Mask2Former) | 49.2 | 58.5 | Dilated Neighborhood Attention Transformer |
| OneFormer (Swin-L, single-scale) | 49.0 | 57.9 | OneFormer: One Transformer to Rule Universal Image Segmentation |
| HyperSeg (Swin-B) | - | 61.2 | HyperSeg: Towards Universal Visual Segmentation with Large Language Model |
| Axial-DeepLab-L (single-scale) | - | 43.4 | Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation |
| OneFormer (DiNAT-L, single-scale) | 49.2 | 58.0 | OneFormer: One Transformer to Rule Universal Image Segmentation |
| ViT-Adapter-L (single-scale, BEiTv2 pretrain, Mask2Former) | 48.9 | 58.4 | Vision Transformer Adapter for Dense Predictions |