Semantic Segmentation On Cityscapes Val
Metrics
mIoU
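For orientation, mIoU (mean Intersection over Union) averages the per-class ratio TP / (TP + FP + FN). The following is a minimal pure-Python sketch of that calculation, not the benchmark's official evaluation code. The function names, the toy data, and the `ignore_index=255` convention for unlabeled pixels are assumptions made for illustration. Individual papers on this leaderboard may differ in details such as multi-scale testing.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels by (ground-truth class, predicted class)."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:  # assumed convention: skip unlabeled pixels
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class-c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp   # other-class pixels predicted as c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: flattened label maps with 3 classes (hypothetical data)
labels = [0, 0, 1, 1, 2, 2, 255]
preds = [0, 1, 1, 1, 2, 0, 2]
score = mean_iou(confusion_matrix(labels, preds, num_classes=3))
print(f"mIoU = {100 * score:.2f}")
```

In this sketch the confusion matrix would be accumulated over every image in the validation set before `mean_iou` is called once, so that each class's IoU reflects the whole dataset and not an average of per-image scores.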
Results
Performance results of various models on this benchmark
| Model Name | mIoU (%) | Paper Title |
| --- | --- | --- |
| SERNet-Former | 87.35 | SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks |
| MetaPrompt-SD | 87.1 | Harnessing Diffusion Models for Visual Perception with Meta Prompts |
| InternImage-H | 87 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions |
| HRNetV2-OCR+PSA | 86.93 | Polarized Self-Attention: Towards High-quality Pixel-wise Regression |
| InternImage-XL | 86.4 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions |
| HRNet-OCR | 86.3 | Hierarchical Multi-Scale Attention for Semantic Segmentation |
| Depth Anything | 86.2 | Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data |
| ViT-Adapter-L | 85.8 | Vision Transformer Adapter for Dense Predictions |
| OneFormer (ConvNeXt-XL, Mapillary, multi-scale) | 85.8 | OneFormer: One Transformer to Rule Universal Image Segmentation |
| SeMask (SeMask Swin-L Mask2Former) | 84.98 | SeMask: Semantically Masked Transformers for Semantic Segmentation |
| Soft Labels (HRNet) | 84.8 | Soft labelling for semantic segmentation: Bringing coherence to label down-sampling |
| Sequential Ensemble (MiT-B5 + HRNet) | 84.8 | Sequential Ensembling for Semantic Segmentation |
| OneFormer (ConvNeXt-XL, multi-scale) | 84.6 | OneFormer: One Transformer to Rule Universal Image Segmentation |
| DiNAT-L (Mask2Former) | 84.5 | Dilated Neighborhood Attention Transformer |
| VPNeXt | 84.4 | VPNeXt -- Rethinking Dense Decoding for Plain Vision Transformer |
| OneFormer (Swin-L, multi-scale) | 84.4 | OneFormer: One Transformer to Rule Universal Image Segmentation |
| Mask2Former (Swin-L) | 84.3 | Masked-attention Mask Transformer for Universal Image Segmentation |
| VOLO-D4 (MS, ImageNet1k pretrain) | 84.3 | VOLO: Vision Outlooker for Visual Recognition |
| SegFormer (MiT-B5, Mapillary) | 84.0 | SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers |
| DDP (ConvNeXt-L, step-3) | 83.9 | DDP: Diffusion Model for Dense Visual Prediction |
Showing the first 20 of 97 entries.