Semantic Segmentation On Cityscapes Val

Evaluation Metric

mIoU
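
For readers unfamiliar with the metric, mIoU (mean intersection over union) averages the per-class ratio TP / (TP + FP + FN) and is reported here as a percentage. The following is a minimal sketch, not the official Cityscapes evaluation code. The function name `mean_iou`, the flat-list input format, and the default `ignore_index=255` are assumptions made for illustration. Benchmarks typically accumulate the counts over the whole validation set before averaging rather than averaging per image.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Return mIoU in percent for flat lists of predicted and true class ids.

    Hypothetical helper: assumes every prediction lies in [0, num_classes)
    and that pixels labeled `ignore_index` are excluded from scoring.
    """
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel, not scored
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    # Average only over classes that occur in predictions or ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return 100.0 * sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes; the last pixel is unlabeled and skipped.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 2))
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is roughly 72.22.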

Evaluation Results

Performance of each model on this benchmark

Model Name	mIoU (%)	Paper Title	Repository
DCT-EDANet	61.6	Exploring Semantic Segmentation on the DCT Representation	-
PatchDiverse + Swin-L (multi-scale test, upernet, ImageNet22k pretrain)	83.6	Vision Transformers with Patch Diversification
DetCon_B	77.0	Efficient Visual Pretraining with Contrastive Detection
StreamDEQ (8 iterations)	78.2	Representation Recycling for Streaming Video Analysis
SETR-PUP (80k, MS)	82.15	Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Soft Labels (HRNet)	84.8	Soft labelling for semantic segmentation: Bringing coherence to label down-sampling
FasterSeg	73.1	FasterSeg: Searching for Faster Real-time Semantic Segmentation
HRNetV2 (HRNetV2-W40)	80.2	Deep High-Resolution Representation Learning for Visual Recognition
StreamDEQ (2 iterations)	57.9	Representation Recycling for Streaming Video Analysis
Dilated-ResNet (Dilated-ResNet-101)	75.7	Deep Residual Learning for Image Recognition
VPNeXt	84.4	VPNeXt -- Rethinking Dense Decoding for Plain Vision Transformer	-
GSCNN (ResNet-50)	73.0	Gated-SCNN: Gated Shape CNNs for Semantic Segmentation
FAN-L-Hybrid	82.3	Understanding The Robustness in Vision Transformers
EfficientViT-B3 (r1184x2368)	83.2	EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction
Trans4Trans	81.54	Trans4Trans: Efficient Transformer for Transparent Object and Semantic Scene Segmentation in Real-World Navigation Assistance
Aerial-PASS (ResNet-18)	72.8	Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos	-
DSNet (single-scale)	80.4	DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation
RepVGG-B2	80.57	RepVGG: Making VGG-style ConvNets Great Again
InternImage-H	87	InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
ViT-Adapter-L	85.8	Vision Transformer Adapter for Dense Predictions
