Semantic Segmentation on NYU Depth V2
Evaluation metric: Mean IoU
Evaluation results: the performance of each model on this benchmark.
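For context, the leaderboard ranks models by Mean IoU: the intersection-over-union between predicted and ground-truth segmentation masks, averaged over classes. The sketch below shows one common way to compute it; the 40-class NYU Depth V2 label set, the array shapes, and the `mean_iou` helper are illustrative assumptions, not code from any listed paper.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int = 40) -> float:
    """Mean intersection-over-union between two integer label maps.

    num_classes=40 assumes the standard NYU Depth V2 40-class label set.
    Classes absent from both prediction and ground truth are skipped so
    they do not inflate or deflate the mean.
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class not present in either map
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

# Toy usage on random 480x640 label maps (the NYU Depth V2 image resolution):
rng = np.random.default_rng(0)
pred = rng.integers(0, 40, size=(480, 640))
target = rng.integers(0, 40, size=(480, 640))
print(f"Mean IoU: {mean_iou(pred, target):.4f}")
```

In practice the scores below are accumulated over the whole test split (typically via a global confusion matrix) rather than averaged per image, but the per-class intersection-over-union definition is the same.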
| Model | Mean IoU (%) | Paper Title |
| --- | --- | --- |
| OmniVec2 | 63.6 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning |
| DiffusionMMS (DAT++-S) | 61.5 | Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer |
| GeminiFusion (Swin-Large) | 60.9 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer |
| OmniVec | 60.8 | OmniVec: Learning robust representations with cross modal sharing |
| GeminiFusion (Swin-Large) | 60.2 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer |
| DPLNet | 59.3 | Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning |
| EMSANet (2x ResNet-34 NBt1D, PanopticNDT version, finetuned) | 59.02 | PanopticNDT: Efficient and Robust Panoptic Mapping |
| SwinMTL | 58.14 | SwinMTL: A Shared Architecture for Simultaneous Depth Estimation and Semantic Segmentation from Monocular Camera Images |
| PolyMaX (ConvNeXt-L) | 58.08 | PolyMaX: General Dense Prediction with Mask Transformer |
| HSPFormer (PVT v2-B4) | 57.8 | HSPFormer: Hierarchical Spatial Perception Transformer for Semantic Segmentation |
| GeminiFusion (MiT-B5) | 57.7 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer |
| DFormer-L | 57.2 | DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation |
| CMNeXt (B4) | 56.9 | Delivering Arbitrary-Modal Semantic Segmentation |
| CMX (B5) | 56.9 | CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers |
| GeminiFusion (MiT-B3) | 56.8 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer |
| OMNIVORE (Swin-L, finetuned) | 56.8 | Omnivore: A Single Model for Many Visual Modalities |
| CMX (B4) | 56.3 | CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers |
| MultiMAE (ViT-B) | 56.0 | MultiMAE: Multi-modal Multi-task Masked Autoencoders |
| SMMCL (SegNeXt-B) | 55.8 | Understanding Dark Scenes by Contrasting Multi-Modal Observations |
| DFormer-B | 55.6 | DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation |
Showing the top 20 of 116 entries.