Semantic Segmentation on NYU Depth V2
Evaluation metric: Mean IoU

The table below reports each model's performance on this benchmark.
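Mean IoU averages the per-class intersection-over-union, IoU_c = TP_c / (TP_c + FP_c + FN_c), over the class set. As a hedged illustration of how this metric is typically computed, here is a minimal NumPy sketch; the function name, the `ignore_index` default, and the 40-class figure in the usage comment are assumptions for illustration, not details taken from this page.

```python
import numpy as np

def mean_iou(pred, gt, num_classes, ignore_index=255):
    """Mean IoU over classes; `pred` and `gt` are integer label maps.

    Pixels labeled `ignore_index` in `gt` are excluded (illustrative default).
    """
    mask = gt != ignore_index
    pred, gt = pred[mask].astype(np.int64), gt[mask].astype(np.int64)
    # Confusion matrix: rows = ground truth, columns = prediction.
    cm = np.bincount(gt * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(cm)
    # Per-class union = TP + FP + FN.
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    iou = tp / np.maximum(union, 1)
    # Average only over classes that occur in prediction or ground truth.
    return iou[union > 0].mean()

# NYU Depth V2 semantic segmentation is commonly evaluated on a 40-class
# label set, in which case the call would be:
#   score = mean_iou(pred, gt, num_classes=40)
```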
| Model | Mean IoU (%) | Paper Title |
| --- | --- | --- |
| OmniVec2 | 63.6 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning |
| DiffusionMMS (DAT++-S) | 61.5 | Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer |
| GeminiFusion (Swin-Large) | 60.9 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer |
| OmniVec | 60.8 | OmniVec: Learning robust representations with cross modal sharing |
| GeminiFusion (Swin-Large) | 60.2 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer |
| DPLNet | 59.3 | Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning |
| EMSANet (2x ResNet-34 NBt1D, PanopticNDT version, finetuned) | 59.02 | PanopticNDT: Efficient and Robust Panoptic Mapping |
| SwinMTL | 58.14 | SwinMTL: A Shared Architecture for Simultaneous Depth Estimation and Semantic Segmentation from Monocular Camera Images |
| PolyMaX (ConvNeXt-L) | 58.08 | PolyMaX: General Dense Prediction with Mask Transformer |
| HSPFormer (PVT v2-B4) | 57.8 | HSPFormer: Hierarchical Spatial Perception Transformer for Semantic Segmentation |
| GeminiFusion (MiT-B5) | 57.7 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer |
| DFormer-L | 57.2 | DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation |
| CMNeXt (B4) | 56.9 | Delivering Arbitrary-Modal Semantic Segmentation |
| CMX (B5) | 56.9 | CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers |
| GeminiFusion (MiT-B3) | 56.8 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer |
| OMNIVORE (Swin-L, finetuned) | 56.8 | Omnivore: A Single Model for Many Visual Modalities |
| CMX (B4) | 56.3 | CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers |
| MultiMAE (ViT-B) | 56.0 | MultiMAE: Multi-modal Multi-task Masked Autoencoders |
| SMMCL (SegNeXt-B) | 55.8 | Understanding Dark Scenes by Contrasting Multi-Modal Observations |
| DFormer-B | 55.6 | DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation |
Showing the top 20 of 116 entries on this leaderboard.