Referring Video Object Segmentation On Refer
Metrics
F
J
J&F
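The page lists these metrics without defining them. Assuming the DAVIS-style convention commonly used for video object segmentation, J is region similarity (intersection over union between the predicted and ground-truth masks), F is contour accuracy (a boundary F-measure), and the combined score J&F is the arithmetic mean of the two. The sketch below illustrates J on a toy mask and the averaging step; the function names `jaccard` and `j_and_f` are illustrative and do not come from any benchmark toolkit, and F is not implemented because it requires boundary matching.

```python
def jaccard(pred, gt):
    """Region similarity J: intersection over union of two binary masks.

    Masks are given as equally sized lists of rows containing 0/1 values.
    """
    inter = 0
    union = 0
    for row_p, row_g in zip(pred, gt):
        for p, g in zip(row_p, row_g):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j, f):
    """Combined score, assumed to be the arithmetic mean of J and F."""
    return (j + f) / 2


# Toy example: 2 overlapping pixels out of 4 covered by either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
gt = [[1, 0, 0],
      [0, 1, 1]]
j_toy = jaccard(pred, gt)

# Averaging the J and F reported for HTML-Video-SwinT in the table.
combined = j_and_f(59.5, 63.0)
```

Under this assumption, averaging the reported J and F values gives 61.25 for HTML-Video-SwinT, which is consistent with the 61.2 listed for that model once rounding is taken into account.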
Results
Performance results of various models on this benchmark
| Model name | F | J | J&F | Paper title | Repository |
|---|---|---|---|---|---|
| HTML-Video-SwinT | 63.0 | 59.5 | 61.2 | HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation | - |
| HTR | 68.9 | 65.3 | 67.1 | Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | |
| VLT | 65.6 | 61.9 | 63.8 | VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | |
| GLEE-Plus | 69.7 | 65.6 | 67.7 | General Object Foundation Model for Images and Videos at Scale | |
| HTML-SwinL | 65.3 | 61.5 | 63.4 | HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation | - |
| ReferFormer (Large) | 64.6 | 61.3 | 62.9 | Language as Queries for Referring Video Object Segmentation | |
| SOC | 67.9 | 64.1 | 66.0 | SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | |
| HTML-Video-SwinB | 65.2 | 61.5 | 63.4 | HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation | - |
| HTML-ResNet101 | 59.8 | 57.3 | 58.5 | HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation | - |
| HTML-ResNet50 | 59.0 | 56.5 | 57.8 | HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation | - |
| VATEX | 67.5 | 63.3 | 65.4 | Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | |
| CMSA | 38.1 | 34.8 | 36.4 | Cross-Modal Self-Attention Network for Referring Image Segmentation | |
| SgMg | 67.4 | 63.9 | 65.7 | Spectrum-guided Multi-granularity Referring Video Object Segmentation | |
| GLEE-Pro | 72.9 | 68.2 | 70.6 | General Object Foundation Model for Images and Videos at Scale | |
| FindTrack | 72.0 | 68.6 | 70.3 | Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | |
| HyperSeg | - | - | 68.5 | HyperSeg: Towards Universal Visual Segmentation with Large Language Model | |
| R2VOS (Swin-T) | 61.5 | 58.9 | 60.2 | Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus | |
| HTML-Video-SwinS | 62.9 | 59.9 | 61.4 | HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation | - |