Video Instance Segmentation On Youtube Vis 2
Metrics
AP50
AP75
AR1
AR10
mask AP
Results
Performance results of various models on this benchmark
| Model Name | AP50 | AP75 | AR1 | AR10 | mask AP | Paper Title |
| --- | --- | --- | --- | --- | --- | --- |
| CAVIS(VIT-L, Offline) | 87.3 | 73.2 | 49.7 | 70.3 | 65.3 | Context-Aware Video Instance Segmentation |
| DVIS++(VIT-L, Offline) | 86.7 | 71.5 | 48.8 | 69.5 | 63.9 | DVIS++: Improved Decoupled Framework for Universal Video Segmentation |
| DVIS-DAQ(VIT-L, Offline) | 86.1 | 72.2 | 49.6 | 70.7 | 64.5 | DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries |
| RefineVIS (Swin-L, online) | 84.1 | 68.5 | 48.3 | 65.2 | 61.4 | RefineVIS: Video Instance Segmentation with Temporal Attention Refinement |
| DVIS(Swin-L) | 83.0 | 68.4 | 47.7 | 65.7 | 60.1 | DVIS: Decoupled Video Instance Segmentation Framework |
| DVIS++(VIT-L, Online) | 82.7 | 70.2 | 49.5 | 68.0 | 62.3 | DVIS++: Improved Decoupled Framework for Universal Video Segmentation |
| NOVIS (Swin-L) | 82.0 | 66.5 | 47.9 | 64.4 | 59.8 | NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation |
| TarViS (Swin-L) | 81.4 | 67.6 | 47.6 | 64.8 | 60.2 | TarViS: A Unified Approach for Target-based Video Segmentation |
| GRAtt-VIS (Swin-L) | 81.3 | 67.1 | 48.8 | 64.5 | 60.3 | GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation |
| GenVIS (Swin-L) | 80.9 | 66.5 | 49.1 | 64.7 | 60.1 | A Generalized Framework for Video Instance Segmentation |
| IDOL (Swin-L) | 80.8 | 63.5 | 45 | 60.1 | 56.1 | In Defense of Online Models for Video Instance Segmentation |
| MDQE(Swin-L) | 80.7 | 61.7 | 45.4 | 60.6 | 55.5 | MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos |
| VITA (Swin-L) | 80.6 | 61.0 | 47.7 | 62.6 | 57.5 | VITA: Video Instance Segmentation via Object Token Association |
| UniVS(Swin-L) | 79.4 | 63.3 | 46.2 | 63.1 | 57.9 | UniVS: Unified and Universal Video Segmentation with Prompts as Queries |
| Tube-Link(Swin-L) | 79.4 | 64.3 | 47.5 | 63.6 | 58.4 | Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation |
| DeVIS (Swin-L) | 77.7 | 59.8 | 43.8 | 57.8 | 54.4 | DeVIS: Making Deformable Transformers Work for Video Instance Segmentation |
| MinVIS (Swin-L) | 76.6 | 62 | 45.9 | 60.8 | 55.3 | MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training |
| BoxVIS(Swin-L & Box-sup) | 76.4 | 59.6 | 44.8 | 61.0 | 53.9 | BoxVIS: Video Instance Segmentation with Box Annotations |
| InstanceFormer (Swin-L) | 73.7 | 56.9 | 42.8 | 56.0 | 51.0 | InstanceFormer: An Online Video Instance Segmentation Framework |
| TarViS (Swin-T) | 71.6 | 56.6 | 42.2 | 57.2 | 50.9 | TarViS: A Unified Approach for Target-based Video Segmentation |