HyperAI
Audio Classification on VGGSound
Metrics
Top 1 Accuracy
Results
Performance results of various models on this benchmark
Model Name | Top 1 Accuracy | Paper Title
Mirasol3B | 69.8 | Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities
ONE-PEACE (Audio-Visual) | 68.2 | ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
MAViL | 67.1 | -
EquiAV | 67.1 | EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
MMT (Audio-Visual) | 66.2 | Multiscale Multimodal Transformer for Multimodal Action Recognition
CAV-MAE (Audio-Visual) | 65.9 | Contrastive Audio-Visual Masked Autoencoder
UAVM (Audio + Video) | 65.8 | UAVM: Towards Unifying Audio and Visual Models
Audiovisual Masked Autoencoder (Audiovisual, Single) | 65.0 | Audiovisual Masked Autoencoders
AVT (Audio-Visual) | 63.9 | AVT: Audio-Video Transformer for Multimodal Action Recognition
ONE-PEACE (Audio-Only) | 59.6 | ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
CAV-MAE (Audio-Only) | 59.5 | Contrastive Audio-Visual Masked Autoencoder
Audiovisual Masked Autoencoder (Audio-only, Single) | 57.2 | Audiovisual Masked Autoencoders
MAST (Audio Only) | 57.0 | Multiscale Audio Spectrogram Transformer for Efficient Audio Classification
UAVM (Audio Only) | 56.5 | UAVM: Towards Unifying Audio and Visual Models
MMT (Video) | 56.1 | Multiscale Multimodal Transformer for Multimodal Action Recognition
PlayItBackX3 | 53.7 | Play It Back: Iterative Attention for Audio Recognition
AVT (V) | 53.2 | AVT: Audio-Video Transformer for Multimodal Action Recognition
MBT (A) | 52.3 | Attention Bottlenecks for Multimodal Fusion
MBT (V) | 51.2 | Attention Bottlenecks for Multimodal Fusion
UAVM (Video Only) | 49.9 | UAVM: Towards Unifying Audio and Visual Models
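The Top 1 Accuracy figures in these results appear to be percentages: the share of test clips for which a model's highest-scoring class matches the ground-truth label. A minimal sketch of that computation follows, in plain Python. The function name `top1_accuracy` and the toy scores and labels are hypothetical illustrations, not taken from any of the listed papers or from the benchmark's own evaluation code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return top-1 accuracy as a percentage.

    scores: one row of per-class scores per clip (hypothetical input format).
    labels: the ground-truth class index for each clip.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")

    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the highest score in the row.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data (made up for illustration): 4 clips, 3 classes.
toy_scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1
]
toy_labels = [1, 0, 1, 1]

accuracy = top1_accuracy(toy_scores, toy_labels)
print(f"{accuracy:.1f}")  # 3 of 4 predictions match, so this prints 75.0
```

Individual papers may differ in evaluation details such as clip sampling or the exact test split, so scores in the table are not guaranteed to be strictly comparable.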