HyperAI
Audio Classification On Audioset
Metrics
Test mAP
Results
Performance of the listed models on this benchmark
| Model Name | Test mAP | Paper Title |
| --- | --- | --- |
| OmniVec2 | 0.558 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning |
| OmniVec | 0.548 | OmniVec: Learning robust representations with cross modal sharing |
| EquiAV | 0.546 | EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning |
| MAViL (Audio-Visual, single) | 0.533 | - |
| Audiovisual Masked Autoencoder (Audiovisual, Single) | 0.518 | Audiovisual Masked Autoencoders |
| CAV-MAE (Audio-Visual) | 0.512 | Contrastive Audio-Visual Masked Autoencoder |
| BEATs (Audio-only, Ensemble) | 0.506 | BEATs: Audio Pre-Training with Acoustic Tokenizers |
| UAVM (Audio + Video) | 0.504 | UAVM: Towards Unifying Audio and Visual Models |
| SSLAM (Audio-Only, Single) | 0.502 | SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes |
| mn40_as (Ensemble) | 0.498 | Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation |
| ATST-C2F (Single) | 0.497 | Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks |
| MBT (AS-500K training + Video) | 0.496 | Attention Bottlenecks for Multimodal Fusion |
| PaSST (Ensemble) | 0.496 | Efficient Training of Audio Transformers with Patchout |
| DyMN-L (Audio-Only, Single) | 0.490 | Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models |
| HTS-AT (Ensemble) | 0.487 | HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection |
| EAT | 0.486 | EAT: Self-Supervised Pre-Training with Efficient Audio Transformer |
| BEATs (Audio-only, Single) | 0.486 | BEATs: Audio Pre-Training with Acoustic Tokenizers |
| DTF-AT (Single) | 0.486 | DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification |
| M2D-AS/0.7 | 0.485 | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework |
| AST (Ensemble) | 0.485 | AST: Audio Spectrogram Transformer |