HyperAI
Audio Captioning On Clotho
Metrics
BLEU-4
CIDEr
METEOR
ROUGE-L
Results
Performance results of different models on this benchmark
| Model Name | BLEU-4 | CIDEr | METEOR | ROUGE-L | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- | --- |
| VALOR | 16.2 | 0.423 | 17.4 | 38.2 | VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset | |
| RNN-GRU-EncDec + VGGish + Word2Vec | - | 0.18 | - | - | Audio Captioning using Gated Recurrent Units | - |
| VAST | 19 | 0.519 | 19.3 | 40.8 | VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset | |
| Ensemble-RL | - | 0.468 | - | - | THE SJTU SYSTEM FOR DCASE2021 CHALLENGE TASK 6: AUDIO CAPTIONING BASED ON ENCODER PRE-TRAINING AND REINFORCEMENT LEARNING | |
| Ensemble | - | 0.400 | - | - | THE DCASE 2021 CHALLENGE TASK 6 SYSTEM: AUTOMATED AUDIO CAPTIONING WITH WEAKLY SUPERVISED PRE-TRAING AND WORD SELECTION METHODS | - |
| Ensemble | - | 0.319 | - | - | The NTT DCASE2020 Challenge Task 6 system: Automated Audio Captioning with Keywords and Sentence Length Estimation | - |
| Qwen-Audio | - | 0.441 | - | - | Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models | |
| SLAM-AAC | - | 0.515 | 0.197 | - | SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs | |
| LOAE | - | 0.513 | 0.197 | - | Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding | |
| Audio Flamingo (Pengi trainset) | 17.4 | 0.489 | 18.7 | 39.4 | Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities | |