Video-to-Sound Generation on VGGSound

Evaluation Metrics

FAD (Fréchet Audio Distance)
FD (Fréchet Distance)

Evaluation Results

Performance of each model on this benchmark

| Model Name | FAD | FD | Paper Title | Repository |
|---|---|---|---|---|
| ReWas | 2.16 | 15.24 | Read, Watch and Scream! Sound Generation from Text and Video | - |
| Frieren | 1.32 | 12.26 | Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching | - |
| MMAudio-S-16kHz | 0.79 | 5.22 | Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis | - |
| MaskVAT_Hybrid | 2.04 | - | Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity | - |
| MMAudio-L-44.1kHz | 0.97 | 4.72 | Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis | - |
| V-AURA | 1.92 | - | Temporally Aligned Audio for Video with Autoregression | - |
| V2A-Mapper | 0.841 | 24.168 | V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models | - |
| VATT-LLama | 2.38 | - | Tell What You Hear From What You See -- Video to Audio Generation Through Text | - |