HyperAI超神経

Video-to-Sound Generation on VGGSound

Evaluation Metrics

FAD
FD
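
The page lists the metrics by abbreviation only. As a hedged sketch, assuming FAD and FD are the usual Fréchet-style distances (Fréchet Audio Distance and Fréchet Distance) computed between Gaussian fits of embeddings of reference and generated audio, the quantity would take the form below. The symbols $\mu_r, \Sigma_r$ (mean and covariance of reference embeddings) and $\mu_g, \Sigma_g$ (mean and covariance of generated embeddings) are notation introduced here for illustration and do not come from the page. Which embedding model each metric uses is not stated in the source.

```latex
d^2\big(\mathcal{N}(\mu_r, \Sigma_r),\, \mathcal{N}(\mu_g, \Sigma_g)\big)
  = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Under this assumption, lower values indicate generated audio whose embedding statistics are closer to the reference set.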

Results

Performance of each model on this benchmark

Comparison Table
Model name                                      FAD    FD
read-watch-and-scream-sound-generation-from     2.16   15.24
frieren-efficient-video-to-audio-generation     1.32   12.26
taming-multimodal-joint-training-for-high       0.79   5.22
masked-generative-video-to-audio-transformers   2.04   -
taming-multimodal-joint-training-for-high       0.97   4.72
temporally-aligned-audio-for-video-with         1.92   -
v2a-mapper-a-lightweight-solution-for-vision    0.841  24.168
tell-what-you-hear-from-what-you-see-video-to   2.38   -