Video-to-Sound Generation on VGG-Sound
Metrics
FAD
FD
Results
Performance results of various models on this benchmark
Comparison table
Model name | FAD | FD |
---|---|---|
read-watch-and-scream-sound-generation-from | 2.16 | 15.24 |
frieren-efficient-video-to-audio-generation | 1.32 | 12.26 |
taming-multimodal-joint-training-for-high | 0.79 | 5.22 |
masked-generative-video-to-audio-transformers | 2.04 | - |
taming-multimodal-joint-training-for-high | 0.97 | 4.72 |
temporally-aligned-audio-for-video-with | 1.92 | - |
v2a-mapper-a-lightweight-solution-for-vision | 0.841 | 24.168 |
tell-what-you-hear-from-what-you-see-video-to | 2.38 | - |