Multi-modal Classification on VGG-Sound
Metrics
Top-1 Accuracy
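As a minimal sketch of how this metric is usually computed: Top-1 accuracy is the share of samples whose highest-scoring predicted class matches the ground-truth label. The function name `top1_accuracy`, the example scores, and the example labels below are hypothetical and chosen only for illustration; they are not taken from any model on this benchmark. The result is expressed as a percentage, which appears to be the scale of the values reported here.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose top-scoring class equals the label.

    `scores[i][c]` is the model's score for class `c` on sample `i`
    (hypothetical layout, assumed for this sketch).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score is the model's top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Illustrative data only: 4 samples, 3 classes.
example_scores = [
    [0.1, 0.7, 0.2],  # top-1 prediction: class 1
    [0.6, 0.3, 0.1],  # top-1 prediction: class 0
    [0.2, 0.3, 0.5],  # top-1 prediction: class 2
    [0.4, 0.5, 0.1],  # top-1 prediction: class 1
]
example_labels = [1, 0, 1, 1]

print(top1_accuracy(example_scores, example_labels))  # 3 of 4 correct -> 75.0
```

Ties between equal scores resolve to the lowest class index in this sketch; evaluation code for a specific model may handle ties differently.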
Results
Top-1 accuracy reported for various models on this benchmark
Comparison Table
| Model Name | Top-1 Accuracy (%) |
|---|---|
| uavm-a-unified-model-for-audio-visual | 65.8 |
| multiscale-multimodal-transformer-for | 66.2 |
| contrastive-audio-visual-masked-autoencoder | 65.9 |
| avt-audio-video-transformer-for-multimodal | 63.9 |