Multi-Modal Classification on VGGSound
Evaluation Metric
Top-1 Accuracy
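As an illustration of the metric, Top-1 accuracy is the share of samples whose highest-scoring predicted class matches the ground-truth label. The sketch below is a minimal, hypothetical helper and not the evaluation code used by any of the listed papers. The function name `top1_accuracy` and the input layout (one list of class scores per sample) are assumptions made for the example.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose arg-max class equals the label.

    `scores[i][c]` is the (hypothetical) model score for class `c` on sample `i`;
    `labels[i]` is the ground-truth class index for sample `i`.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest-scoring class (first one wins on ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1

    # Report as a percentage, matching how the leaderboard values are written.
    return 100.0 * correct / len(labels)


# Toy example: 4 samples, 2 classes; 3 of 4 predictions are correct.
example_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
example_labels = [1, 0, 0, 0]
print(top1_accuracy(example_scores, example_labels))
```

On this toy input the predicted classes are 1, 0, 1, 0, so three of the four samples are counted as correct.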
Results
Performance of each model on this benchmark
Model | Top-1 Accuracy (%) | Paper Title | Repository |
---|---|---|---|
UAVM | 65.8 | UAVM: Towards Unifying Audio and Visual Models | |
MMT | 66.2 | Multiscale Multimodal Transformer for Multimodal Action Recognition | - |
CAV-MAE (Audio-Visual) | 65.9 | Contrastive Audio-Visual Masked Autoencoder | |
AVT | 63.9 | AVT: Audio-Video Transformer for Multimodal Action Recognition | - |