Multi-Modal Classification on VGG-Sound
Evaluation Metric
Top-1 Accuracy
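As an illustration of the metric, here is a minimal sketch of how Top-1 accuracy is commonly computed: a sample counts as correct when the class with the highest score matches the ground-truth label. The function name `top1_accuracy` and the toy scores are assumptions made for this example. They are not taken from the benchmark or from the evaluation code of any listed model.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose highest-scoring class equals the label.

    Hypothetical helper for illustration; not the benchmark's official scorer.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score (the first one wins on ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data: 4 samples, 3 classes. Values are invented for the example.
example_scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.6, 0.3, 0.1],  # predicted class 0
    [0.2, 0.3, 0.5],  # predicted class 2
    [0.4, 0.5, 0.1],  # predicted class 1
]
example_labels = [1, 0, 2, 0]

example_result = top1_accuracy(example_scores, example_labels)
print(example_result)  # 75.0 (3 of 4 predictions match)
```

The result is expressed on a 0 to 100 scale, which appears to match how the scores in the table are reported.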
Results
Performance of each model on this benchmark
Model | Top-1 Accuracy (%) | Paper Title | Repository |
---|---|---|---|
UAVM | 65.8 | UAVM: Towards Unifying Audio and Visual Models | |
MMT | 66.2 | Multiscale Multimodal Transformer for Multimodal Action Recognition | - |
CAV-MAE (Audio-Visual) | 65.9 | Contrastive Audio-Visual Masked Autoencoder | |
AVT | 63.9 | AVT: Audio-Video Transformer for Multimodal Action Recognition | - |