Visual Question Answering on MSRVTT-QA 2
Evaluation Metrics
Accuracy
Evaluation Results
Performance results of each model on this benchmark
Model Name | Accuracy | Paper Title | Repository |
---|---|---|---|
Just Ask | 0.415 | Just Ask: Learning to Answer Questions from Millions of Narrated Videos | |
SSML | 0.35 | Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning | |
Aurora (ours, r=64) | - | - | - |
FrozenBiLM | 0.470 | Zero-Shot Video Question Answering via Frozen Bidirectional Language Models | |