| Model | Avg | G1-6 | G7-12 | IMG | LAN | NAT | NO | SOC | TXT | Paper |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-3 (QCM→A, 2-shot) | 73.97 | 76.80 | 68.89 | 67.28 | 76.00 | 74.64 | 77.42 | 69.74 | 74.44 | Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering |
| GPT-3 CoT (QCM→AE, 2-shot) | 74.61 | 78.49 | 67.63 | 66.09 | 77.55 | 76.60 | 79.58 | 65.92 | 75.51 | Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering |
| Honeybee | 94.39 | 95.04 | 93.21 | 93.75 | 91.18 | 95.20 | 93.17 | 96.29 | 94.48 | Honeybee: Locality-enhanced Projector for Multimodal LLM |
| GPT-3 CoT (QCM→ALE, 2-shot) | 75.17 | 78.23 | 69.68 | 67.43 | 78.09 | 75.44 | 79.93 | 70.87 | 74.68 | Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering |
| UnifiedQA-BASE CoT (QCM→ALE) | 74.11 | 77.06 | 68.82 | 66.53 | 78.91 | 71.00 | 81.81 | 76.04 | 66.42 | Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering |
| Multimodal-CoT | 91.68 | 92.44 | 90.31 | 88.80 | 90.82 | 95.91 | 92.89 | 82.00 | 95.26 | Multimodal Chain-of-Thought Reasoning in Language Models |
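A minimal sketch for querying the rows above, assuming the numeric columns are the ScienceQA splits in the order Avg, G1-6, G7-12, IMG, LAN, NAT, NO, SOC, TXT (an assumption inferred from the source papers' reported numbers). It finds the top-scoring model per column:

```python
# Leaderboard rows copied from the table above.
# Assumed column order: Avg, G1-6, G7-12, IMG, LAN, NAT, NO, SOC, TXT.
rows = {
    "GPT-3 (QCM→A, 2-shot)": [73.97, 76.80, 68.89, 67.28, 76.00, 74.64, 77.42, 69.74, 74.44],
    "GPT-3 CoT (QCM→AE, 2-shot)": [74.61, 78.49, 67.63, 66.09, 77.55, 76.60, 79.58, 65.92, 75.51],
    "Honeybee": [94.39, 95.04, 93.21, 93.75, 91.18, 95.20, 93.17, 96.29, 94.48],
    "GPT-3 CoT (QCM→ALE, 2-shot)": [75.17, 78.23, 69.68, 67.43, 78.09, 75.44, 79.93, 70.87, 74.68],
    "UnifiedQA-BASE CoT (QCM→ALE)": [74.11, 77.06, 68.82, 66.53, 78.91, 71.00, 81.81, 76.04, 66.42],
    "Multimodal-CoT": [91.68, 92.44, 90.31, 88.80, 90.82, 95.91, 92.89, 82.00, 95.26],
}
cols = ["Avg", "G1-6", "G7-12", "IMG", "LAN", "NAT", "NO", "SOC", "TXT"]

# Best model for each split (ties broken by dict order).
best = {c: max(rows, key=lambda m: rows[m][i]) for i, c in enumerate(cols)}
for c in cols:
    print(f"{c}: {best[c]} ({rows[best[c]][cols.index(c)]:.2f})")
```

On these rows, Honeybee leads most splits, while Multimodal-CoT takes NAT and TXT.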