HyperAI超神経

Mathematical Reasoning on Lila (IID)

Evaluation Metric

Accuracy

Evaluation Results

Performance results of each model on this benchmark.

| Model | Accuracy | Paper Title | Repository |
|---|---|---|---|
| Bhāskara-A (Fine-tuned, 2.7B) | 0.252 | Lila: A Unified Benchmark for Mathematical Reasoning | |
| Neo-P (Fine-tuned, 2.7B) | 0.394 | Lila: A Unified Benchmark for Mathematical Reasoning | |
| Bhāskara-P (Fine-tuned, 2.7B) | 0.48 | Lila: A Unified Benchmark for Mathematical Reasoning | |
| GPT-3 (Few-Shot, 175B) | 0.384 | Lila: A Unified Benchmark for Mathematical Reasoning | |
| Neo-A (Fine-tuned, 2.7B) | 0.204 | Lila: A Unified Benchmark for Mathematical Reasoning | |
| Codex (Few-Shot, 175B) | 0.604 | Lila: A Unified Benchmark for Mathematical Reasoning | |