Question Answering on FinQA
Evaluation Metrics
Execution Accuracy
Program Accuracy
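As a rough illustration of how the two metrics differ: execution accuracy checks whether a predicted program evaluates to the gold answer, while program accuracy checks whether the predicted program itself matches the gold program. The sketch below is a simplified approximation, not the benchmark's official evaluation script. The tuple-based program format, the `#n` step references, the tolerance, and all function names are assumptions made for illustration, and program matching here only normalizes operand order for commutative operations rather than testing full mathematical equivalence.

```python
from math import isclose

# Hypothetical mini-DSL: each step is (op, arg1, arg2); an argument is either
# a number or a string "#n" that refers to the result of step n.
OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}
COMMUTATIVE = {"add", "multiply"}


def execute(program):
    """Run the steps in order and return the result of the last one."""
    results = []
    for op, x, y in program:
        a, b = (results[int(v[1:])] if isinstance(v, str) else v for v in (x, y))
        results.append(OPS[op](a, b))
    return results[-1]


def normalize(program):
    """Sort operands of commutative ops so add(2, 3) equals add(3, 2)."""
    return [
        (op, *sorted((x, y), key=str)) if op in COMMUTATIVE else (op, x, y)
        for op, x, y in program
    ]


def execution_accuracy(preds, gold_answers, rel_tol=1e-4):
    """Percentage of predicted programs whose result matches the gold answer."""
    hits = 0
    for program, gold in zip(preds, gold_answers):
        try:
            hits += isclose(execute(program), gold, rel_tol=rel_tol)
        except (ZeroDivisionError, IndexError, KeyError, ValueError):
            pass  # a program that fails to execute counts as a miss
    return 100.0 * hits / len(preds)


def program_accuracy(preds, gold_programs):
    """Percentage of predicted programs that match the gold program (simplified)."""
    hits = sum(normalize(p) == normalize(g) for p, g in zip(preds, gold_programs))
    return 100.0 * hits / len(preds)


# Toy data, invented for this example only.
gold_programs = [
    [("subtract", 120.0, 100.0), ("divide", "#0", 100.0)],
    [("add", 3.0, 2.0)],
    [("divide", 50.0, 100.0)],
]
gold_answers = [0.2, 5.0, 0.5]
preds = [
    [("subtract", 120.0, 100.0), ("divide", "#0", 100.0)],  # same program
    [("add", 2.0, 3.0)],                                    # operands swapped
    [("subtract", 100.5, 100.0)],                           # right answer, wrong program
]

print(execution_accuracy(preds, gold_answers))
print(program_accuracy(preds, gold_programs))
```

The third toy prediction shows why program accuracy tends to be the lower of the two numbers: a program can reach the correct value by a different route and still fail the program-level match.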
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model | Execution Accuracy | Program Accuracy |
---|---|---|
elastic-numerical-reasoning-with-adaptive | 68.96 | 65.21 |
finqa-a-dataset-of-numerical-reasoning-over | 57.43 | 55.52 |
finqa-a-dataset-of-numerical-reasoning-over | 65.05 | 63.52 |
are-chatgpt-and-gpt-4-general-purpose-solvers | 68.79 | - |
apollo-an-optimized-training-approach-for | 71.07 | 68.94 |
finqa-a-dataset-of-numerical-reasoning-over | 53.71 | 51.71 |