HyperAI

Question Answering on DROP Test

Metrics

F1

Results

Performance results of various models on this benchmark

Comparison table
Model name                                     F1
question-directed-graph-attention-network-for  88.38
palm-2-technical-report-1                      85.0
language-models-are-few-shot-learners          36.5
gpt-4-technical-report-1                       80.9
neural-symbolic-reader-scalable-integration    81.71
numnet-machine-reading-comprehension-with      67.97
orca-2-teaching-small-language-models-how-to   60.26
gpt-4-technical-report-1                       64.1
orca-2-teaching-small-language-models-how-to   57.97
reasoning-like-program-executors-1             87.6
drop-a-reading-comprehension-benchmark         32.7
giving-bert-a-calculator-finding-operations    81.78
drop-a-reading-comprehension-benchmark         47.01
injecting-numerical-reasoning-skills-into      72.4
a-multi-type-multi-span-network-for-reading    79.88
tag-based-multi-span-extraction-in-reading     80.7