Question Answering on NewsQA
Metrics
EM (Exact Match): the fraction of predictions that match a gold answer exactly after normalization
F1: token-level overlap between the predicted and gold answer spans
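For reference, here is a minimal sketch of how these two metrics are typically computed for extractive QA, following the SQuAD-style evaluation that NewsQA is commonly scored with. The function names are illustrative, and the official evaluation script may differ in details (for example, taking the maximum score over multiple gold answers per question):

```python
import re
import string
from collections import Counter

def normalize_answer(s: str) -> str:
    """Lowercase, remove punctuation and articles, collapse whitespace
    (the normalization used by SQuAD-style evaluation scripts)."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction: str, gold: str) -> float:
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))

def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 over the bag-of-words overlap of the two answers."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# A partially correct span earns partial F1 credit but no EM credit.
print(exact_match("the White House", "White House"))  # 1.0 (articles dropped)
print(f1_score("president Obama", "Barack Obama"))    # 0.5
```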
Results
Performance results of various models on this benchmark
| Model Name | EM | F1 | Paper Title | Repository |
|---|---|---|---|---|
| deepseek-r1 | 80.57 | 86.13 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | |
| OpenAI/GPT-4o | 70.21 | 81.74 | GPT-4o as the Gold Standard: A Scalable and General Purpose Approach to Filter Language Model Pretraining Data | - |
| DecaProp | 53.1 | 66.3 | Densely Connected Attention Propagation for Reading Comprehension | |
| FastQAExt | 43.7 | 56.1 | Making Neural QA as Simple as Possible but not Simpler | |
| Riple/Saanvi-v0.1 | 72.61 | 85.44 | Time-series Transformer Generative Adversarial Networks | |
| LinkBERT (large) | - | 72.6 | LinkBERT: Pretraining Language Models with Document Links | |
| BERT+ASGen | 54.7 | 64.5 | - | - |
| Anthropic/claude-3-5-sonnet | 74.23 | 82.3 | Claude 3.5 Sonnet Model Card Addendum | - |
| xAI/grok-2-1212 | 70.57 | 88.24 | XAI for Transformers: Better Explanations through Conservative Propagation | |
| OpenAI/o1-2024-12-17-high | 81.44 | 88.7 | 0/1 Deep Neural Networks via Block Coordinate Descent | - |
| Google/Gemini 1.5 Flash | 68.75 | 79.91 | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | |
| AMANDA | 48.4 | 63.7 | A Question-Focused Multi-Factor Attention Network for Question Answering | |
| OpenAI/o3-mini-2025-01-31-high | 96.52 | 92.13 | o3-mini vs DeepSeek-R1: Which One is Safer? | |
| DyREX | - | 68.53 | DyREx: Dynamic Query Representation for Extractive Question Answering | |
| MINIMAL(Dyn) | 50.1 | 63.2 | Efficient and Robust Question Answering from Minimal Context over Documents | |
| SpanBERT | - | 73.6 | SpanBERT: Improving Pre-training by Representing and Predicting Spans | |