HyperAI超神经
Logical Reasoning On Lingoly
Evaluation Metrics
Delta_NoContext
Exact Match Accuracy
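The page does not define these two metrics. As a minimal sketch, assume Exact Match Accuracy is the fraction of answers that match the reference string exactly, and Delta_NoContext is the exact-match score with the puzzle context minus the score without it. Both definitions, the function names, and the whitespace trimming are assumptions made for illustration, not the benchmark's official scoring code.

```python
from typing import Sequence


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to their reference after trimming whitespace.

    Assumption: whitespace trimming is the only normalisation applied.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


def delta_no_context(
    with_context: Sequence[str],
    without_context: Sequence[str],
    references: Sequence[str],
) -> float:
    """Assumed definition: score with context minus score without context."""
    return exact_match_accuracy(with_context, references) - exact_match_accuracy(
        without_context, references
    )


if __name__ == "__main__":
    # Hypothetical toy answers, not benchmark data.
    refs = ["a", "b", "c", "d"]
    ctx = ["a", "b", "c", "x"]
    no_ctx = ["a", "x", "x", "x"]
    print(exact_match_accuracy(ctx, refs))
    print(delta_no_context(ctx, no_ctx, refs))
```

Under this reading, a larger Delta_NoContext would suggest that a model's score depends more on the puzzle context than on answers it could produce without it.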
Results
Performance of each model on this benchmark
| Model | Delta_NoContext | Exact Match Accuracy | Paper Title |
| --- | --- | --- | --- |
| Gemini 1.5 Pro | 23.4% | 32.1% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| GPT-4 | 21.5% | 33.4% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| GPT-3.5 | 11.2% | 21.2% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| Claude Opus | 28.8% | 46.3% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| Command R+ | 11.6% | 21.5% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| Llama 3 8B | 4.9% | 11.4% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| Llama 3 70B | 2.9% | 10.3% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| Llama 2 70B | 1.1% | 6.4% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| GPT-4o | 25.1% | 37.6% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| Mixtral 8x7B | 6.4% | 14.2% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |
| Gemma 7B | 2.2% | 4.9% | LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages |