Common Sense Reasoning on CommonsenseQA
Evaluation Metric: Accuracy
Evaluation Results: performance of each model on this benchmark.
| Model Name | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| DEKCOR | 83.3 | Fusing Context Into Knowledge Graph for Commonsense Question Answering | |
| UnifiedQA 11B (fine-tuned) | 79.1 | UnifiedQA: Crossing Format Boundaries With a Single QA System | |
| RoBERTa+HyKAS Ma et al. (2019) | 73.2 | Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering | - |
| Chain of thought ASDiv | 28.6 | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | |
| KagNet | 58.9 | KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning | |
| OPT 66B (1-shot) | 66.4 | BloombergGPT: A Large Language Model for Finance | - |
| GPT-4o (HPT) | 92.54 | Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles | - |
| UnifiedQA 440M (fine-tuned) | 64 | UnifiedQA: Crossing Format Boundaries With a Single QA System | |
| UL2 20B (chain-of-thought) | 51.4 | UL2: Unifying Language Learning Paradigms | |
| STaR without Rationalization (on GPT-J) | 68.8 | STaR: Bootstrapping Reasoning With Reasoning | |
| Few-shot CoT GPT-J | 36.6 | STaR: Bootstrapping Reasoning With Reasoning | |
| PaLM 2 (few-shot, CoT, SC) | 90.4 | PaLM 2 Technical Report | |
| T5-XXL 11B (fine-tuned) | 78.1 | UnifiedQA: Crossing Format Boundaries With a Single QA System | |
| RoBERTa-Large 355M | 72.1 | RoBERTa: A Robustly Optimized BERT Pretraining Approach | |
| GPT-3 Direct Finetuned | 73.0 | Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention | |
| BLOOM 176B (1-shot) | 64.2 | BloombergGPT: A Large Language Model for Finance | - |
| UL2 20B (zero-shot) | 34.2 | UL2: Unifying Language Learning Paradigms | |
| DeBERTaV3-large+KEAR | 91.2 | Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention | |
| DRAGON | 78.2 | Deep Bidirectional Language-Knowledge Graph Pretraining | |
| BERT_CSlarge | 62.2 | Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models | - |
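Accuracy on CommonsenseQA is the percentage of multiple-choice questions for which the model's selected answer matches the gold answer key. Below is a minimal evaluation sketch, assuming the Hugging Face "commonsense_qa" dataset layout; the predict_choice function is a hypothetical stand-in for any of the models in the table, not an implementation taken from this page.

```python
# Minimal sketch of CommonsenseQA accuracy evaluation (assumptions noted above).
from datasets import load_dataset


def predict_choice(question: str, labels: list[str], texts: list[str]) -> str:
    """Hypothetical model call: return the label ("A"-"E") of the chosen answer."""
    raise NotImplementedError("plug in an actual model here")


def commonsenseqa_accuracy(split: str = "validation") -> float:
    # Official test answers are hidden, so the validation split is commonly used offline.
    ds = load_dataset("commonsense_qa", split=split)
    correct = 0
    for ex in ds:
        pred = predict_choice(ex["question"], ex["choices"]["label"], ex["choices"]["text"])
        correct += int(pred == ex["answerKey"])
    # Reported as a percentage, matching the table above.
    return 100.0 * correct / len(ds)
```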