HyperAI
Common Sense Reasoning on CommonsenseQA
Evaluation metric: Accuracy

Evaluation results: the performance of each model on this benchmark.
| Model | Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| DEKCOR | 83.3 | Fusing Context Into Knowledge Graph for Commonsense Question Answering | |
| UnifiedQA 11B (fine-tuned) | 79.1 | UnifiedQA: Crossing Format Boundaries With a Single QA System | |
| RoBERTa+HyKAS Ma et al. (2019) | 73.2 | Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering | - |
| Chain of thought ASDiv | 28.6 | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | |
| KagNet | 58.9 | KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning | |
| OPT 66B (1-shot) | 66.4 | BloombergGPT: A Large Language Model for Finance | - |
| GPT-4o (HPT) | 92.54 | Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles | - |
| UnifiedQA 440M (fine-tuned) | 64 | UnifiedQA: Crossing Format Boundaries With a Single QA System | |
| UL2 20B (chain-of-thought) | 51.4 | UL2: Unifying Language Learning Paradigms | |
| STaR without Rationalization (on GPT-J) | 68.8 | STaR: Bootstrapping Reasoning With Reasoning | |
| Few-shot CoT GPT-J | 36.6 | STaR: Bootstrapping Reasoning With Reasoning | |
| PaLM 2 (few-shot, CoT, SC) | 90.4 | PaLM 2 Technical Report | |
| T5-XXL 11B (fine-tuned) | 78.1 | UnifiedQA: Crossing Format Boundaries With a Single QA System | |
| RoBERTa-Large 355M | 72.1 | RoBERTa: A Robustly Optimized BERT Pretraining Approach | |
| GPT-3 Direct Finetuned | 73.0 | Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention | |
| BLOOM 176B (1-shot) | 64.2 | BloombergGPT: A Large Language Model for Finance | - |
| UL2 20B (zero-shot) | 34.2 | UL2: Unifying Language Learning Paradigms | |
| DeBERTaV3-large+KEAR | 91.2 | Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention | |
| DRAGON | 78.2 | Deep Bidirectional Language-Knowledge Graph Pretraining | |
| BERT_CSlarge | 62.2 | Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models | - |
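The accuracy figures above are the fraction of CommonsenseQA questions for which a model's predicted answer key matches the gold key. A minimal sketch of that computation, using hypothetical predictions (the answer letters and the `accuracy` helper are illustrative, not part of any listed system):

```python
def accuracy(predictions, gold):
    """Fraction of questions where the predicted answer key equals the gold key."""
    assert len(predictions) == len(gold), "prediction/gold length mismatch"
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Hypothetical CommonsenseQA-style outputs: one answer key (A-E) per question.
preds = ["A", "C", "B", "E", "D"]
gold  = ["A", "C", "B", "D", "D"]
print(f"Accuracy: {accuracy(preds, gold):.1%}")  # prints Accuracy: 80.0%
```

Leaderboard entries report this value over the benchmark's evaluation split, scaled to a percentage.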