Natural Language Inference On Anli Test
Evaluation Metrics: A1, A2, A3

Evaluation Results
Performance results of each model on this benchmark.
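The A1, A2, and A3 columns report accuracy on the three ANLI test rounds (R1–R3). As a rough illustration of how such scores are typically computed, the minimal sketch below loads the ANLI test splits and scores a classifier's predictions. It assumes the dataset is available through the Hugging Face `datasets` hub under the `anli` identifier; the `predict` function is a hypothetical stand-in for a real NLI model, not part of any listed system.

```python
# Minimal sketch: computing A1/A2/A3 as accuracy on ANLI test rounds R1-R3.
# Assumes the Hugging Face `datasets` package and the `anli` hub identifier.
from datasets import load_dataset


def predict(premise: str, hypothesis: str) -> int:
    """Hypothetical model call; returns 0 (entailment), 1 (neutral), or 2 (contradiction)."""
    return 1  # placeholder: always predicts "neutral"


def round_accuracy(split_name: str) -> float:
    """Accuracy (in %) of `predict` on one ANLI test split, e.g. 'test_r1'."""
    split = load_dataset("anli", split=split_name)
    correct = sum(
        predict(ex["premise"], ex["hypothesis"]) == ex["label"] for ex in split
    )
    return 100.0 * correct / len(split)


if __name__ == "__main__":
    for metric, split_name in [("A1", "test_r1"), ("A2", "test_r2"), ("A3", "test_r3")]:
        print(f"{metric}: {round_accuracy(split_name):.1f}")
```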
| Model Name | A1 | A2 | A3 | Paper Title |
| --- | --- | --- | --- | --- |
| T5-3B (explanation prompting) | 81.8 | 72.5 | 74.8 | Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues} |
| PaLM 540B (Self Improvement, Self Consistency) | - | 66.5 | 67.9 | Large Language Models Can Self-Improve |
| PaLM 540B (Self Improvement, CoT Prompting) | - | 65.3 | 67.3 | Large Language Models Can Self-Improve |
| PaLM 540B (Self Improvement, Standard-Prompting) | - | 64.8 | 66.9 | Large Language Models Can Self-Improve |
| PaLM 540B (Self Consistency) | - | 64.5 | 63.4 | Large Language Models Can Self-Improve |
| PaLM 2-L (one-shot) | 73.1 | 63.4 | 67.1 | PaLM 2 Technical Report |
| T0-11B (explanation prompting) | 75.6 | 60.6 | 59.9 | Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues} |
| PaLM 540B (CoT Prompting) | - | 58.9 | 60.6 | Large Language Models Can Self-Improve |
| PaLM 540B (Standard-Prompting) | - | 55.8 | 55.8 | Large Language Models Can Self-Improve |
| ChatGPT | 62.3 | 52.6 | 54.1 | A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets |
| ALUM (RoBERTa-LARGE) | 72.3 | 52.1 | 48.4 | Adversarial Training for Large Neural Language Models |
| XLNet (Large) | 70.3 | 50.9 | 49.4 | XLNet: Generalized Autoregressive Pretraining for Language Understanding |
| InfoBERT (RoBERTa) | 75 | 50.5 | 47.7 | InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective |
| RoBERTa (Large) | 72.4 | 49.8 | 44.4 | RoBERTa: A Robustly Optimized BERT Pretraining Approach |
| PaLM 2-M (one-shot) | 58.1 | 49.5 | 54.5 | PaLM 2 Technical Report |
| PaLM 2-S (one-shot) | 53.1 | 48.8 | 53.2 | PaLM 2 Technical Report |
| T0-3B (CoT fine-tuned) | 41.7 | 37.2 | 41.9 | The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning |
| Flipped-3B | 39.99 | 37.05 | 37.73 | Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners |
| KiC-770M | 36.30 | 35.00 | 37.60 | Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models |
| RoE-3B | 35.49 | 34.64 | 31.22 | Exploring the Benefits of Training Expert Language Models over Instruction Tuning |