HyperAI超神経
Natural Language Inference

Natural Language Inference on the ANLI Test Set
Evaluation metrics: A1, A2, A3 — accuracy (%) on the test sets of ANLI rounds R1, R2, and R3, respectively.

Evaluation results: the performance of each model on this benchmark.
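For clarity, a minimal sketch of how the per-round scores are computed: each ANLI round has its own test split, and the leaderboard metric is plain label accuracy on that split, reported as a percentage. The predictions and labels below are toy values, not real model output.

```python
# A1/A2/A3 are per-round accuracies: the fraction of test examples in a
# given ANLI round where the predicted label matches the gold label,
# scaled to a percentage. Labels follow the ANLI release convention:
# 0 = entailment, 1 = neutral, 2 = contradiction.

def round_accuracy(predictions, gold_labels):
    """Accuracy (%) of predicted labels against gold labels for one round."""
    if len(predictions) != len(gold_labels):
        raise ValueError("prediction/label length mismatch")
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return 100.0 * correct / len(gold_labels)

# Toy example (hypothetical predictions):
preds_r1 = [0, 1, 2, 2]
gold_r1 = [0, 1, 1, 2]
print(f"A1 = {round_accuracy(preds_r1, gold_r1):.1f}")  # A1 = 75.0
```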
| Model | A1 | A2 | A3 | Paper Title |
|---|---|---|---|---|
| T5-3B (explanation prompting) | 81.8 | 72.5 | 74.8 | Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues} |
| PaLM 540B (Self Improvement, Self Consistency) | - | 66.5 | 67.9 | Large Language Models Can Self-Improve |
| PaLM 540B (Self Improvement, CoT Prompting) | - | 65.3 | 67.3 | Large Language Models Can Self-Improve |
| PaLM 540B (Self Improvement, Standard-Prompting) | - | 64.8 | 66.9 | Large Language Models Can Self-Improve |
| PaLM 540B (Self Consistency) | - | 64.5 | 63.4 | Large Language Models Can Self-Improve |
| PaLM 2-L (one-shot) | 73.1 | 63.4 | 67.1 | PaLM 2 Technical Report |
| T0-11B (explanation prompting) | 75.6 | 60.6 | 59.9 | Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues} |
| PaLM 540B (CoT Prompting) | - | 58.9 | 60.6 | Large Language Models Can Self-Improve |
| PaLM 540B (Standard-Prompting) | - | 55.8 | 55.8 | Large Language Models Can Self-Improve |
| ChatGPT | 62.3 | 52.6 | 54.1 | A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets |
| ALUM (RoBERTa-LARGE) | 72.3 | 52.1 | 48.4 | Adversarial Training for Large Neural Language Models |
| XLNet (Large) | 70.3 | 50.9 | 49.4 | XLNet: Generalized Autoregressive Pretraining for Language Understanding |
| InfoBERT (RoBERTa) | 75 | 50.5 | 47.7 | InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective |
| RoBERTa (Large) | 72.4 | 49.8 | 44.4 | RoBERTa: A Robustly Optimized BERT Pretraining Approach |
| PaLM 2-M (one-shot) | 58.1 | 49.5 | 54.5 | PaLM 2 Technical Report |
| PaLM 2-S (one-shot) | 53.1 | 48.8 | 53.2 | PaLM 2 Technical Report |
| T0-3B (CoT fine-tuned) | 41.7 | 37.2 | 41.9 | The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning |
| Flipped-3B | 39.99 | 37.05 | 37.73 | Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners |
| KiC-770M | 36.30 | 35.00 | 37.60 | Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models |
| RoE-3B | 35.49 | 34.64 | 31.22 | Exploring the Benefits of Training Expert Language Models over Instruction Tuning |