Question Answering on MultiRC

Metrics

Results

Performance results of various models on this benchmark

Model Name	EM	Paper Title	Repository
Hybrid H3 125M (3-shot, logit scoring)	48.9	Hungry Hungry Hippos: Towards Language Modeling with State Space Models
FLAN 137B (1-shot)	-	Finetuned Language Models Are Zero-Shot Learners
DeBERTa-1.5B	63.7	DeBERTa: Decoding-enhanced BERT with Disentangled Attention
BLOOM 176B (1-shot)	-	BloombergGPT: A Large Language Model for Finance
T5-XXL 11B (fine-tuned)	-	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
GPT-NeoX 20B (1-shot)	-	BloombergGPT: A Large Language Model for Finance
T5-11B	63.3	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
OPT 66B (1-shot)	-	BloombergGPT: A Large Language Model for Finance
KELM (finetuning BERT-large based single model)	27.2	KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs
Hybrid H3 355M (0-shot, logit scoring)	59.5	Hungry Hungry Hippos: Towards Language Modeling with State Space Models
PaLM 2-S (one-shot)	-	PaLM 2 Technical Report
ST-MoE-L 4.1B (fine-tuned)	-	ST-MoE: Designing Stable and Transferable Sparse Expert Models
BERT-large(single model)	24.1	BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
GPT-3 175B (Few-Shot)	-	Language Models are Few-Shot Learners
Hybrid H3 355M (3-shot, logit scoring)	59.7	Hungry Hungry Hippos: Towards Language Modeling with State Space Models
PaLM 540B (finetuned)	69.2	PaLM: Scaling Language Modeling with Pathways
Neo-6B (QA)	-	Ask Me Anything: A simple strategy for prompting language models
Hybrid H3 125M (0-shot, logit scoring)	51.4	Hungry Hungry Hippos: Towards Language Modeling with State Space Models
Turing NLR v5 XXL 5.4B (fine-tuned)	63	Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE
Neo-6B (few-shot)	-	Ask Me Anything: A simple strategy for prompting language models
