Question Answering on PIQA
Metrics: Accuracy

Results
Performance of the various models on this benchmark is listed below.
| Model | Accuracy (%) | Paper |
| --- | --- | --- |
| Open-LLaMA-3B-v2 | 76.2 | Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning |
| LLaMA 33B (0-shot) | 82.3 | LLaMA: Open and Efficient Foundation Language Models |
| DeBERTa-Large 304M (classification-based) | 85.9 | Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering |
| OPT 66B (1-shot) | 77.6 | BloombergGPT: A Large Language Model for Finance |
| LLaMA 2 13B (0-shot) | 80.5 | Llama 2: Open Foundation and Fine-Tuned Chat Models |
| LLaMA 2 34B (0-shot) | 81.9 | Llama 2: Open Foundation and Fine-Tuned Chat Models |
| UnifiedQA 3B | 85.3 | UnifiedQA: Crossing Format Boundaries With a Single QA System |
| ExDeBERTa 567M | 85.5 | Task Compass: Scaling Multi-task Pre-training with Task Prefix |
| GPT-2-XL 1.5B | 70.5 | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions |
| PaLM 2-M (1-shot) | 83.2 | PaLM 2 Technical Report |
| Sheared-LLaMA-2.7B | 75.8 | Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning |
| LLaMA-3 8B + MixLoRA | 87.6 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts |
| GPT-2-small 124M (fine-tuned) | 69.2 | PIQA: Reasoning about Physical Commonsense in Natural Language |
| LLaMA-2 7B + MixLoRA | 83.2 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts |
| SparseGPT 175B (50% sparsity) | 80.63 | SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot |
| GPT-3 175B (0-shot) | 81.0 | Language Models are Few-Shot Learners |
| LLaMA-3 8B + MoSLoRA | 89.7 | Mixture-of-Subspaces in Low-Rank Adaptation |
| Sheared-LLaMA-1.3B | 73.4 | Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning |
| LLaMA 7B (0-shot) | 79.8 | LLaMA: Open and Efficient Foundation Language Models |
| LaMini-F-T5 783M | 70.6 | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions |
Showing 20 of 67 leaderboard entries.
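Reported accuracies depend on the scoring protocol. PIQA is a two-way multiple-choice task (a physical-commonsense goal plus two candidate solutions), and zero-shot numbers are commonly obtained by comparing a language model's log-likelihood of the two candidates and picking the higher-scoring one. The snippet below is a minimal sketch of that protocol using the Hugging Face `datasets` and `transformers` libraries; the `gpt2` checkpoint, the validation subsample size, and the use of raw (non-length-normalized) log-probabilities are illustrative assumptions, not the exact setup behind any row in the table above.

```python
# Minimal sketch: zero-shot PIQA accuracy via log-likelihood comparison.
# Assumptions: a Hugging Face causal LM ("gpt2" is a placeholder), raw
# summed log-probabilities (some papers length-normalize instead), and a
# small validation subsample for a quick check.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def solution_logprob(goal: str, solution: str) -> float:
    """Sum of log p(token) over the solution tokens, conditioned on the goal."""
    # Approximation: assumes the goal's tokenization is a prefix of the
    # tokenization of goal + " " + solution (holds for typical BPE tokenizers
    # when the continuation starts with a space).
    prompt_len = tokenizer(goal, return_tensors="pt").input_ids.shape[1]
    ids = tokenizer(goal + " " + solution, return_tensors="pt").input_ids
    logits = model(ids).logits
    # Logits at position i predict token i + 1, so shift targets by one.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    sol_positions = torch.arange(prompt_len - 1, targets.shape[0])
    return log_probs[sol_positions, targets[sol_positions]].sum().item()

# Dataset id and loading flags may vary with library version.
piqa = load_dataset("piqa", split="validation")  # fields: goal, sol1, sol2, label
subset = piqa.select(range(200))
hits = sum(
    int(max((0, 1), key=lambda i: solution_logprob(ex["goal"], ex[f"sol{i + 1}"]))
        == ex["label"])
    for ex in subset
)
print(f"accuracy on {len(subset)} examples: {hits / len(subset):.3f}")
```

Classification-based entries (e.g. the DeBERTa and fine-tuned GPT-2 rows) instead train a classifier head over the candidate pairs, so their numbers are not directly comparable to the zero-shot likelihood protocol sketched here.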