| Model | Accuracy (%) | Paper | Code |
| --- | --- | --- | --- |
| GPT-3 175B (1-shot) | 71.2 | Language Models are Few-Shot Learners | |
| ST-MoE-32B 269B (fine-tuned) | 95.2 | ST-MoE: Designing Stable and Transferable Sparse Expert Models | - |
| SparseGPT (175B, 4:8 sparsity) | 68.35 | SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot | |
| LLaMA 13B + CFG (0-shot) | 79.1 | Stay on topic with Classifier-Free Guidance | - |
| LLaMA 65B + CFG (0-shot) | 84.2 | Stay on topic with Classifier-Free Guidance | - |
| LLaMA 3 8B + MoSLoRA (fine-tuned) | 90.5 | Mixture-of-Subspaces in Low-Rank Adaptation | |