
Reading Comprehension on RACE

Evaluation Metrics

Accuracy
Accuracy (High)
Accuracy (Middle)
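
All three metrics are plain classification accuracy over RACE's four-way multiple-choice questions: overall, and separately on the high-school (RACE-H) and middle-school (RACE-M) subsets. A minimal sketch of the computation, using hypothetical per-example records:

```python
from typing import List

def accuracy(preds: List[str], golds: List[str]) -> float:
    """Fraction of predicted options that match the gold option."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

# Hypothetical per-example records: predicted option, gold option
# ("A"-"D"), and which RACE subset the passage came from.
records = [
    {"pred": "A", "gold": "A", "subset": "high"},
    {"pred": "C", "gold": "B", "subset": "high"},
    {"pred": "D", "gold": "D", "subset": "middle"},
]

def subset_accuracy(name: str) -> float:
    rows = [r for r in records if r["subset"] == name]
    return accuracy([r["pred"] for r in rows], [r["gold"] for r in rows])

all_preds = [r["pred"] for r in records]
all_golds = [r["gold"] for r in records]
print(f"Accuracy:          {accuracy(all_preds, all_golds):.1%}")
print(f"Accuracy (High):   {subset_accuracy('high'):.1%}")
print(f"Accuracy (Middle): {subset_accuracy('middle'):.1%}")
```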

Evaluation Results

Performance of each model on this benchmark:

| Model | Accuracy | Accuracy (High) | Accuracy (Middle) | Paper |
| --- | --- | --- | --- | --- |
| B10-10-10 | 85.7 | 84.4 | 88.8 | Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing |
| Megatron-BERT | 89.5 | 88.6 | 91.8 | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism |
| GPT-3 175B (zero-shot) | - | 45.5 | - | Language Models are Few-Shot Learners |
| LLaMA 33B (zero-shot) | - | 48.3 | 64.1 | LLaMA: Open and Efficient Foundation Language Models |
| LLaMA 65B (zero-shot) | - | 51.6 | 67.9 | LLaMA: Open and Efficient Foundation Language Models |
| RoBERTa | 83.2 | 81.3 | 86.5 | RoBERTa: A Robustly Optimized BERT Pretraining Approach |
| Megatron-BERT (ensemble) | 90.9 | 90.0 | 93.1 | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism |
| DeBERTa-large | 86.8 | - | - | DeBERTa: Decoding-enhanced BERT with Disentangled Attention |
| GPT-3 175B (zero-shot) | - | - | 58.4 | Language Models are Few-Shot Learners |
| ALBERT (ensemble) | 91.4 | - | - | Improving Machine Reading Comprehension with Single-choice Decision and Transfer Learning |
| BLOOM 176B (one-shot) | - | 39.14 | 52.3 | BloombergGPT: A Large Language Model for Finance |
| GPT-NeoX (one-shot) | - | 34.33 | 41.23 | BloombergGPT: A Large Language Model for Finance |
| OPT 66B (one-shot) | - | 37.02 | 47.42 | BloombergGPT: A Large Language Model for Finance |
| Orca 2-7B | 80.79 | - | - | Orca 2: Teaching Small Language Models How to Reason |
| PaLM 8B (zero-shot) | - | 42.3 | 57.9 | PaLM: Scaling Language Modeling with Pathways |
| XLNet | - | 84.0 | 88.6 | XLNet: Generalized Autoregressive Pretraining for Language Understanding |
| BloombergGPT (one-shot) | - | 41.74 | 54.32 | BloombergGPT: A Large Language Model for Finance |
| PaLM 540B (zero-shot) | - | 49.1 | 68.1 | PaLM: Scaling Language Modeling with Pathways |
| LLaMA 7B (zero-shot) | - | 46.9 | 61.1 | LLaMA: Open and Efficient Foundation Language Models |
| Orca 2-13B | 82.87 | - | - | Orca 2: Teaching Small Language Models How to Reason |
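
The fine-tuned encoder entries (RoBERTa, Megatron-BERT, XLNet, DeBERTa) are trained directly on RACE, while the zero-/one-shot entries (GPT-3, LLaMA, PaLM, BloombergGPT) are typically scored by comparing the model's likelihood of each answer option given the passage and question. A minimal sketch of that likelihood-based scoring, assuming a Hugging Face causal LM (gpt2 here as a stand-in for the much larger leaderboard models) and placeholder passage/question text:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# gpt2 is a stand-in; the leaderboard models are far larger.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def option_logprob(context: str, option: str) -> float:
    """Sum log P(option tokens | context) under the model.

    Assumes tokenizing `context` yields a prefix of tokenizing
    `context + option` (holds for GPT-2 BPE when the option
    starts with a space).
    """
    ctx_len = tok(context, return_tensors="pt").input_ids.shape[1]
    ids = tok(context + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    # The token at position i + 1 is predicted from position i.
    return sum(
        logprobs[i, ids[0, i + 1]].item()
        for i in range(ctx_len - 1, ids.shape[1] - 1)
    )

# Placeholder RACE-style prompt; real evaluations include the full article.
context = (
    "Article: The library closes at 5 p.m. on Sundays.\n"
    "Question: When does the library close on Sundays?\n"
    "Answer:"
)
options = [" at noon", " at 5 p.m.", " at 9 a.m.", " it never closes"]
pred = max(range(4), key=lambda i: option_logprob(context, options[i]))
print("Predicted option:", "ABCD"[pred])
```

The option with the highest conditional log-likelihood is taken as the prediction, which then feeds the accuracy computation shown under Evaluation Metrics above.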