Reading Comprehension on RACE

Metrics

Accuracy
Accuracy (High)
Accuracy (Middle)
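
All three metrics are plain multiple-choice classification accuracies: one pooled over every question, and one each for RACE's high-school and middle-school subsets. Below is a minimal sketch of how such metrics can be computed; the `RaceExample` record layout and the `predict` callback are illustrative assumptions, not part of this leaderboard or any official evaluation harness.

```python
# Minimal sketch of the three RACE accuracy metrics.
# RaceExample and predict() are hypothetical stand-ins, not an official API.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class RaceExample:
    article: str        # reading passage
    question: str
    options: list[str]  # four answer candidates
    answer: str         # gold label, one of "A".."D"
    subset: str         # "high" (high school) or "middle" (middle school)

def accuracy(examples: list[RaceExample],
             predict: Callable[[RaceExample], str],
             subset: Optional[str] = None) -> float:
    """Fraction of questions answered correctly, optionally on one subset."""
    pool = [ex for ex in examples if subset is None or ex.subset == subset]
    correct = sum(predict(ex) == ex.answer for ex in pool)
    return correct / len(pool)

# Accuracy          -> accuracy(data, model)
# Accuracy (High)   -> accuracy(data, model, subset="high")
# Accuracy (Middle) -> accuracy(data, model, subset="middle")
```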

Results

Performance results of various models on this benchmark. All figures are accuracy in percent; "High" and "Middle" refer to the RACE-high (high school) and RACE-middle (middle school) subsets.

| Model name | Accuracy | Accuracy (High) | Accuracy (Middle) | Paper |
|---|---|---|---|---|
| B10-10-10 | 85.7 | 84.4 | 88.8 | Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing |
| Megatron-BERT | 89.5 | 88.6 | 91.8 | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism |
| GPT-3 175B (zero-shot) | - | 45.5 | 58.4 | Language Models are Few-Shot Learners |
| LLaMA 33B (zero-shot) | - | 48.3 | 64.1 | LLaMA: Open and Efficient Foundation Language Models |
| LLaMA 65B (zero-shot) | - | 51.6 | 67.9 | LLaMA: Open and Efficient Foundation Language Models |
| RoBERTa | 83.2 | 81.3 | 86.5 | RoBERTa: A Robustly Optimized BERT Pretraining Approach |
| Megatron-BERT (ensemble) | 90.9 | 90.0 | 93.1 | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism |
| DeBERTa-large | 86.8 | - | - | DeBERTa: Decoding-enhanced BERT with Disentangled Attention |
| ALBERT (ensemble) | 91.4 | - | - | Improving Machine Reading Comprehension with Single-choice Decision and Transfer Learning |
| BLOOM 176B (one-shot) | - | 39.14 | 52.3 | BloombergGPT: A Large Language Model for Finance |
| GPT-NeoX (one-shot) | - | 34.33 | 41.23 | BloombergGPT: A Large Language Model for Finance |
| OPT 66B (one-shot) | - | 37.02 | 47.42 | BloombergGPT: A Large Language Model for Finance |
| Orca 2-7B | 80.79 | - | - | Orca 2: Teaching Small Language Models How to Reason |
| PaLM 8B (zero-shot) | - | 42.3 | 57.9 | PaLM: Scaling Language Modeling with Pathways |
| XLNet | - | 84.0 | 88.6 | XLNet: Generalized Autoregressive Pretraining for Language Understanding |
| BloombergGPT (one-shot) | - | 41.74 | 54.32 | BloombergGPT: A Large Language Model for Finance |
| PaLM 540B (zero-shot) | - | 49.1 | 68.1 | PaLM: Scaling Language Modeling with Pathways |
| LLaMA 7B (zero-shot) | - | 46.9 | 61.1 | LLaMA: Open and Efficient Foundation Language Models |
| Orca 2-13B | 82.87 | - | - | Orca 2: Teaching Small Language Models How to Reason |