HyperAI

Question Answering on DROP

Metrics

Accuracy

Results

Performance results of various models on this benchmark

| Model name | Accuracy | Paper title | Repository |
|---|---|---|---|
| PaLM 540B (Self Consistency) | 78.2 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Improvement, Self Consistency) | 83 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Improvement, Standard-Prompting) | 71.7 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Standard-Prompting) | 60 | Large Language Models Can Self-Improve | - |
| PaLM 540B (CoT Prompting) | 70.6 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Improvement, CoT Prompting) | 76.2 | Large Language Models Can Self-Improve | - |