HyperAI

Multi-Task Language Understanding on BBH-alg

Metrics

Average (%)

Results

Performance results of various models on this benchmark

| Model name | Average (%) | Paper title |
| --- | --- | --- |
| Flan-PaLM 540B (3-shot, fine-tuned, CoT) | 61.3 | Scaling Instruction-Finetuned Language Models |
| code-davinci-002 175B (CoT) | 73.9 | Evaluating Large Language Models Trained on Code |
| PaLM 540B (CoT) | 57.6 | Scaling Instruction-Finetuned Language Models |
| Flan-PaLM 540B (3-shot, fine-tuned, CoT + SC) | 66.5 | Scaling Instruction-Finetuned Language Models |
| PaLM 540B | 38.3 | Scaling Instruction-Finetuned Language Models |
| Flan-PaLM 540B (3-shot, fine-tuned) | 48.2 | Scaling Instruction-Finetuned Language Models |
| PaLM 540B (CoT + self-consistency) | 62.2 | Scaling Instruction-Finetuned Language Models |