HyperAI超神经

Multi-Task Language Understanding on BBH-NLP

Evaluation Metrics

Average (%)

Evaluation Results

Performance of each model on this benchmark

Comparison Table

| Model name | Average (%) |
| --- | --- |
| Model 1 | 86.3 |
| scaling-instruction-finetuned-language-models | 71.2 |
| orca-2-teaching-small-language-models-how-to | 45.93 |
| scaling-instruction-finetuned-language-models | 62.7 |
| scaling-instruction-finetuned-language-models | 70.0 |
| scaling-instruction-finetuned-language-models | 78.4 |
| scaling-instruction-finetuned-language-models | 78.2 |
| orca-2-teaching-small-language-models-how-to | 50.18 |
| evaluating-large-language-models-trained-on | 73.5 |
| Model 10 | 82.4 |
| Model 11 | 86.1 |
| scaling-instruction-finetuned-language-models | 72.4 |
| Model 13 | 85.9 |
| Model 14 | 84.07 |
| Model 15 | 81.0 |