E Eval

Metrics

0-shot answer-only

5-shot answer-only

5-shot cot

average

llm_model

model_url

organization

parameters

release_date

updated_time

Results

Performance results of various models on this benchmark

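	0-shot answer-only	5-shot answer-only	5-shot cot	average	llm_model	model_url	organization	parameters	release_date	updated_time	Paper Title	Code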
API	89.0	88.7	88.8	88.8	Qwen-72b	https://huggingface.co/Qwen	Qwen	72B	2023.8.5	2024.8.11	-
