AlignBench
Metrics
language_avg.
language_chi.
language_fund.
language_open.
language_pro.
language_role.
language_writ.
llm_model
model_url
organization
overall
parameters
reasoning_avg.
reasoning_logi.
reasoning_math.
release_date
updated_time
Results
Performance results of various models on this benchmark
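| | language_avg. | language_chi. | language_fund. | language_open. | language_pro. | language_role. | language_writ. | llm_model | model_url | organization | overall | parameters | reasoning_avg. | reasoning_logi. | reasoning_math. | release_date | updated_time | Paper Title | Code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| API | 8.29 | 7.33 | 7.99 | 8.61 | 8.65 | 8.47 | 8.67 | gpt-4-1106-preview | https://community.openai.com/t/gpt-4-1106-preview-vs-gpt-4/588424 | OpenAI | 8.01 | N/A | 7.73 | 7.66 | 7.8 | 2023.11.6 | 2024.8.25 | - | |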