AlignBench
Metrics
language_avg.
language_chi.
language_fund.
language_open.
language_pro.
language_role.
language_writ.
llm_model
model_url
organization
overall
parameters
reasoning_avg.
reasoning_logi.
reasoning_math.
release_date
updated_time
Results
Performance results of different models on this benchmark
Comparison table
Model name | language_avg. | language_chi. | language_fund. | language_open. | language_pro. | language_role. | language_writ. | llm_model | model_url | organization | overall | parameters | reasoning_avg. | reasoning_logi. | reasoning_math. | release_date | updated_time |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Model 1 | 8.29 | 7.33 | 7.99 | 8.61 | 8.65 | 8.47 | 8.67 | gpt-4-1106-preview | https://community.openai.com/t/gpt-4-1106-preview-vs-gpt-4/588424 | OpenAI | 8.01 | N/A | 7.73 | 7.66 | 7.8 | 2023.11.6 | 2024.8.25 |
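
The page does not say how the three aggregate columns are computed. The sketch below assumes the simplest possibility: that language_avg. and reasoning_avg. are unweighted means of their sub-scores, and that overall is the mean of those two averages. Under that assumption, the figures in the gpt-4-1106-preview row are reproduced after rounding to two decimals. The helper name `mean` and the dictionary layout are illustrative only and are not part of the benchmark's tooling.

```python
# Sub-scores copied from the gpt-4-1106-preview row of the table above.
language_scores = {
    "language_chi.": 7.33,
    "language_fund.": 7.99,
    "language_open.": 8.61,
    "language_pro.": 8.65,
    "language_role.": 8.47,
    "language_writ.": 8.67,
}
reasoning_scores = {
    "reasoning_logi.": 7.66,
    "reasoning_math.": 7.8,
}


def mean(values):
    """Unweighted arithmetic mean (hypothetical helper for this sketch)."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: each category average is the plain mean of its sub-scores.
language_avg = round(mean(language_scores.values()), 2)
reasoning_avg = round(mean(reasoning_scores.values()), 2)

# Assumption: overall is the mean of the two category averages.
overall = round(mean([language_avg, reasoning_avg]), 2)

print(language_avg, reasoning_avg, overall)  # prints: 8.29 7.73 8.01
```

The printed values agree with the language_avg., reasoning_avg., and overall columns for this row. That agreement is consistent with simple averaging but does not confirm it is the official scoring method.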