NaturalCodeBench
Metrics
The leaderboard reports the following fields for each model:
- humanevalscore: score on the HumanEval benchmark, shown for reference
- llm_model: name of the evaluated model
- model_url: link to the model
- ncb total score: overall NaturalCodeBench score across all splits
- ncb(en)-java, ncb(en)-python, ncb(en)-total: scores on the English split, per language and combined
- ncb(zh)-java, ncb(zh)-python, ncb(zh)-total: scores on the Chinese split, per language and combined
- organization: organization that released the model
- parameters: model size (parameter count)
- release_date: model release date
- updated_time: date the leaderboard entry was last updated
Results
Performance results of various models on this benchmark.
Comparison Table
| Model Name | humanevalscore | llm_model | model_url | ncb total score | ncb(en)-java | ncb(en)-python | ncb(en)-total | ncb(zh)-java | ncb(zh)-python | ncb(zh)-total | organization | parameters | release_date | updated_time |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Model 1 | 80.5 | GPT-4 | https://github.com/topics/gpt-4 | 52.8 | 51.1 | 55.7 | 53.4 | 51.1 | 53.4 | 52.3 | OpenAI | N/A | 2023.3.14 | 2024.8.11 |
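The aggregate columns in the row above are consistent with simple averaging: each per-split total looks like the mean of its Java and Python scores, and the overall NCB score looks like the mean of the two split totals. The sketch below checks that reading against the GPT-4 row; the unweighted-mean aggregation is an assumption inferred from the numbers, not a documented formula, and the leaderboard may round or weight differently.

```python
def ncb_totals(en_java, en_python, zh_java, zh_python):
    """Recompute the aggregate NCB columns under an assumed
    unweighted-mean aggregation (not an official formula)."""
    en_total = (en_java + en_python) / 2   # ncb(en)-total
    zh_total = (zh_java + zh_python) / 2   # ncb(zh)-total
    overall = (en_total + zh_total) / 2    # ncb total score
    return en_total, zh_total, overall

# GPT-4 row from the table: en-java 51.1, en-python 55.7, zh-java 51.1, zh-python 53.4
en_total, zh_total, overall = ncb_totals(51.1, 55.7, 51.1, 53.4)
print(en_total, zh_total, overall)
```

The recomputed values (53.4, 52.25, 52.825) match the tabulated 53.4 / 52.3 / 52.8 to within rounding, which supports but does not prove the unweighted-mean reading.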