Code Generation
Benchmark List
All benchmarks related to this task
android-repos
Best model: Entity Type Model
apps
Best model: MapCoder APPS-150-cherrypicked (GPT-4)
bigcodebench-instruct
Best model: GPT-4o-2024-05-13
codecontests
Best model: MapCoder (GPT-4)
codexglue-codesearchnet
Best model: Redcoder-ext
conala
Best model: MarianCG
conala-ext
Best model: BART W/ Mined
django
Best model: MarianCG
floco
Best model: FloCo-T5
humaneval
Best model: AgentCoder (GPT-4)
livecodebench
Best model: LPW (GPT-4o)
pecc
Best model: Claude 3 Haiku
res-q
Best model: QurrentOS-coder + Claude 3.5 Sonnet
shellcode-ia32
Best model: CodeBERT
taco-topics-in-algorithmic-code-generation
Best model: GPT-4
turbulence
Best model: GPT-4
verilogeval
Best model: Nexus (Claude 3.5 Sonnet)
webapp1k-react
Best model: o1-preview
wikisql
Best model: NL2SQL-RULE
bigcodebench-complete
concode
dseval-leetcode
mbpp
multi-source-python-code-corpus
verified-smart-contract-code-comments
webapp1k-duo-react