HyperAI

Code Generation on APPS

Metrics

Competition Pass@1
Interview Pass@1
Introductory Pass@1
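Pass@1 is the probability that a single generated sample solves the problem. In practice it is usually estimated with the unbiased pass@k estimator introduced in "Evaluating Large Language Models Trained on Code" (one of the entries below): generate n samples per problem, count the c that pass all unit tests, and compute the chance that at least one of k drawn samples is correct. A minimal sketch (function name and signature are illustrative, not from this page):

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples is correct, given c of n generated samples pass all tests.
    Computes 1 - C(n-c, k) / C(n, k) in a numerically stable product form."""
    if n - c < k:
        # Fewer than k failing samples exist, so any k-subset contains a success.
        return 1.0
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))

# Example: 5 of 10 samples correct, k=1 -> pass@1 = 0.5
print(pass_at_k(10, 5, 1))
```

The benchmark's Pass@1 numbers are this estimate with k=1, averaged over all problems in the split.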

Results

Performance of various models on the APPS benchmark, reported as Pass@1 on each of its three difficulty splits (Introductory, Interview, Competition).

Comparison Table

| Model | Competition Pass@1 (%) | Interview Pass@1 (%) | Introductory Pass@1 (%) |
|---|---|---|---|
| motcoder-elevating-large-language-models-with | 21.18 | 32.63 | 54.26 |
| deepseek-coder-when-the-large-language-model | 11.09 | 19.70 | 33.80 |
| motcoder-elevating-large-language-models-with | 27.84 | 44.49 | 68.44 |
| codet-code-generation-with-generated-tests | - | - | 31.92 |
| measuring-coding-challenge-competence-with | 0.00 | 0.57 | 3.90 |
| codechain-towards-modular-code-generation | 2.5 | 6.4 | 29.3 |
| coderl-mastering-code-generation-through | 33.3 | 13.5 | 20 |
| coderl-mastering-code-generation-through | 0.69 | 1.80 | 6.77 |
| competition-level-code-generation-with-1 | - | - | - |
| codechain-towards-modular-code-generation | 3.75 | 7.49 | 26.29 |
| planning-driven-programming-a-large-language | 34.8 | 65.2 | 87.2 |
| codesim-multi-agent-code-generation-and | 10.8 | 14.21 | 26.04 |
| evaluating-large-language-models-trained-on | 0.50 | 1.00 | 5.60 |
| coderl-mastering-code-generation-through | 0.02 | 0.14 | 4.14 |
| codet-code-generation-with-generated-tests | 6.2 | 14.3 | 47.3 |
| mapcoder-multi-agent-code-generation-for | 0.00 | 0.70 | 1.30 |
| coderl-mastering-code-generation-through | 0.00 | 0.57 | 3.90 |