deepseek-ai/deepseek-coder-6.7b-instruct | 11.09% | 19.70% | 33.80% | DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | |
code-davinci-002 175B | - | - | 31.92% | CodeT: Code Generation with Generated Tests | |
GPT-Neo 2.7B | 0.00% | 0.57% | 3.90% | Measuring Coding Challenge Competence With APPS | |
AlphaCode 1B Filtered from 50000 | - | - | - | Competition-Level Code Generation with AlphaCode | |
Codex 12B (Raw) | 0.50% | 1.00% | 5.60% | Evaluating Large Language Models Trained on Code | |
code-davinci-002 175B (CodeT) | 6.2% | 14.3% | 47.3% | CodeT: Code Generation with Generated Tests | |
MapCoder APPS-150-cherrypicked (GPT-4) | 0.00% | 0.70% | 1.30% | MapCoder: Multi-Agent Code Generation for Competitive Problem Solving | |