deepseek-ai/deepseek-coder-6.7b-instruct | 11.09% | 19.70% | 33.80% | DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | - |
code-davinci-002 175B | - | - | 31.92% | CodeT: Code Generation with Generated Tests | - |
GPT-Neo 2.7B | 0.00% | 0.57% | 3.90% | Measuring Coding Challenge Competence With APPS | - |
AlphaCode 1B Filtered from 50000 | - | - | - | Competition-Level Code Generation with AlphaCode | - |
Codex 12B (Raw) | 0.50% | 1.00% | 5.60% | Evaluating Large Language Models Trained on Code | - |
code-davinci-002 175B (CodeT) | 6.2% | 14.3% | 47.3% | CodeT: Code Generation with Generated Tests | - |
MapCoder APPS-150-cherrypicked (GPT-4) | 0.00% | 0.70% | 1.30% | MapCoder: Multi-Agent Code Generation for Competitive Problem Solving | - |