Code Generation on MBPP
Metrics
Accuracy
Results
Performance of various models on this benchmark
Comparison table
Model name | Accuracy |
---|---|
llama-open-and-efficient-foundation-language-1 | 30.2 |
code-llama-open-foundation-models-for-code | 49.4 |
code-llama-open-foundation-models-for-code | 41.4 |
language-agent-tree-search-unifies-reasoning | 81.1 |
teaching-large-language-models-to-self-debug | 53.2 |
codet-code-generation-with-generated-tests | 55.4 |
planning-driven-programming-a-large-language | 84.8 |
coder-reviewer-reranking-for-code-generation | 46.2 |
llama-2-open-foundation-and-fine-tuned-chat | 33 |
intervenor-prompt-the-coding-ability-of-large | 39.8 |
mapcoder-multi-agent-code-generation-for | 83.1 |
chatgpt-for-software-security-exploring-the | 87.5 |
codet-code-generation-with-generated-tests | 67.7 |
agentcoder-multi-agent-based-code-generation | 89.9 |
chatgpt-for-software-security-exploring-the | 71.4 |
code-llama-open-foundation-models-for-code | 44.4 |
language-agent-tree-search-unifies-reasoning | 82.3 |
palm-scaling-language-modeling-with-pathways-1 | 47 |
llama-2-open-foundation-and-fine-tuned-chat | 20.8 |
code-llama-open-foundation-models-for-code | 65.5 |
codesim-multi-agent-code-generation-and-1 | 90.7 |
llama-open-and-efficient-foundation-language-1 | 22 |
lever-learning-to-verify-language-to-code | 68.9 |
starcoder-may-the-source-be-with-you | 35 |
intervenor-prompt-the-coding-ability-of-large | 45.4 |
qualityflow-an-agentic-workflow-for-program | 94.2 |
when-llm-based-code-generation-meets-the | 83.8±0.6 |
the-claude-3-model-family-opus-sonnet-haiku | 80.4 |
branch-train-mix-mixing-expert-llms-into-a | 39.4 |
coder-reviewer-reranking-for-code-generation | 48.3 |
coder-reviewer-reranking-for-code-generation | 66.4 |
natural-language-to-code-translation-with | 58.2 |
teaching-large-language-models-to-self-debug | 70.8 |
coder-reviewer-reranking-for-code-generation | 26.1 |
code-llama-open-foundation-models-for-code | 52.2 |
palm-scaling-language-modeling-with-pathways-1 | 36.8 |
code-llama-open-foundation-models-for-code | 56.2 |
coder-reviewer-reranking-for-code-generation | 66.9 |
codegeex-a-pre-trained-model-for-code | 24.4 |
Model 40 | 90.0 |
deepseek-coder-when-the-large-language-model | 66 |
mistral-7b | 47.5 |
code-llama-open-foundation-models-for-code | 62.4 |
coder-reviewer-reranking-for-code-generation | 44.1 |
chatgpt-for-software-security-exploring-the | 76.2 |
palm-2-technical-report-1 | 50 |
agentcoder-multi-agent-based-code-generation | 91.8 |
code-llama-open-foundation-models-for-code | 57 |
aflow-automating-agentic-workflow-generation | 83.4 |
llama-open-and-efficient-foundation-language-1 | 37.7 |
chatgpt-for-software-security-exploring-the | 82 |
deepseek-coder-when-the-large-language-model | 60.6 |
deepseek-coder-when-the-large-language-model | 70.8 |
code-llama-open-foundation-models-for-code | 49 |
deepseek-coder-when-the-large-language-model | 70 |
the-claude-3-model-family-opus-sonnet-haiku | 79.4 |
code-llama-open-foundation-models-for-code | 62.2 |
incoder-a-generative-model-for-code-infilling | 19.4 |
codet-code-generation-with-generated-tests | 49.5 |
from-code-to-correctness-closing-the-last | 80.8 |
starcoder-may-the-source-be-with-you | 52.7 |
deepseek-coder-when-the-large-language-model | 65.4 |
parameter-efficient-sparsity-crafting-from | 41.4 |
coder-reviewer-reranking-for-code-generation | 47.3 |
mixtral-of-experts | 60.7 |
llama-2-open-foundation-and-fine-tuned-chat | 45 |
the-claude-3-model-family-opus-sonnet-haiku | 86.4 |
codet-code-generation-with-generated-tests | 61.9 |
teaching-large-language-models-to-self-debug | 61.4 |
code-llama-open-foundation-models-for-code | 47.6 |
code-llama-open-foundation-models-for-code | 61.2 |
llama-2-open-foundation-and-fine-tuned-chat | 30.6 |
parameter-efficient-sparsity-crafting-from | 48.6 |
chatgpt-for-software-security-exploring-the | 83.2 |
branch-train-mix-mixing-expert-llms-into-a | 42.6 |
textbooks-are-all-you-need-ii-phi-1-5 | 43.5 |
wizardcoder-empowering-code-large-language | 51.8 |
coder-reviewer-reranking-for-code-generation | 63 |
mapcoder-multi-agent-code-generation-for | 89.7 |
code-llama-open-foundation-models-for-code | 47 |
teaching-large-language-models-to-self-debug | 72.8 |
deepseek-coder-when-the-large-language-model | 46.2 |
mapcoder-multi-agent-code-generation-for | 93.2 |
starcoder-2-and-the-stack-v2-the-next | 66.2 |
teaching-large-language-models-to-self-debug | 67.6 |
starcoder-may-the-source-be-with-you | 49 |
code-llama-open-foundation-models-for-code | 55 |
llama-open-and-efficient-foundation-language-1 | 17.7 |
coder-reviewer-reranking-for-code-generation | 26.7 |
codet-code-generation-with-generated-tests | 34.4 |
intervenor-prompt-the-coding-ability-of-large | 69.8 |
deepseek-coder-when-the-large-language-model | 49.4 |
teaching-large-language-models-to-self-debug | 80.2 |
teaching-large-language-models-to-self-debug | 47.2 |
deepseek-coder-when-the-large-language-model | 80 |
coder-reviewer-reranking-for-code-generation | 24.4 |
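
Many entries in the table share the same name, because one source can report several configurations, so a per-entry summary can be easier to read than the raw list. The following is a minimal sketch, assuming the rows are available as pipe-delimited text in the layout shown above. The helper names `parse_rows` and `best_per_entry` are hypothetical and not part of any benchmark tooling, and dropping the `±` suffix is a simplification made only for ranking.

```python
def parse_rows(text):
    """Parse 'name | score |' lines into (name, score) pairs.

    Assumption: the header and separator rows are not numeric data
    and can be skipped.
    """
    rows = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Skip malformed lines and the '---|---|' separator row.
        if len(cells) < 2 or not cells[0] or set(cells[0]) <= {"-"}:
            continue
        # Keep only the mean of values such as '83.8±0.6'.
        value = cells[1].split("±")[0].strip()
        try:
            rows.append((cells[0], float(value)))
        except ValueError:
            continue  # header row or other non-numeric cell
    return rows


def best_per_entry(rows):
    """Keep the highest score per name, sorted from best to worst."""
    best = {}
    for name, score in rows:
        if name not in best or score > best[name]:
            best[name] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


# A few rows copied from the table above, used as sample input.
sample = """
Model name | Accuracy |
---|---|
qualityflow-an-agentic-workflow-for-program | 94.2 |
agentcoder-multi-agent-based-code-generation | 89.9 |
agentcoder-multi-agent-based-code-generation | 91.8 |
when-llm-based-code-generation-meets-the | 83.8±0.6 |
"""

ranking = best_per_entry(parse_rows(sample))
for name, score in ranking:
    print(f"{score:5.1f}  {name}")
```

Taking the maximum per name is only one possible choice; it hides the spread between configurations that the full table preserves.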