HyperAI

Code Generation on HumanEval

Metrics

Pass@1
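
The page does not say how each submission computed its score. As a minimal sketch, assuming the standard unbiased pass@k estimator commonly used with HumanEval (n samples per problem, c of them passing all unit tests), Pass@1 could be computed as below. The function names `pass_at_k` and `benchmark_pass_at_k` are illustrative and do not come from this page.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Uses 1 - C(n - c, k) / C(n, k): the probability that at least one of
    k samples drawn without replacement from the n is correct.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over all problems, returned as a percentage.

    `results` holds one (n, c) pair per benchmark problem.
    """
    if not results:
        raise ValueError("results must not be empty")
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)
```

With one sample per problem (n = 1), Pass@1 reduces to the percentage of problems solved. HumanEval contains 164 problems, so solving 162 of them would give `benchmark_pass_at_k([(1, 1)] * 162 + [(1, 0)] * 2)`, about 98.8.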

Results

Performance results of different models on this benchmark

Comparison table

| Model name | Pass@1 |
| --- | --- |
| from-code-to-correctness-closing-the-last | 96.3 |
| ldb-a-large-language-model-debugger-via | 98.2 |
| hierarchical-prompting-taxonomy-a-universal | 100 |
| hierarchical-prompting-taxonomy-a-universal | 100 |
| aflow-automating-agentic-workflow-generation | 94.7 |
| codesim-multi-agent-code-generation-and-1 | 97.6 |
| Model 7 | 92.0 |
| codesim-multi-agent-code-generation-and-1 | 98.8 |
| l2mac-large-language-model-automatic-computer | 90.2 |
| agentcoder-multi-agent-based-code-generation | 96.3 |
| codesim-multi-agent-code-generation-and-1 | 95.1 |
| mapcoder-multi-agent-code-generation-for | 93.9 |
| octopack-instruction-tuning-code-large | 86.6 |
| Model 14 | 91.65 |
| ldb-a-large-language-model-debugger-via | 99.4 |
| qualityflow-an-agentic-workflow-for-program | 98.8 |
| nexus-a-lightweight-and-scalable-multi-agent | 98.8 |
| planning-driven-programming-a-large-language | 98.2 |
| Model 19 | 85.97 |
| claude-3-5-sonnet-model-card-addendum | 90.2 |
| metagpt-meta-programming-for-multi-agent | 85.9 |