HyperAI超神経

DebugBench

Evaluation Metrics

llm_model
logic_condition error
logic_operation error
logic_other error
logic_variable error
model_url
multiple_double bugs
multiple_quadraple bugs
multiple_triple bugs
organization
parameters
reference_faulty indexing
referenceillegal keywords
referenceundefined methods
referenceundefined objects
release_date
syntax_illegal comment
syntax_illegal indentation
syntax_illegal separation
syntax_missing colons
syntax_misused ==/=
syntax_unclosed parentheses
syntax_unclosed string
updated_time
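
The field names above appear to follow a category-prefix convention (logic, multiple, reference, syntax), with the remaining fields describing the model rather than a score. As a minimal sketch, assuming that reading of the names is right, the fields can be grouped by prefix as follows. The `CATEGORIES` tuple, the `metadata` bucket, and the `group_fields` helper are hypothetical names introduced for illustration and are not part of the page.

```python
# Field names copied verbatim from the list above (including the entries
# that have no underscore after "reference").
FIELDS = [
    "llm_model",
    "logic_condition error", "logic_operation error",
    "logic_other error", "logic_variable error",
    "model_url",
    "multiple_double bugs", "multiple_quadraple bugs", "multiple_triple bugs",
    "organization", "parameters",
    "reference_faulty indexing", "referenceillegal keywords",
    "referenceundefined methods", "referenceundefined objects",
    "release_date",
    "syntax_illegal comment", "syntax_illegal indentation",
    "syntax_illegal separation", "syntax_missing colons",
    "syntax_misused ==/=", "syntax_unclosed parentheses",
    "syntax_unclosed string",
    "updated_time",
]

# Assumption: these four prefixes mark bug categories; anything else is
# treated as model metadata.
CATEGORIES = ("logic", "multiple", "reference", "syntax")


def group_fields(fields):
    """Bucket field names by category prefix; unmatched names go to 'metadata'."""
    groups = {category: [] for category in CATEGORIES}
    groups["metadata"] = []
    for name in fields:
        for category in CATEGORIES:
            # Prefix match rather than splitting on "_", because some
            # reference fields have no underscore after the prefix.
            if name.startswith(category):
                groups[category].append(name)
                break
        else:
            groups["metadata"].append(name)
    return groups


GROUPS = group_fields(FIELDS)
for category, names in GROUPS.items():
    print(f"{category}: {len(names)} fields")
```

Matching on the prefix instead of splitting on the underscore keeps fields such as `referenceillegal keywords` in the reference group.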

Evaluation Results

Performance results of each model on this benchmark

Comparison Table
Model name: Model 1
llm_model: CodeLlama-7b-Instruct
logic_condition error: 13.5
logic_operation error: 8.3
logic_other error: 8
logic_variable error: 10
model_url: https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf
multiple_double bugs: 3.3
multiple_quadraple bugs: 5
multiple_triple bugs: 6.7
organization: Meta
parameters: 7B
reference_faulty indexing: 27.2
referenceillegal keywords: 58.1
referenceundefined methods: 15
referenceundefined objects: 21.9
release_date: 2023.8.25
syntax_illegal comment: 31.5
syntax_illegal indentation: 4.4
syntax_illegal separation: 7.4
syntax_missing colons: 23.3
syntax_misused ==/=: 18.2
syntax_unclosed parentheses: 27.1
syntax_unclosed string: 28.8
updated_time: 2024.8.11