HyperAI

Code Generation on RES-Q

Metrics

pass@1
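
The page names pass@1 without defining it. As a rough illustration, the sketch below uses the widely used unbiased pass@k estimator, which for k = 1 reduces to the fraction of generated samples that pass a task's tests, and averages it over tasks as a percentage. It is an assumption that the scores on this page were computed this way; the function names `pass_at_k` and `benchmark_pass_at_1` are hypothetical and chosen only for this example.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one task.

    n: number of samples generated for the task
    c: number of those samples that passed the tests
    k: number of samples the metric is allowed to draw
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that at least one of k samples drawn from the n is correct.
    # math.comb returns 0 when k > n - c, so the estimate is then 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: list[tuple[int, int]]) -> float:
    """Mean pass@1 over all tasks, as a percentage.

    results: one (n_samples, n_correct) pair per task (hypothetical input shape).
    """
    if not results:
        raise ValueError("results must contain at least one task")
    return 100.0 * sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
```

With a single attempt per task, this is simply the percentage of tasks solved on the first try: for example, 29 passes out of 50 hypothetical tasks gives 58.0.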

Results

Performance results of various models on this benchmark

Comparison table
Model name                                      pass@1
res-q-evaluating-code-editing-large-language    30.0
res-q-evaluating-code-editing-large-language    58.0
res-q-evaluating-code-editing-large-language    20.0
res-q-evaluating-code-editing-large-language    18.0
res-q-evaluating-code-editing-large-language    30.0
res-q-evaluating-code-editing-large-language    36.0
res-q-evaluating-code-editing-large-language    46.0
res-q-evaluating-code-editing-large-language    29.0
res-q-evaluating-code-editing-large-language    37.0