HyperAI

Arithmetic Reasoning On Game Of 24

Metrics

Success
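
The page does not say how Success is computed. As a rough sketch, assuming a Game of 24 answer counts as a success when it uses each of the four given numbers exactly once with only +, -, *, / and parentheses, and evaluates to 24, a checker could look like the following. The names `is_success` and `success_rate` are hypothetical and not taken from the benchmark; exact rational arithmetic is a design choice here to avoid floating-point error in division.

```python
import ast
import re
from collections import Counter
from fractions import Fraction


def _evaluate(node):
    """Evaluate a parsed arithmetic expression exactly, using Fractions."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return Fraction(node.value)
    if isinstance(node, ast.BinOp):
        left, right = _evaluate(node.left), _evaluate(node.right)
        if isinstance(node.op, ast.Add):
            return left + right
        if isinstance(node.op, ast.Sub):
            return left - right
        if isinstance(node.op, ast.Mult):
            return left * right
        if isinstance(node.op, ast.Div):
            return left / right  # raises ZeroDivisionError on division by zero
    # Anything else (unary minus, **, //, names, calls) is rejected.
    raise ValueError("unsupported expression")


def is_success(numbers, expression, target=24):
    """Assumed criterion: each input number used exactly once, result equals target."""
    expression = expression.strip()
    # Allow only digits, whitespace, parentheses and the four basic operators.
    if not re.fullmatch(r"[\d\s()+\-*/]+", expression):
        return False
    # The multiset of integers in the expression must match the puzzle input.
    used = [int(tok) for tok in re.findall(r"\d+", expression)]
    if Counter(used) != Counter(numbers):
        return False
    try:
        tree = ast.parse(expression, mode="eval")
        return _evaluate(tree.body) == target
    except (SyntaxError, ValueError, ZeroDivisionError):
        return False


def success_rate(attempts):
    """Fraction of (numbers, expression) attempts that pass is_success."""
    if not attempts:
        return 0.0
    return sum(is_success(n, e) for n, e in attempts) / len(attempts)
```

For example, `is_success([4, 9, 10, 13], "(10 - 4) * (13 - 9)")` accepts the answer, while `is_success([1, 1, 1, 1], "1 + 1 + 1 + 1")` rejects it because the result is 4.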

Results

Performance results of various models on this benchmark

Comparison Table
Model Name | Success
tree-of-thoughts-deliberate-problem-solving-1 | 0.74