HyperAI

Trajectory Planning on ToolBench

Metrics

Win rate

Results

Performance results of various models on this benchmark

Comparison Table
Model Name                                       Win rate
fortify-the-shortest-stave-in-attention          71.5
swissnyf-tool-grounded-llm-agents-for-black      86.54
toolllm-facilitating-large-language-models-to    70.4