HyperAI超神经

Trajectory Planning on ToolBench

Evaluation Metrics

Win rate

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model name                                     Win rate
fortify-the-shortest-stave-in-attention        71.5
swissnyf-tool-grounded-llm-agents-for-black    86.54
toolllm-facilitating-large-language-models-to  70.4