Date

a year ago

Size

5.21 MB

Organization

Publish URL

Paper URL

arxiv.org

Tags

ComplexFuncBench (Complex Function Calling Benchmark) is a benchmark dataset for evaluating how well large language models (LLMs) handle complex function calling scenarios. Researchers from Zhipu AI and Tsinghua University released it in 2025 to fill gaps in existing benchmarks around multi-step and constrained function calling. The accompanying paper is "ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario".

The dataset contains 1,000 complex function calling samples across 5 real-world domains: 600 single-domain samples (150 each for hotels, flights, car rentals, and attractions) and 400 cross-domain samples. The taxi domain has only 2 functions, so it appears only in cross-domain samples. Compared with existing benchmarks, ComplexFuncBench covers multi-step and constrained function calling, and it requires long parameter filling, parameter value reasoning, and a 128k long context.
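The composition described above can be checked with a little arithmetic. The following is a minimal Python sketch that uses only the counts stated in this description; it does not read the dataset files, and the names `SINGLE_DOMAIN_COUNTS`, `CROSS_DOMAIN_COUNT`, and `composition_summary` are hypothetical, introduced here purely for illustration.

```python
# Sample counts as stated in the description (assumed, not read from the dataset files).
SINGLE_DOMAIN_COUNTS = {
    "hotels": 150,
    "flights": 150,
    "car rentals": 150,
    "attractions": 150,
}
# Taxi has no single-domain samples; it appears only in cross-domain samples.
CROSS_DOMAIN_COUNT = 400


def composition_summary(single_counts, cross_count):
    """Return totals and the cross-domain share for the given sample counts."""
    single_total = sum(single_counts.values())
    total = single_total + cross_count
    return {
        "single_domain": single_total,
        "cross_domain": cross_count,
        "total": total,
        "cross_domain_share": cross_count / total,
    }


summary = composition_summary(SINGLE_DOMAIN_COUNTS, CROSS_DOMAIN_COUNT)
print(summary)
```

The four single-domain groups sum to the stated 600 samples, and adding the 400 cross-domain samples gives the stated total of 1,000.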

ComplexFuncBench/
- README.md
  1.6 KB
- README.txt
  3.2 KB
