5 months ago

Liang Hu, Jianpeng Jiao, Jiashuo Liu, Yanle Ren, Zhoufutu Wen, Kaiyuan Zhang, Xuanliang Zhang, Xiang Gao, Tianci He, Fei Hu, and 13 more

Abstract

Search has emerged as core infrastructure for LLM-based agents and is widely viewed as critical on the path toward more general intelligence. Finance is a particularly demanding proving ground: analysts routinely conduct complex, multi-step searches over time-sensitive, domain-specific data, making it ideal for assessing both search proficiency and knowledge-grounded reasoning. Yet no existing open financial dataset evaluates the data-searching capability of end-to-end agents, largely because constructing realistic, complicated tasks requires deep financial expertise and because time-sensitive data is hard to evaluate. We present FinSearchComp, the first fully open-source agent benchmark for realistic, open-domain financial search and reasoning. FinSearchComp comprises three tasks (Time-Sensitive Data Fetching, Simple Historical Lookup, and Complex Historical Investigation) that closely reproduce real-world financial analyst workflows. To ensure difficulty and reliability, we engage 70 professional financial experts for annotation and implement a rigorous multi-stage quality-assurance pipeline. The benchmark includes 635 questions spanning global and Greater China markets, and we evaluate 21 models (products) on it. Grok 4 (web) tops the global subset, approaching expert-level accuracy. DouBao (web) leads on the Greater China subset. Experimental analyses show that equipping agents with web search and financial plugins substantially improves results on FinSearchComp, and that the country of origin of models and tools significantly affects performance. By aligning with realistic analyst tasks and providing end-to-end evaluation, FinSearchComp offers a professional, high-difficulty testbed for complex financial search and reasoning.

FinSearchComp: Towards a Realistic, Expert-Level Evaluation of Financial Search and Reasoning
