2 months ago

ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization

Abstract

Large Language Model (LLM)-based web agents demonstrate strong performance on knowledge-intensive tasks but are hindered by context window limitations in paradigms like ReAct. Complex queries involving multiple entities, intertwined relationships, and high uncertainty demand extensive search cycles that rapidly exhaust context budgets before reaching complete solutions. To overcome this challenge, we introduce ReSum, a novel paradigm that enables indefinite exploration through periodic context summarization. ReSum converts growing interaction histories into compact reasoning states, maintaining awareness of prior discoveries while bypassing context constraints. For paradigm adaptation, we propose ReSum-GRPO, integrating GRPO with segmented trajectory training and advantage broadcasting to familiarize agents with summary-conditioned reasoning. Extensive experiments on web agents of varying scales across three benchmarks demonstrate that ReSum delivers an average absolute improvement of 4.5% over ReAct, with further gains of up to 8.2% following ReSum-GRPO training. Notably, with only 1K training samples, our WebResummer-30B (a ReSum-GRPO-trained version of WebSailor-30B) achieves 33.3% Pass@1 on BrowseComp-zh and 18.3% on BrowseComp-en, surpassing existing open-source web agents.
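
The two ideas in the abstract, periodic context summarization and advantage broadcasting over segmented trajectories, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`resum_loop`, `broadcast_advantages`), the whitespace token count, the budget trigger, and the callables `step`, `summarize`, and `is_final` are all assumptions made for illustration; the abstract does not specify when summarization is triggered or how advantages are normalized.

```python
from statistics import mean, pstdev
from typing import Callable, List, Optional, Tuple


def count_tokens(text: str) -> int:
    # Assumption: crude whitespace count; a real agent would use the model tokenizer.
    return len(text.split())


def resum_loop(
    question: str,
    step: Callable[[List[str]], str],       # hypothetical: one reason/search turn
    summarize: Callable[[List[str]], str],  # hypothetical: history -> compact state
    is_final: Callable[[str], bool],        # hypothetical: detects a final answer
    budget: int = 50,
    max_steps: int = 20,
) -> Tuple[Optional[str], int]:
    """ReAct-style loop that compresses the history once it exceeds `budget` tokens."""
    history = [question]
    summaries = 0
    for _ in range(max_steps):
        output = step(history)
        history.append(output)
        if is_final(output):
            return output, summaries
        if sum(count_tokens(h) for h in history) > budget:
            # Restart from the question plus a compact summary of what was found.
            history = [question, summarize(history)]
            summaries += 1
    return None, summaries


def broadcast_advantages(
    rewards: List[float], segments_per_rollout: List[int], eps: float = 1e-6
) -> List[List[float]]:
    """Group-normalize one reward per rollout, then copy it to every segment."""
    mu, sigma = mean(rewards), pstdev(rewards)
    result = []
    for reward, n_segments in zip(rewards, segments_per_rollout):
        advantage = (reward - mu) / (sigma + eps)
        # Each summary-delimited segment of the rollout shares the same advantage.
        result.append([advantage] * n_segments)
    return result
```

In this sketch a rollout that was summarized twice would count as three segments, and all three would receive the same trajectory-level advantage during training.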
