
AI Paper Weekly Report | Latest Developments in AI Agents: PaperBanana/Lumine/Insight Agents... A Comprehensive Technical Overview

From "large models capable of dialogue" to "AI agents capable of autonomously completing complex tasks," artificial intelligence research is entering a new phase centered on planning, execution, and collaboration. As large language models gradually acquire the ability to invoke tools, retain long-term memory, and interact with their environment, the research focus is no longer limited to improving the performance of a single model. It has shifted to how multi-agent architectures and task-level division of labor can enable AI to continuously produce verifiable, reusable results in the real world.

Against this backdrop, agent technology is rapidly spreading into fields such as scientific research, software development, data analysis, and virtual environment interaction: from automatically generating high-quality academic illustrations and performing reinforcement learning without an explicit reward model, to carrying out long-horizon tasks in 3D open worlds and turning vague research ideas into complete scientific narratives. Academia and industry are intensively studying how to make models true executors rather than just generators.

This week, we recommend 5 popular AI papers on agents. The papers come from teams at Peking University, Google Cloud AI Research, AgentAlpha, Amazon, and others, and showcase representative advances in agent research, including framework design, cross-modal collaboration, self-feedback learning, and end-to-end task completion. Together they offer a clear view of how next-generation general-purpose agents are evolving. Let's learn together! ⬇️

In addition, to help more readers follow the latest academic developments in artificial intelligence, the HyperAI website (hyper.ai) has launched a "Latest Papers" section, which is updated daily with cutting-edge AI research papers.

Latest AI Papers: https://go.hyper.ai/hzChC

This week's paper recommendations

1. PaperBanana: Automating Academic Illustration for AI Scientists

Researchers from Peking University and Google Cloud AI Research have proposed PaperBanana, an agent-based framework for producing publication-quality academic illustrations. It coordinates specialized agents driven by a vision-language model (VLM) to handle reference retrieval, content planning, stylization, and iterative refinement automatically. On both methodology diagrams and statistical plots, it significantly outperforms baseline methods in fidelity, conciseness, readability, and aesthetics.
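The four stages named above can be read as a simple control flow. The following is only a minimal sketch of that flow, not the authors' implementation: every function name (`generate_illustration`, `toy_retrieve`, and so on) is hypothetical, and the "agents" are plain string-manipulating stand-ins rather than VLM calls.

```python
from typing import Callable, List, Optional, Tuple


def generate_illustration(
    method_text: str,
    retrieve: Callable[[str], List[str]],
    plan: Callable[[str, List[str]], str],
    stylize: Callable[[str], str],
    critique: Callable[[str], Optional[str]],
    refine: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Run retrieval -> planning -> stylization -> iterative refinement.

    Returns the final figure description and the number of refinement rounds used.
    """
    references = retrieve(method_text)      # look up similar reference figures
    draft = plan(method_text, references)   # decide content and layout
    draft = stylize(draft)                  # apply a visual style
    rounds = 0
    for _ in range(max_rounds):
        feedback = critique(draft)          # None means the critic is satisfied
        if feedback is None:
            break
        draft = refine(draft, feedback)
        rounds += 1
    return draft, rounds


# Toy stand-ins (assumptions for illustration) so the sketch runs end to end.
def toy_retrieve(text: str) -> List[str]:
    return ["reference: encoder-decoder diagram"]


def toy_plan(text: str, refs: List[str]) -> str:
    return f"boxes for '{text}' laid out like {len(refs)} reference(s)"


def toy_stylize(draft: str) -> str:
    return draft + " [pastel palette]"


def toy_critique(draft: str) -> Optional[str]:
    return None if "legend" in draft else "add a legend"


def toy_refine(draft: str, feedback: str) -> str:
    return draft + " + legend"


final_draft, rounds_used = generate_illustration(
    "two-stage retriever",
    toy_retrieve, toy_plan, toy_stylize, toy_critique, toy_refine,
)
```

Capping the loop with `max_rounds` is a design choice of this sketch; it keeps a critic that is never satisfied from running indefinitely.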

Paper and detailed interpretation: https://go.hyper.ai/skQUQ

Results showcase

The authors evaluated automated illustration generation on a benchmark built from methodology diagrams in NeurIPS 2025 papers. The benchmark covers a wide variety of aesthetically complex figures found in modern AI papers.

2. Reinforcement Learning via Self-Distillation

This paper proposes Self-Distillation Policy Optimization (SDPO). SDPO converts tokenized feedback into a dense learning signal without requiring an external teacher model or an explicit reward model. It treats the current model, conditioned on the feedback, as a self-teacher, and distills that teacher's feedback-informed next-token predictions back into the policy. In this way, SDPO makes full use of the model's ability to look back and identify its own errors in context. On scientific reasoning, tool use, and competitive programming tasks on LiveCodeBench v6, SDPO significantly outperforms strong RLVR baselines in both sample efficiency and final accuracy.
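To make the idea of a "dense learning signal" concrete, here is a minimal sketch in plain Python. It assumes the distillation objective is a per-token KL divergence from the feedback-conditioned distribution (teacher) to the unconditioned policy distribution (student). The choice of divergence, the function names, and the toy logits are assumptions for illustration; the paper's exact loss may differ.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def self_distillation_loss(
    student_logits: Sequence[Sequence[float]],
    teacher_logits: Sequence[Sequence[float]],
) -> Tuple[List[float], float]:
    """Return the per-token KL(teacher || student) and its mean.

    student_logits: policy logits at each position, without feedback in context.
    teacher_logits: logits from the same model at the same positions, but with
                    the feedback in context (the "self-teacher").
    """
    per_token = [
        kl_divergence(softmax(t), softmax(s))
        for s, t in zip(student_logits, teacher_logits)
    ]
    return per_token, sum(per_token) / len(per_token)


# Toy example with a 3-token vocabulary and 2 positions (made-up numbers).
student = [[2.0, 0.5, 0.1], [0.3, 0.3, 0.3]]
teacher = [[0.1, 2.5, 0.1], [0.3, 0.3, 0.3]]
per_token_loss, mean_loss = self_distillation_loss(student, teacher)
```

Every position receives its own loss value, in contrast to a single scalar reward for the whole response. Positions where the feedback changes the model's prediction contribute a positive term, and positions where it does not contribute zero.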

Paper and detailed interpretation: https://go.hyper.ai/oBMuM

Example of experimental comparison between RLVR and RLRF

3. Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds

This paper proposes Lumine, the first open recipe for developing generalist agents that can carry out hours-long complex tasks in real time in challenging 3D open-world environments. Lumine adopts a human-like interaction paradigm, unifying perception, reasoning, and action end to end in a single vision-language model. It processes raw pixel input at 5 Hz, generates precise keyboard and mouse actions at 30 Hz, and invokes explicit reasoning only when necessary.
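The two rates imply that each observed frame must account for 30 / 5 = 6 actions. The sketch below illustrates that timing only. It assumes actions are emitted in fixed chunks per frame and that a hypothetical `needs_reasoning` predicate decides when the reasoning step runs. It does not reproduce Lumine's actual model or scheduler.

```python
from typing import Callable, List, Tuple

PERCEPTION_HZ = 5   # frames observed per second (from the article)
ACTION_HZ = 30      # keyboard/mouse actions emitted per second (from the article)


def actions_per_frame(perception_hz: int = PERCEPTION_HZ, action_hz: int = ACTION_HZ) -> int:
    """Number of actions that each observed frame has to cover."""
    if perception_hz <= 0 or action_hz % perception_hz != 0:
        raise ValueError("action rate must be a positive multiple of the perception rate")
    return action_hz // perception_hz


def simulate(
    seconds: int,
    needs_reasoning: Callable[[int], bool],
) -> Tuple[List[Tuple[float, str]], int]:
    """Simulate the control loop and return (timestamped actions, reasoning calls)."""
    chunk = actions_per_frame()
    actions: List[Tuple[float, str]] = []
    reasoning_calls = 0
    for frame in range(seconds * PERCEPTION_HZ):
        if needs_reasoning(frame):          # reasoning is invoked only on demand
            reasoning_calls += 1
        for slot in range(chunk):
            t_ms = (frame * chunk + slot) * 1000 / ACTION_HZ
            actions.append((round(t_ms, 1), f"action_{frame}_{slot}"))
    return actions, reasoning_calls


# Two simulated seconds, with reasoning triggered on every fifth frame (an arbitrary choice).
timeline, reasoning_count = simulate(2, lambda frame: frame % 5 == 0)
```

In this toy run the agent observes 10 frames, emits 60 actions, and runs the reasoning step only twice, which is the kind of saving that on-demand reasoning is meant to provide.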

Paper and detailed interpretation: https://go.hyper.ai/aUakj

Results showcase

Experimental results show that Lumine adapts well to different world settings and interaction mechanics, marking an important step toward generalist agents in open-ended environments.

Example of Lumine performance comparison experiment results

4. Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives

The AgentAlpha team proposed Idea2Story, a pre-computation framework that builds a methodological knowledge graph from peer-reviewed papers and uses it to turn vague research ideas into structured, reusable research patterns. This eases the context-window limits and hallucinations of large language models, and enables efficient, novel scientific discovery without reprocessing the literature at runtime.
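The split between offline graph construction and cheap runtime lookup can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not the Idea2Story pipeline: it treats the "knowledge graph" as a co-occurrence graph over method names, and the data, `build_method_graph`, and `suggest_patterns` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple


def build_method_graph(papers: Dict[str, Iterable[str]]) -> Dict[str, Dict[str, int]]:
    """Offline step: link methods that appear together in the same paper.

    Edge weights count how many papers combine the two methods.
    """
    graph: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for methods in papers.values():
        for a, b in combinations(sorted(set(methods)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def suggest_patterns(
    graph: Dict[str, Dict[str, int]],
    idea_terms: Set[str],
    top_k: int = 3,
) -> List[Tuple[str, int]]:
    """Runtime step: rank methods that co-occur with the terms in a vague idea."""
    scores: Dict[str, int] = defaultdict(int)
    for term in idea_terms:
        for neighbor, weight in graph.get(term, {}).items():
            if neighbor not in idea_terms:
                scores[neighbor] += weight
    # Sort by descending weight, then alphabetically for a deterministic order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Made-up corpus: paper id -> methods extracted from it.
toy_papers = {
    "paper_a": ["contrastive learning", "graph neural network"],
    "paper_b": ["contrastive learning", "data augmentation"],
    "paper_c": ["contrastive learning", "data augmentation", "curriculum"],
}
toy_graph = build_method_graph(toy_papers)
suggestions = suggest_patterns(toy_graph, {"contrastive learning"}, top_k=2)
```

Because the graph is built once ahead of time, the runtime query touches only the neighbors of the idea's terms and never re-reads the papers themselves.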

Paper and detailed interpretation: https://go.hyper.ai/KyWe0

Idea2Story Framework Example

Idea2Story is built on a dataset of peer-reviewed papers and their reviews. The system uses these paper-review pairs to learn how research contributions are described and evaluated, and supports the retrieval and combination of reusable methodological patterns rather than domain-specific content.

5. Insight Agents: An LLM-Based Multi-Agent System for Data Insights

Amazon researchers have proposed Insight Agents (IA), a multi-agent system based on large language models. It adopts a "plan-and-execute" architecture with hierarchical agents and an out-of-domain (OOD) aware routing mechanism, enabling Amazon sellers in the US to obtain accurate business insights within 15 seconds, with 90% accuracy as judged by human evaluators.
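A routing layer of this kind can be sketched as two checks: first reject out-of-domain questions, then pick a sub-agent to execute the request. The code below is a minimal illustration under strong assumptions. The OOD score is a simple unknown-word ratio, and the agent names, keyword sets, and threshold are hypothetical. The system described in the paper uses learned models for these steps.

```python
from typing import Callable, Dict, List, Set, Tuple

Agent = Tuple[Set[str], Callable[[str], str]]  # (trigger keywords, executor)


def tokenize(query: str) -> List[str]:
    """Lowercase the query and strip basic punctuation from each word."""
    return [t.strip("?.,!").lower() for t in query.split() if t.strip("?.,!")]


def ood_score(query: str, vocabulary: Set[str]) -> float:
    """Fraction of query words that are not in the in-domain vocabulary."""
    tokens = tokenize(query)
    if not tokens:
        return 1.0
    return sum(t not in vocabulary for t in tokens) / len(tokens)


def answer(
    query: str,
    vocabulary: Set[str],
    agents: Dict[str, Agent],
    ood_threshold: float = 0.7,
) -> Tuple[str, str]:
    """Reject OOD queries, otherwise route to the best-matching agent and run it."""
    if ood_score(query, vocabulary) > ood_threshold:
        return "rejected", "This question is outside the supported scope."
    tokens = set(tokenize(query))
    # Plan: choose the agent whose keywords overlap most with the query.
    chosen = max(agents, key=lambda name: len(agents[name][0] & tokens))
    # Execute: delegate the query to the chosen agent.
    return chosen, agents[chosen][1](query)


# Hypothetical vocabulary and agents for illustration only.
VOCAB = {"show", "my", "sales", "last", "week", "why", "did", "revenue", "drop"}
AGENTS: Dict[str, Agent] = {
    "data_presenter": ({"show", "list"}, lambda q: "table of requested metrics"),
    "insight_generator": ({"why", "drop", "trend"}, lambda q: "explanation of the change"),
}

route_a = answer("Show my sales last week", VOCAB, AGENTS)
route_b = answer("Why did revenue drop?", VOCAB, AGENTS)
route_c = answer("What is the capital of France?", VOCAB, AGENTS)
```

Running the OOD check before any agent is invoked means unsupported questions are turned away cheaply, which is one plausible way to stay within a tight latency budget.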

Paper and detailed interpretation: https://go.hyper.ai/LbaHD

Insight Agents (IA) Architecture Example

The authors used a curated dataset to train and evaluate the OOD detection and agent routing models. It contains 301 questions in total: 178 in-domain and 123 out-of-domain. A benchmark set of 100 popular questions with ground-truth answers was also provided for end-to-end evaluation.

Dataset

That concludes this week's paper recommendations. For more cutting-edge AI research papers, please visit the "Latest Papers" section of the hyper.ai website.

We also welcome research teams to submit high-quality results and papers to us. Those interested can add NeuroStar on WeChat (WeChat ID: Hyperai01).

See you next week!