Closing the Loop: Universal Repository Representation with RPG-Encoder

Abstract

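Current repository agents suffer from a reasoning disconnect caused by fragmented representations: existing methods rely on isolated API documentation or dependency graphs that lack semantic depth. We treat repository comprehension and generation as inverse processes within a single cycle. Generation expands intent into implementation, and comprehension compresses implementation back into intent. To address this, we propose RPG-Encoder, a framework that generalizes the Repository Planning Graph (RPG) from a static generative blueprint into a unified, high-fidelity representation. RPG-Encoder closes the reasoning loop through three mechanisms: (1) encoding raw code into the RPG by combining lifted semantic features with code dependencies; (2) evolving the topology incrementally to decouple maintenance cost from repository scale, reducing overhead by 95.7%; and (3) operating as a unified interface for structure-aware navigation. In evaluations, RPG-Encoder establishes state-of-the-art repository understanding on SWE-bench Verified with 93.7% Acc@5 and exceeds the best baseline by more than 10% on SWE-bench Live Lite, demonstrating strong fine-grained localization accuracy in complex codebases. It also achieves 98.5% reconstruction coverage on RepoCraft, confirming that the RPG can mirror the original codebase with high fidelity and closing the loop between intent and implementation.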

One-sentence Summary

Researchers from Microsoft Research Asia, UCSD, and Tsinghua University propose RPG-Encoder, a unified framework that turns the Repository Planning Graph into a bidirectional, high-fidelity representation bridging code comprehension and generation, achieving 93.7% Acc@5 on SWE-bench Verified and 98.5% reconstruction coverage on RepoCraft by fusing semantic intent with dependency topology while reducing maintenance overhead by 95.7% via incremental updates.

Key Contributions

  • We reframe repository reasoning as a bidirectional cycle between comprehension and generation, introducing the RPG as a unified representation that fuses semantic intent with structural dependencies to close the reasoning gap in existing fragmented methods.
  • RPG-Encoder implements semantic lifting to encode code into RPG nodes and edges, and uses incremental diff-based updates to maintain the graph with 95.7% lower overhead, enabling scalable, drift-free evolution aligned with code changes.
  • Evaluated on SWE-bench Verified (93.7% Acc@5) and RepoCraft (98.5% reconstruction coverage), RPG-Encoder outperforms baselines by over 10% in understanding and 24.3% in reconstruction, proving its high-fidelity, structure-aware utility for complex codebases.

Introduction

The authors leverage the Repository Planning Graph (RPG) — originally designed for code generation — to build RPG-Encoder, a unified representation that bridges repository comprehension and generation as inverse, bidirectional processes. Prior methods suffer from fragmented representations: API docs lack structural context, while dependency graphs miss semantic intent, forcing agents to infer connections or rationale manually — and both incur high maintenance costs due to drift or static updates. RPG-Encoder addresses this by encoding code into a semantically rich, topology-aware RPG; evolving it incrementally via commit diffs to cut maintenance overhead by 95.7%; and using it as a unified interface for structure-aware navigation. It achieves state-of-the-art results on SWE-bench tasks and 98.5% reconstruction coverage on RepoCraft, proving RPG’s fidelity and closing the loop between intent and implementation.

Dataset

  • The authors use six real-world Python repositories selected for popularity and structural complexity to build the RepoCraft benchmark for evaluating automated repository reconstruction.
  • They construct a documentation dataset by compiling source files in each repo’s docs/ directory using Sphinx, converting reStructuredText or Markdown into a unified textual format that captures class, function, and module definitions.
  • The compiled documentation serves as ground-truth specification for baseline agents, mirroring the reference material human developers consult.
  • Across the six repositories, the documentation spans 7,320 files and over 2.5 million tokens, forming a substantial context for testing long-context understanding.
  • No training split, mixture ratios, or cropping strategies are described; the dataset is used as-is for evaluation, with metadata derived directly from compiled documentation nodes including function signatures and parameter descriptions.

Method

The authors leverage the RPG-Encoder to transform a raw codebase into a structured, semantically grounded Repository Planning Graph (RPG), which serves as a unified reasoning substrate for agentic code understanding. The architecture is organized into three core phases: Encoding, Evolution, and Operation, each addressing a distinct lifecycle stage of the codebase representation.

As shown in the figure below, the RPG-Encoder bridges the gap between verbose, implementation-heavy code and high-level functional intent by synthesizing a dual-view graph that integrates both semantic hierarchy and execution dependencies. This enables agents to reason about the repository not just as a collection of files, but as a coherent, navigable system.

The Encoding phase begins with Semantic Lifting, where the system parses the entire codebase to extract atomic semantic features for each function and class. These features are expressed as concise, implementation-agnostic verb-object phrases (e.g., “store basic auth credentials”, “send GET request”) that capture behavioral intent rather than internal logic. This step is performed globally across the repository to ensure consistent granularity and avoid local biases. The extracted features are then aggregated into file-level summaries, establishing functional edges that link files to their constituent functions.
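
The lifting step can be illustrated with a minimal sketch. It assumes Python source and uses the standard `ast` module to find functions and classes. The helper `name_to_phrase` is a hypothetical stand-in for the LLM that produces verb-object phrases in the paper; here it only splits identifiers into words. The node and edge layout is likewise an assumption, not the authors' schema.

```python
import ast
import re


def name_to_phrase(identifier):
    """Hypothetical stand-in for the LLM: 'send_get_request' -> 'send get request'."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", identifier).replace("_", " ")
    return " ".join(spaced.lower().split())


def lift_file(path, source):
    """Extract one feature per function/class and aggregate them per file."""
    tree = ast.parse(source)
    features = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Entity ids use the bare name, so same-named methods would collide
            # in this simplified sketch.
            features[f"{path}::{node.name}"] = name_to_phrase(node.name)
    # File-level summary: the deduplicated features of everything in the file.
    file_node = {"path": path, "summary": sorted(set(features.values()))}
    # Functional edges link the file to its constituent functions and classes.
    edges = [(path, entity_id) for entity_id in features]
    return file_node, features, edges


source = '''
def store_basic_auth_credentials(user, password):
    return (user, password)

def send_get_request(url):
    return url
'''
file_node, features, edges = lift_file("client.py", source)
```

In a real pipeline the phrase for each entity would come from a model reading the body of the function, not from its name; the aggregation and edge construction are the parts this sketch is meant to show.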

Next, Functional Abstraction reorganizes these low-level features into a hierarchical structure by identifying latent functional centroids (e.g., “DataProcessing”, “ModelTraining”) that serve as high-level nodes. This is achieved through LLM-guided clustering: the model analyzes the semantic features of all file-level nodes to induce abstract categories, then recursively assigns each node to the most semantically compatible parent. To ensure structural stability, intermediate nodes are inserted when direct parent-child relationships lack sufficient granularity.

Artifact Grounding anchors this abstract hierarchy to physical code artifacts. For each high-level node, the system computes its minimal directory scope by aggregating the paths of all descendant leaf nodes and applying a Trie-based branching analysis to extract meaningful, non-redundant directory LCAs. This ensures that abstract functional concepts (e.g., “DataPreprocessing”) are tied to concrete paths (e.g., “sklearn/preprocessing”). Finally, dependency edges are injected via AST analysis, mapping imports, calls, and inheritance relationships to complete the RPG.
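
One way to read the Trie-based branching analysis is sketched below. The summary does not give the exact procedure, so the rule used here is an assumption: walk down the directory trie while a node has a single child and holds no files, and stop at the first directory that branches or contains files. If the branching happens at the repository root, each top-level branch is grounded separately. All function names are hypothetical.

```python
from pathlib import PurePosixPath


def build_trie(file_paths):
    """Build a trie over directory components; count files held by each directory."""
    root = {"children": {}, "files": 0}
    for p in file_paths:
        node = root
        for part in PurePosixPath(p).parent.parts:
            node = node["children"].setdefault(part, {"children": {}, "files": 0})
        node["files"] += 1
    return root


def _collapse(node, prefix):
    """Descend through directories that neither branch nor hold files."""
    while node["files"] == 0 and len(node["children"]) == 1:
        (name, child), = node["children"].items()
        prefix = prefix + (name,)
        node = child
    return node, prefix


def directory_scopes(file_paths):
    """Return the minimal directory scope(s) covering the given leaf paths."""
    scopes = []

    def visit(node, prefix):
        node, prefix = _collapse(node, prefix)
        if prefix:
            scopes.append("/".join(prefix))
        elif node["files"]:
            scopes.append(".")  # files sit directly at the repository root
        else:
            # Branching at the root: ground each top-level branch on its own.
            for name in sorted(node["children"]):
                visit(node["children"][name], (name,))

    visit(build_trie(file_paths), ())
    return scopes


same_package = directory_scopes(
    ["sklearn/preprocessing/_data.py", "sklearn/preprocessing/_encoders.py"]
)
siblings = directory_scopes(
    ["sklearn/preprocessing/_data.py", "sklearn/impute/_base.py"]
)
split_roots = directory_scopes(["sklearn/preprocessing/_data.py", "docs/conf.py"])
```

Under this rule a node whose leaves all live in one package is grounded to that package, while a node spanning two sibling packages is grounded to their shared parent.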

The Evolution phase maintains the RPG incrementally in response to codebase changes. For each commit, the system parses the diff to identify affected entities and applies one of three atomic operations: Deletion, Modification, or Addition. Deletions trigger recursive pruning of empty parent nodes to preserve structural hygiene. Modifications are evaluated for semantic drift; if the functional intent shifts beyond a threshold, the node is re-routed to a new domain via deletion and reinsertion. Additions are inserted via top-down semantic routing, where the LLM selects the most appropriate parent node based on feature alignment, ensuring the hierarchy remains semantically coherent.
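
The three atomic operations can be sketched on a small in-memory hierarchy. This is an illustration under stated assumptions: token-overlap (Jaccard) similarity replaces the LLM for both drift detection and routing, the drift threshold of 0.5 is arbitrary, and `Node`, `add`, `modify`, and `delete` are hypothetical names.

```python
class Node:
    def __init__(self, name, features=(), parent=None):
        self.name = name
        self.features = set(features)
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self


def _tokens(features):
    return {w for f in features for w in f.lower().split()}


def jaccard(a, b):
    a, b = _tokens(a), _tokens(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


def subtree_features(node):
    feats = set(node.features)
    for child in node.children.values():
        feats |= subtree_features(child)
    return feats


def delete(node):
    """Remove a node, then recursively prune parents left without children."""
    parent = node.parent
    if parent is None:
        return
    del parent.children[node.name]
    node.parent = None
    if not parent.children and parent.parent is not None:
        delete(parent)


def route(root, features):
    """Top-down routing: at each level, follow the best-matching category."""
    node = root
    while True:
        categories = [c for c in node.children.values() if c.children]
        if not categories:
            return node
        node = max(categories, key=lambda c: jaccard(features, subtree_features(c)))


def add(root, name, features):
    return Node(name, features, route(root, features))


def modify(root, node, new_features, drift_threshold=0.5):
    """Update in place unless the intent drifted; then delete and reinsert."""
    drift = 1.0 - jaccard(node.features, new_features)
    if drift > drift_threshold:
        name = node.name
        delete(node)
        return add(root, name, new_features)
    node.features = set(new_features)
    return node


root = Node("repo")
net = Node("Networking", parent=root)
auth = Node("Authentication", parent=root)
get = Node("send_get", ["send GET request"], net)
creds = Node("store_creds", ["store basic auth credentials"], auth)

put = add(root, "send_put", ["send PUT request"])       # routed under Networking
moved = modify(root, creds, ["send DELETE request"])    # drifted, so re-routed
```

In this example the re-routed entity leaves `Authentication` empty, so the pruning step removes that category. Each operation touches only the affected path in the hierarchy, which is the property that keeps update cost independent of repository size.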

The Operation phase exposes the RPG as a unified reasoning substrate through three core tools. SearchNode enables intent-based discovery by matching behavioral phrases against semantic features or performing keyword-based snippet search. FetchNode retrieves precise source context and metadata for verified entities, ensuring agents reason on ground-truth code. ExploreRPG facilitates topological traversal along dependency or functional edges, allowing agents to uncover call chains, upstream dependencies, or semantically related modules. Together, these tools enable multi-dimensional navigation that integrates functional intent with physical implementation.
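
A rough sketch of how the three tools might look over an in-memory graph follows. The tool names come from the text, but the signatures, the `RPG` container, and the keyword-overlap scoring are assumptions made for illustration, not the authors' interface.

```python
from dataclasses import dataclass


@dataclass
class RPG:
    features: dict    # node id -> list of semantic feature phrases
    source: dict      # node id -> source text
    functional: dict  # node id -> child node ids (semantic hierarchy)
    dependency: dict  # node id -> node ids it depends on


def search_node(rpg, query, top_k=3):
    """Intent-based discovery: rank nodes by word overlap with the query."""
    q = set(query.lower().split())
    scored = []
    for node_id, phrases in rpg.features.items():
        words = {w for p in phrases for w in p.lower().split()}
        overlap = len(q & words)
        if overlap:
            scored.append((overlap, node_id))
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [node_id for _, node_id in scored[:top_k]]


def fetch_node(rpg, node_id):
    """Return the ground-truth source and metadata for a known node."""
    return {
        "id": node_id,
        "features": rpg.features.get(node_id, []),
        "source": rpg.source[node_id],
    }


def explore_rpg(rpg, node_id, edge_type="dependency", depth=1):
    """Breadth-first traversal along one edge type, up to `depth` hops."""
    if edge_type not in ("dependency", "functional"):
        raise ValueError(f"unknown edge type: {edge_type}")
    edges = rpg.dependency if edge_type == "dependency" else rpg.functional
    seen, frontier = [], [node_id]
    for _ in range(depth):
        next_frontier = []
        for current in frontier:
            for neighbor in edges.get(current, []):
                if neighbor != node_id and neighbor not in seen:
                    seen.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


rpg = RPG(
    features={
        "auth.py::store_creds": ["store basic auth credentials"],
        "http.py::get": ["send GET request"],
        "http.py::session": ["manage connection session"],
    },
    source={
        "auth.py::store_creds": "def store_creds(user, pw): ...",
        "http.py::get": "def get(url): ...",
        "http.py::session": "def session(): ...",
    },
    functional={
        "Networking": ["http.py::get", "http.py::session"],
        "Authentication": ["auth.py::store_creds"],
    },
    dependency={"http.py::get": ["http.py::session", "auth.py::store_creds"]},
)

hits = search_node(rpg, "send request")
context = fetch_node(rpg, hits[0])
upstream = explore_rpg(rpg, hits[0], edge_type="dependency")
```

The usage at the end follows the search, fetch, then explore order that the text describes: find a candidate by intent, read its actual source, then follow edges outward.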

The RPG’s dual-view structure—partitioned by functional and dependency edges but sharing a unified node set—allows seamless context switching during retrieval. This design reduces information overload by serving as both a knowledge source (storing semantic features and metadata) and a process encoder (inducing topological order via edges), exposing the causality and hierarchy essential for architectural comprehension.
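
To make the shared-node-set idea concrete, the sketch below builds both edge views from the same parsed sources with Python's `ast` module. It is deliberately simplified and rests on assumptions: names are resolved by bare identifier across the whole repository, with no scope or module resolution, and only top-level definitions become nodes. `build_dual_view` is a hypothetical helper.

```python
import ast


def build_dual_view(files):
    """files: dict of path -> source. Returns (nodes, functional, dependency)."""
    nodes, functional, dependency = set(), set(), set()
    defined, trees = {}, {}

    # Pass 1: one node per file and per top-level definition (functional view).
    for path, source in files.items():
        tree = ast.parse(source)
        trees[path] = tree
        nodes.add(path)
        for stmt in tree.body:
            if isinstance(stmt, (ast.FunctionDef, ast.ClassDef)):
                node_id = f"{path}::{stmt.name}"
                nodes.add(node_id)
                functional.add((path, node_id))
                defined[stmt.name] = node_id

    # Pass 2: imports, calls, and inheritance over the same node ids.
    for path, tree in trees.items():
        for stmt in tree.body:
            if isinstance(stmt, ast.ImportFrom):
                for alias in stmt.names:
                    if alias.name in defined:
                        dependency.add((path, defined[alias.name], "imports"))
            if not isinstance(stmt, (ast.FunctionDef, ast.ClassDef)):
                continue
            owner = f"{path}::{stmt.name}"
            for sub in ast.walk(stmt):
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    callee = defined.get(sub.func.id)
                    if callee and callee != owner:
                        dependency.add((owner, callee, "calls"))
            if isinstance(stmt, ast.ClassDef):
                for base in stmt.bases:
                    if isinstance(base, ast.Name) and base.id in defined:
                        dependency.add((owner, defined[base.id], "inherits"))
    return nodes, functional, dependency


files = {
    "auth.py": "def store_creds(user, pw):\n    return (user, pw)\n",
    "http.py": (
        "from auth import store_creds\n\n"
        "class Base:\n    pass\n\n"
        "class Client(Base):\n"
        "    def get(self, url):\n"
        "        return store_creds('u', 'p')\n"
    ),
}
nodes, functional, dependency = build_dual_view(files)
```

Because both edge sets refer to the same node ids, a retrieval step can land on a node through the functional view and continue from it through the dependency view without any translation between representations.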

Experiment

  • RPG significantly improves repository understanding by enhancing file and function localization, outperforming baselines through combined semantic and topological guidance that filters noise while ensuring comprehensive coverage.
  • RPG serves as a complete representational substrate for repository reconstruction, enabling near-lossless recovery of structure and functionality, with code volume and modularity closely matching human-written projects.
  • Semantic features and topological dependencies are mutually reinforcing: removing either degrades performance, confirming that both are essential for accurate localization and structural fidelity.
  • RPG enables efficient, cost-effective navigation by reducing redundant exploration and concentrating reasoning on relevant code, achieving higher accuracy per unit cost than baselines.
  • Incremental RPG updates maintain high fidelity with minimal computational overhead, making long-term repository evolution sustainable without sacrificing performance.
  • RPG induces structured agent behavior, promoting a “search-then-zoom” pattern that leverages topology for global context before drilling into implementation details, reducing search and scope-related failures.
  • Ablations confirm that hierarchical constraints are critical: removing file or function metadata leads to structural collapse, merging modules or losing granularity, underscoring RPG’s role in preserving architectural intent.

The authors use RPG to enhance repository understanding and reconstruction, demonstrating consistent gains in localization accuracy and structural fidelity across multiple benchmarks and models. Results show that integrating semantic features with topological constraints enables agents to precisely map high-level intent to implementation units while reducing redundant exploration. Ablation studies confirm that both semantic grounding and structural connectivity are mutually reinforcing, with their combined use yielding superior performance over text-based or partial graph baselines.

RPG-Encoder significantly improves cost efficiency in repository understanding tasks, achieving higher accuracy per dollar spent compared to all baselines across both GPT-4.1 and GPT-5. It reduces both the number of reasoning steps and monetary cost while maintaining superior performance, demonstrating that structured navigation enables more focused and economical exploration of codebases.

The authors use ablation experiments to isolate the impact of hierarchical structure on repository reconstruction, showing that removing file or function metadata leads to significant shifts in code organization. Without file-level boundaries, models consolidate features into fewer, denser files; without function metadata, they generate more classes and functions to compensate for lost procedural guidance. These findings confirm that explicit topological signals are essential for preserving modularity and structural fidelity during reconstruction.

The authors use RPG to enhance repository understanding by integrating semantic and topological signals, which improves both file-level and function-level localization accuracy across multiple models. Results show that RPG-guided agents consistently outperform baselines in precision and recall, particularly at the function level, where structural constraints help map high-level intent to specific code units. Ablation studies confirm that both semantic features and dependency graphs are critical, as removing either degrades performance and increases reasoning cost.

RPG-Encoder consistently achieves higher cost-effectiveness across multiple LLMs by reducing both the number of reasoning steps and monetary cost per task, demonstrating that structured navigation enables more efficient exploration than baseline methods. While some baselines like LocAgent show low cost on weaker models, RPG-Encoder maintains superior efficiency on stronger models such as GPT-4.1 and GPT-5, where it balances precision with minimal resource expenditure. This efficiency stems from RPG’s ability to guide agents toward relevant code regions early, minimizing redundant tool calls and context consumption.

