Command Palette
Search for a command to run...
LLMs 를 위한 Y-Combinator: λ-Calculus 를 통한 Long-Context Rot 해결
LLMs 를 위한 Y-Combinator: λ-Calculus 를 통한 Long-Context Rot 해결
Amartya Roy Rasul Tutunov Xiaotong Ji Matthieu Zimmer Haitham Bou-Ammar
초록
대규모 언어 모델(LLMs) 이 범용 추론기로 점점 더 널리 활용되고 있으나, 긴 입력 처리는 고정된 컨텍스트 윈도우로 인해 병목 현상에 직면해 있습니다. 재귀적 언어 모델(Recursive Language Models, RLMs) 은 프롬프트를 외부화하고 하위 문제를 재귀적으로 해결함으로써 이러한 한계를 극복하고자 합니다. 그러나 기존 RLM 은 모델이 임의의 제어 코드를 생성하는 오픈 엔디드(read-eval-print loop, REPL) 에 의존하고 있어, 실행의 검증, 예측, 분석이 어렵습니다.본고에서는 λ-RLM 을 제안합니다. λ-RLM 은 긴 컨텍스트 추론을 위한 프레임워크로, 자유 형식의 재귀적 코드 생성 대신 λ-계산(λ-calculus) 에 기반한 타입화된 함수형 런타임을 도입합니다. 이 프레임워크는 사전에 검증된 컴비네이터(combinators) 의 컴팩트한 라이브러리를 실행하며, 신경망 추론(inference) 은 경계가 설정된 리프(leaf) 하위 문제에 대해서만 수행합니다. 이를 통해 재귀적 추론을 명시적 제어 흐름을 갖춘 구조화된 함수형 프로그램으로 전환합니다.우리의 연구 결과, λ-RLM 은 기존 표준 RLM 에서 결여된 형식적 보장(formal guarantees) 을 제공함을 확인했습니다. 구체적으로 종료성(termination), 닫힌 형식(closed-form) 비용 상한, 재귀 깊이에 따른 제어된 정확도 확장(scaling), 그리고 단순 비용 모델 하에서의 최적 분할 규칙(optimal partition rule) 등이 포함됩니다. 실험적으로 네 가지 긴 컨텍스트 추론 과제와 아홉 가지 베이스 모델에 걸쳐 평가한 결과, λ-RLM 은 36 개의 모델 - 과제 비교 중 29 개에서 표준 RLM 보다 우수한 성능을 보였습니다. 또한 모델 계층 전반에 걸쳐 평균 정확도를 최대 +21.9 포인트 향상시켰으며, 지연 시간(latency) 은 최대 4.1 배 감소시켰습니다. 이러한 결과는 오픈 엔디드 재귀적 코드 생성보다 타입화된 심볼릭 제어(symbolic control) 가 긴 컨텍스트 추론을 위한 더 신뢰성 있고 효율적인 기반을 제공함을 시사합니다.λ-RLM 의 완전한 구현체는 커뮤니티를 위해 다음 주소에서 오픈소스로 제공됩니다: https://github.com/lambda-calculus-LLM/lambda-RLM.
One-sentence Summary
Researchers from IIT Delhi, Huawei Noah's Ark Lab, and UCL introduce λ-RLM, a framework that replaces open-ended recursive code with a typed functional runtime grounded in λ-calculus. This approach provides formal guarantees like termination and significantly improves accuracy and latency for long-context reasoning compared to standard Recursive Language Models.
Key Contributions
- The paper introduces λ-RLM, a framework that replaces free-form recursive code generation with a typed functional runtime grounded in λ-calculus to execute a library of pre-verified combinators. This approach turns recursive reasoning into a structured functional program with explicit control flow, using neural inference only on bounded leaf subproblems.
- Formal guarantees absent from standard Recursive Language Models are established, including termination, closed-form cost bounds, controlled accuracy scaling with recursion depth, and an optimal partition rule under a simple cost model. These properties are achieved by encoding recursion as a fixed-point over a deterministic operator library and enforcing predictable execution through a symbolic planner.
- Empirical evaluation across four long-context reasoning tasks and nine base models demonstrates that the method outperforms standard RLMs in 29 of 36 comparisons while improving average accuracy by up to 21.9 points. The results also show a latency reduction of up to 4.1 times, validating that typed symbolic control provides a more reliable and efficient foundation for long-context reasoning.
Introduction
Large language models face a critical bottleneck when processing inputs that exceed their fixed context windows, often forcing reliance on truncation or sliding windows that discard essential information. While Recursive Language Models (RLMs) attempt to solve this by treating the prompt as an external environment for recursive decomposition, they suffer from significant reliability issues because they require the model to generate arbitrary control code in an open-ended loop. This approach leads to unpredictable execution, potential non-termination, and difficult-to-audit failure modes that are orthogonal to the actual reasoning task.
The authors introduce λ-RLM, a framework that replaces free-form code generation with a typed functional runtime grounded in λ-calculus to enforce structured control flow. By executing a compact library of pre-verified combinators and restricting neural inference to bounded leaf subproblems, the system separates semantic reasoning from structural orchestration. This design provides formal guarantees on termination and cost while empirically outperforming standard RLMs in accuracy and reducing latency by up to 4.1 times across various long-context tasks.
Method
The λ-RLM framework fundamentally restructures long-context reasoning by separating control flow from semantic inference. Instead of relying on an LLM to generate arbitrary code for task decomposition, the system employs a typed functional runtime where recursion is expressed through a fixed library of pre-verified combinators. This design ensures that the execution trace is deterministic, auditable, and guaranteed to terminate, while the base language model is reserved exclusively for solving bounded leaf subproblems. The overall architecture is organized into three distinct layers, as illustrated in the framework diagram below.
The first layer operates on the principles of Symbolic Lambda Calculus. This layer defines a compact combinator library L consisting of deterministic operators such as SPLIT, MAP, FILTER, REDUCE, CROSS, CONCAT, and PEEK. These operators handle the structural manipulation of the prompt and the orchestration of recursive calls without invoking the neural model. By restricting the control interface to this small set of trusted combinators, the system eliminates the open-ended failure modes associated with free-form code generation. The execution logic is formalized as a fixed-point lambda term, where the recursive solver f is defined in terms of itself, allowing for structured decomposition without external naming mechanisms.
The second layer is responsible for Planning and Optimization. Before execution begins, a deterministic planner computes the optimal parameters for the recursive strategy based on the input size n, the model's context window K, and the cost function. Specifically, the planner determines the partition size k∗, the leaf threshold τ∗, and the recursion depth. These values are derived to minimize the total computational cost while satisfying accuracy constraints. The optimal partition size k∗ is calculated as k∗=⌈n⋅cin/c⊕⌉, balancing the cost of leaf invocations against the overhead of composition. The depth is bounded by ⌈logk∗(n/K)⌉, ensuring that the number of recursive steps remains predictable and finite.
The third layer handles Neural β-Reductions at the leaves. This is the only component of the system that introduces uncertainty. When the recursive decomposition reaches a sub-prompt Pi with length ∣Pi∣≤τ∗, the system invokes the base language model M to solve the subproblem directly. The model acts as a bounded oracle, operating within its native context window to ensure high accuracy. The results from these leaf calls are then aggregated back up the recursion tree using the symbolic composition operators defined in the first layer. This separation ensures that the neural model is only used where semantic inference is genuinely required, while the surrounding control flow remains symbolic and efficient.
Experiment
- Comparison of λ-RLM against Direct LLM inference and Normal RLM validates that a restricted, typed functional runtime offers superior reliability and efficiency for long-context reasoning compared to stochastic or single-pass methods.
- Experiments confirm the Scale-Substitution Hypothesis, demonstrating that formal control structures allow weaker models (e.g., 8B) to match or exceed the accuracy of much larger models (e.g., 70B+) that lack structured orchestration.
- The Efficiency and Predictability Hypothesis is supported by findings that replacing multi-turn, stochastic REPL loops with a single deterministic combinator chain reduces latency by 3 to 6 times and significantly lowers execution variance.
- Qualitative analysis shows that the structured approach is most effective for complex tasks requiring quadratic cross-referencing, where it prevents the exponential accuracy decay known as context rot that plagues direct inference.
- Ablation studies reveal that the pre-built combinator library is the primary driver of performance gains, while symbolic operations for merging results further reduce latency by eliminating unnecessary LLM calls during recursion.
- While the fixed combinator library generally outperforms open-ended code generation, the latter retains an advantage in specific scenarios requiring creative, task-specific strategies such as adaptive code generation for repository navigation.