
The Y-Combinator for LLMs: Solving Long-Context Degradation with the λ-Calculus

Amartya Roy, Rasul Tutunov, Xiaotong Ji, Matthieu Zimmer, Haitham Bou-Ammar

Abstract

Large language models (LLMs) are increasingly used as general-purpose reasoners, yet long inputs remain constrained by a fixed context window. Recursive Language Models (RLMs) address this by externalizing the prompt and recursively solving subproblems. However, existing RLMs rely on an open-ended read-eval-print loop (REPL) in which the model generates arbitrary control code, making execution hard to verify, predict, and analyze. We present λ-RLM, a long-context reasoning framework that replaces free-form recursive code generation with a typed functional runtime grounded in the λ-calculus. The framework executes a compact library of pre-verified combinators and performs neural inference only on bounded leaf subproblems, turning recursive reasoning into a structured functional program with explicit control flow. We show that λ-RLM provides formal guarantees absent from standard RLMs, including termination, closed-form cost bounds, controlled accuracy scaling with recursion depth, and an optimal partition rule under a simple cost model. Empirically, across four long-context reasoning tasks and nine base models, λ-RLM outperforms standard RLMs in 29 of 36 model-task comparisons, improves average accuracy by up to +21.9 points across model tiers, and reduces latency by a factor of up to 4.1. These results show that typed symbolic control is a more reliable and efficient foundation for long-context reasoning than open-ended recursive code generation. The full λ-RLM implementation is open-sourced at https://github.com/lambda-calculus-LLM/lambda-RLM.

One-sentence Summary

Researchers from IIT Delhi, Huawei Noah's Ark Lab, and UCL introduce λ-RLM, a framework that replaces open-ended recursive code with a typed functional runtime grounded in the λ-calculus. This approach provides formal guarantees like termination and significantly improves accuracy and latency for long-context reasoning compared to standard Recursive Language Models.

Key Contributions

  • The paper introduces λ-RLM, a framework that replaces free-form recursive code generation with a typed functional runtime grounded in the λ-calculus to execute a library of pre-verified combinators. This approach turns recursive reasoning into a structured functional program with explicit control flow, using neural inference only on bounded leaf subproblems.
  • Formal guarantees absent from standard Recursive Language Models are established, including termination, closed-form cost bounds, controlled accuracy scaling with recursion depth, and an optimal partition rule under a simple cost model. These properties are achieved by encoding recursion as a fixed-point over a deterministic operator library and enforcing predictable execution through a symbolic planner.
  • Empirical evaluation across four long-context reasoning tasks and nine base models demonstrates that the method outperforms standard RLMs in 29 of 36 comparisons while improving average accuracy by up to 21.9 points. The results also show a latency reduction of up to 4.1 times, validating that typed symbolic control provides a more reliable and efficient foundation for long-context reasoning.

Introduction

Large language models face a critical bottleneck when processing inputs that exceed their fixed context windows, often forcing reliance on truncation or sliding windows that discard essential information. While Recursive Language Models (RLMs) attempt to solve this by treating the prompt as an external environment for recursive decomposition, they suffer from significant reliability issues because they require the model to generate arbitrary control code in an open-ended loop. This approach leads to unpredictable execution, potential non-termination, and difficult-to-audit failure modes that are orthogonal to the actual reasoning task.

The authors introduce λ-RLM, a framework that replaces free-form code generation with a typed functional runtime grounded in the λ-calculus to enforce structured control flow. By executing a compact library of pre-verified combinators and restricting neural inference to bounded leaf subproblems, the system separates semantic reasoning from structural orchestration. This design provides formal guarantees on termination and cost while empirically outperforming standard RLMs in accuracy and reducing latency by up to 4.1 times across various long-context tasks.

Method

The λ-RLM framework fundamentally restructures long-context reasoning by separating control flow from semantic inference. Instead of relying on an LLM to generate arbitrary code for task decomposition, the system employs a typed functional runtime where recursion is expressed through a fixed library of pre-verified combinators. This design ensures that the execution trace is deterministic, auditable, and guaranteed to terminate, while the base language model is reserved exclusively for solving bounded leaf subproblems. The overall architecture is organized into three distinct layers, as illustrated in the framework diagram below.

The first layer operates on the principles of Symbolic Lambda Calculus. This layer defines a compact combinator library 𝓛 consisting of deterministic operators such as SPLIT, MAP, FILTER, REDUCE, CROSS, CONCAT, and PEEK. These operators handle the structural manipulation of the prompt and the orchestration of recursive calls without invoking the neural model. By restricting the control interface to this small set of trusted combinators, the system eliminates the open-ended failure modes associated with free-form code generation. The execution logic is formalized as a fixed-point lambda term, where the recursive solver f is defined in terms of itself, allowing for structured decomposition without external naming mechanisms.
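To make the combinator layer concrete, the sketch below implements a few of the named operators as plain deterministic Python functions. This is a hypothetical minimal rendering, assuming string prompts and contiguous chunking; the paper's actual operator signatures and typing discipline are not specified here.

```python
# Hypothetical minimal sketch of a deterministic combinator library.
# None of these functions invoke a neural model; they only manipulate
# structure, as the symbolic layer described above requires.

def SPLIT(prompt: str, k: int) -> list[str]:
    """Partition a prompt into at most k contiguous chunks."""
    step = max(1, -(-len(prompt) // k))  # ceiling division
    return [prompt[i:i + step] for i in range(0, len(prompt), step)]

def MAP(fn, chunks: list) -> list:
    """Apply a solver to each chunk, preserving order."""
    return [fn(c) for c in chunks]

def REDUCE(combine, results: list):
    """Fold partial results left-to-right into a single answer."""
    out = results[0]
    for r in results[1:]:
        out = combine(out, r)
    return out

def CONCAT(a: str, b: str) -> str:
    """Deterministic composition of two partial answers."""
    return a + "\n" + b
```

Because each operator is pure and total on its inputs, any composition of them terminates by construction, which is the property the trusted-library design relies on.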

The second layer is responsible for Planning and Optimization. Before execution begins, a deterministic planner computes the optimal parameters for the recursive strategy based on the input size n, the model's context window K, and the cost function. Specifically, the planner determines the partition size k*, the leaf threshold τ*, and the recursion depth. These values are derived to minimize the total computational cost while satisfying accuracy constraints. The optimal partition size is k* = ⌈√(n · c_in / c_⊕)⌉, balancing the cost of leaf invocations against the overhead of composition. The depth is bounded by ⌈log_{k*}(n/K)⌉, ensuring that the number of recursive steps remains predictable and finite.
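The two closed-form rules above translate directly into a small planner function. The sketch below assumes illustrative values for the cost constants c_in (per-token leaf cost) and c_⊕ (composition overhead); the paper derives these from its own cost model, so both the constant names and defaults here are placeholders.

```python
import math

def plan(n: int, K: int, c_in: float = 1.0, c_plus: float = 0.25) -> tuple[int, int]:
    """Deterministic planner sketch: compute the partition size k* and a
    recursion-depth bound for input length n and context window K.

    k*    = ceil(sqrt(n * c_in / c_plus))   -- balances leaf cost vs. composition
    depth = ceil(log_{k*}(n / K))           -- 0 when the input already fits
    """
    k_star = math.ceil(math.sqrt(n * c_in / c_plus))
    if n <= K:
        depth = 0
    else:
        depth = math.ceil(math.log(n / K, k_star))
    return k_star, depth
```

For example, an input of 10,000 tokens against an 8,000-token window yields k* = 200 and a depth bound of 1: a single level of splitting already brings every chunk under the window.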

The third layer handles Neural β-Reductions at the leaves. This is the only component of the system that introduces uncertainty. When the recursive decomposition reaches a sub-prompt P_i with length |P_i| ≤ τ*, the system invokes the base language model M to solve the subproblem directly. The model acts as a bounded oracle, operating within its native context window to ensure high accuracy. The results from these leaf calls are then aggregated back up the recursion tree using the symbolic composition operators defined in the first layer. This separation ensures that the neural model is only used where semantic inference is genuinely required, while the surrounding control flow remains symbolic and efficient.
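The three layers compose into a single recursive scheme: symbolic split/map/reduce at internal nodes, a neural call only at leaves below the threshold τ*. The sketch below makes that structure explicit; `llm_solve` is a placeholder standing in for the base model M, and the chunking and composition choices are illustrative assumptions, not the paper's exact runtime.

```python
def solve(prompt: str, tau: int, k: int, llm_solve, combine) -> str:
    """Recursive-solver sketch: neural inference only at bounded leaves.

    tau       -- leaf threshold (tau* from the planner)
    k         -- partition size (k* from the planner)
    llm_solve -- placeholder for the base model M, called on leaves only
    combine   -- deterministic composition operator (e.g., CONCAT)
    """
    if len(prompt) <= tau:
        # Leaf: the single point where uncertainty enters the system.
        return llm_solve(prompt)
    # Internal node: purely symbolic split / map / reduce.
    step = max(1, -(-len(prompt) // k))  # ceiling division
    chunks = [prompt[i:i + step] for i in range(0, len(prompt), step)]
    partials = [solve(c, tau, k, llm_solve, combine) for c in chunks]
    out = partials[0]
    for p in partials[1:]:
        out = combine(out, p)
    return out
```

Note that the control flow never depends on model output: the recursion tree is fixed by n, τ, and k before any neural call is made, which is what makes the execution trace auditable.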

Experiment

  • Comparison of λ-RLM against direct LLM inference and a standard RLM validates that a restricted, typed functional runtime offers superior reliability and efficiency for long-context reasoning compared to stochastic or single-pass methods.
  • Experiments confirm the Scale-Substitution Hypothesis, demonstrating that formal control structures allow weaker models (e.g., 8B) to match or exceed the accuracy of much larger models (e.g., 70B+) that lack structured orchestration.
  • The Efficiency and Predictability Hypothesis is supported by findings that replacing multi-turn, stochastic REPL loops with a single deterministic combinator chain reduces latency by 3 to 6 times and significantly lowers execution variance.
  • Qualitative analysis shows that the structured approach is most effective for complex tasks requiring quadratic cross-referencing, where it prevents the exponential accuracy decay known as context rot that plagues direct inference.
  • Ablation studies reveal that the pre-built combinator library is the primary driver of performance gains, while symbolic operations for merging results further reduce latency by eliminating unnecessary LLM calls during recursion.
  • While the fixed combinator library generally outperforms open-ended code generation, the latter retains an advantage in specific scenarios requiring creative, task-specific strategies such as adaptive code generation for repository navigation.
