
MIT Researchers Develop Efficient Framework to Boost Small AI Models for Complex Reasoning Tasks

Small language models (SLMs) have long struggled with complex reasoning tasks that require strict adherence to rules, such as solving Sudoku puzzles, writing structured poetry, or planning detailed travel itineraries. While large language models (LLMs) like GPT-4o and OpenAI's o1 can handle some of these tasks, they are often slow, resource-intensive, and expensive. To address this, researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a new framework called DisCIPL—short for Distributional Constraints by Inference Programming with Language Models—that enables small models to perform at a high level by working together under the guidance of a more powerful planner.

In this system, a large model acts as a "planner" that designs a strategy for solving a task and then delegates specific parts of the work to smaller, more efficient models, referred to as "followers." The planner doesn't just hand off a request; it uses a formal programming language called LLaMPPL, developed by MIT's Probabilistic Computing Project, to encode precise constraints. For example, a prompt like "write a poem with exactly eight lines, each containing eight words" can be translated into LLaMPPL instructions that guide the smaller models to generate compliant outputs.

The framework operates like a team project: the planner breaks down the task, assigns roles, and ensures consistency by reviewing and correcting outputs from the followers. If one model produces a line that doesn't fit the poetic structure, the planner can swap it with a better option from another model. This collaborative approach allows small models to achieve accuracy and coherence comparable to top-tier reasoning systems like o1, while using far less computational power. In experiments, DisCIPL outperformed both a baseline of small models working alone and GPT-4o on tasks such as writing constrained text, creating budgeted grocery lists, and drafting travel plans.
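The generate-check-resample loop described above can be sketched in miniature. The toy Python below is an illustration of the idea only, not the actual DisCIPL or LLaMPPL API: `toy_follower` is a hypothetical stand-in that samples random words instead of calling a real language model, and the "planner" is reduced to a constraint check that resamples lines until they fit the eight-words-per-line rule.

```python
import random

WORDS = ["river", "stone", "light", "wind", "echo", "dawn", "field", "sky"]

def toy_follower(n_words, rng):
    """Stand-in for a small follower model: proposes one candidate line.
    A real follower would sample tokens from an LM; here we sample words,
    sometimes producing the wrong count to simulate constraint violations."""
    k = rng.choice([n_words - 1, n_words, n_words + 1])
    return [rng.choice(WORDS) for _ in range(k)]

def line_constraint(line, n_words):
    """The rule the planner compiled: exactly n_words words per line."""
    return len(line) == n_words

def generate_poem(n_lines=8, n_words=8, n_candidates=5, seed=0):
    """Planner-style loop: for each line, gather candidates from the
    follower and keep one that satisfies the constraint, resampling
    until a valid candidate appears."""
    rng = random.Random(seed)
    poem = []
    for _ in range(n_lines):
        candidates = [toy_follower(n_words, rng) for _ in range(n_candidates)]
        valid = [c for c in candidates if line_constraint(c, n_words)]
        while not valid:  # no candidate fit: request a fresh proposal
            c = toy_follower(n_words, rng)
            if line_constraint(c, n_words):
                valid = [c]
        poem.append(" ".join(valid[0]))
    return poem

poem = generate_poem()
assert len(poem) == 8
assert all(len(line.split()) == 8 for line in poem)
```

The real system is considerably richer: rather than rejecting and resampling whole lines, LLaMPPL-style inference can steer generation token by token, and the planner can reassign work across followers. The sketch only shows the division of labor in which a cheap proposer generates and a compiled constraint decides what survives.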
It also matched o1 in accuracy on rule-based tasks, while delivering significant efficiency gains: the system reduced reasoning length by 40.1% and cut costs by 80.2% compared to o1, thanks in part to the use of small Llama-3.2-1B models as followers—up to 10,000 times cheaper per token than high-end reasoning models. The researchers also found that DisCIPL was more reliable than GPT-4o, which often failed to place required keywords in the correct positions. The follower-only baseline performed the worst, struggling to follow complex instructions.

The work, led by MIT PhD student Gabriel Grand and senior author Jacob Andreas, with contributions from Joshua Tenenbaum, Vikash Mansinghka, and Alex Lew, was presented at the Conference on Language Modeling and IVADO's workshop on autonomous agents. The team envisions future versions of DisCIPL that are more recursive—where the same model can act as both leader and follower—and that can handle more ambiguous, user-driven preferences, not just hard-coded rules.

The project is supported by the MIT Quest for Intelligence, the Siegel Family Foundation, the MIT-IBM Watson AI Lab, the Sloan Research Fellowship, Intel, and several U.S. government agencies, including DARPA, the Air Force Office of Scientific Research, and the National Science Foundation. With its potential to make advanced AI more efficient and accessible, DisCIPL represents a major step toward scalable, high-precision language model systems.
