HyperAI

MIT Researchers Develop AI 'Coach' to Enhance Language Models' Problem-Solving Skills by Guiding Text and Code Generation

5 days ago

MIT researchers have developed a new AI assistant called CodeSteer, which helps large language models (LLMs) decide when to use text and when to use code to solve problems. LLMs, despite their proficiency in textual reasoning, often falter on computational tasks such as basic math problems. CodeSteer, a smaller, lightweight LLM, acts as a coach, guiding the larger model to switch between text and code generation until it finds the correct solution.

The process begins with CodeSteer reviewing the initial query and determining whether text or code is the appropriate method to address it. It then generates a prompt for the larger LLM, directing it to use either a coding approach or textual reasoning. The larger model responds to the prompt, and CodeSteer evaluates the output. If the answer is incorrect, CodeSteer continues to issue iterative prompts, suggesting adjustments such as adding a search algorithm or optimizing the code, until the solution is deemed correct. A symbolic checker and a self-answer checker are also integrated to ensure the code is both accurate and efficient.

The researchers created a dataset called SymBench, consisting of 37 complex symbolic tasks spanning spatial reasoning, mathematics, order reasoning, and optimization, and used it to fine-tune CodeSteer for maximum performance. In their experiments, CodeSteer outperformed nine baseline methods, increasing average accuracy from 53.3% to 86.4%. Remarkably, a general-purpose model paired with CodeSteer achieved higher accuracy on complex tasks than state-of-the-art models specifically designed for such tasks, while using less computational power.

CodeSteer's impact extends beyond simple computational tasks. It enhances LLMs' problem-solving capabilities in more complex scenarios, such as generating robot movement paths in uncertain environments or optimizing logistics in international supply chains.
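The steer-and-check loop described above can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: `solver_model` and `answer_checker` are hypothetical stubs standing in for the large LLM and for CodeSteer's symbolic and self-answer checkers.

```python
# Minimal sketch of a CodeSteer-style steer-and-check loop.
# All model calls and checks are hypothetical stubs, not the authors' code.

def solver_model(prompt: str, task: str) -> str:
    """Stand-in for the large general-purpose LLM."""
    # The stub only "succeeds" when steered toward code on an arithmetic task,
    # mimicking how LLMs often botch simple math via pure textual reasoning.
    if "use code" in prompt and task == "multiply 9.11 by 9.9":
        return str(9.11 * 9.9)  # pretend the model wrote and ran code
    return "roughly ninety"     # plausible but unverifiable textual answer

def answer_checker(answer: str) -> bool:
    """Stand-in for the symbolic and self-answer checkers."""
    try:
        float(answer)           # accept only a concrete numeric result
        return True
    except ValueError:
        return False

def steer(task: str, max_rounds: int = 4):
    """Alternate text/code prompts until a check passes, as CodeSteer does."""
    for round_num in range(max_rounds):
        mode = "use text" if round_num % 2 == 0 else "use code"
        answer = solver_model(f"{mode}: {task}", task)
        if answer_checker(answer):
            return answer       # checker accepted this attempt
    return None                 # no attempt passed within the budget

print(steer("multiply 9.11 by 9.9"))  # steered to code on the second round
```

In the real system, the steerer is itself a fine-tuned LLM rather than a fixed alternation rule, and its checkers execute generated code and re-verify answers instead of applying a simple string test, but the iterate-until-verified control flow is the same.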
The researchers aim to further refine CodeSteer, making its iterative prompting process faster and exploring the fine-tuning of a unified model that can seamlessly switch between text and code.

Industry experts have praised the innovation. Jinsung Yoon, a staff research scientist at Google Cloud AI, notes the elegance and impact of CodeSteer's solution to the challenge of tool utilization in LLMs. Chi Wang, a senior staff scientist at Google DeepMind, highlights the strategic guidance provided by the smaller, specialized model, which paves the way for more robust and versatile AI applications in real-world scenarios.

MIT's interdisciplinary approach, combining expertise in aeronautics with the Laboratory for Information and Decision Systems (LIDS), has produced a method that leverages existing LLM strengths while addressing their weaknesses. This could revolutionize how LLMs are used, making them more reliable and efficient for a wider range of tasks. The research will be presented at the International Conference on Machine Learning and is available on the arXiv preprint server.

CodeSteer represents a significant step forward in AI development, showcasing the potential of using smaller, specialized models to augment and guide larger, more general ones. This approach not only improves accuracy but also reduces the computational demands on state-of-the-art LLMs, making them more practical for everyday use. The researchers' ongoing efforts to optimize and expand CodeSteer's capabilities promise further advances in AI and machine learning.
