Study Finds AI Coding Tools May Slow Down Experienced Developers
AI coding tools such as Cursor and GitHub Copilot have been heralded for their potential to revolutionize software development by automating tasks like writing code, fixing bugs, and testing changes. These tools are built on advanced AI models from companies like OpenAI, Google DeepMind, Anthropic, and xAI, which have shown remarkable improvements on software engineering tasks in recent years. However, a new study from the non-profit AI research group METR, published on Thursday, challenges the notion that these tools uniformly boost the productivity of experienced developers.

METR conducted a randomized controlled trial in which 16 seasoned open-source developers completed 246 real tasks on large code repositories they regularly contribute to. Roughly half of the tasks allowed the use of state-of-the-art AI coding tools, primarily Cursor Pro, while the other half prohibited them. Before the study, the developers predicted that AI coding tools would cut their task completion times by 24%. The results ran counter to those expectations: developers were 19% slower when using AI tools.

Only 56% of the participants had prior experience with Cursor, though all were at least somewhat familiar with web-based large language models (LLMs). The researchers provided training on Cursor to ensure the developers were comfortable with the tool before the trial began.

METR's findings point to several possible reasons for the slowdown. One key factor is the time developers spent interacting with the AI, prompting the system and waiting for responses, rather than actively coding. Another is that AI often performs less effectively in large, complex codebases like the ones used in the study.

Despite these results, the study's authors caution against generalizing their findings. They acknowledge that other large-scale studies have shown productivity gains from AI coding tools. AI technology is also advancing rapidly, and the researchers suggest that the same study run just a few months later might yield different outcomes.

"We do not believe that AI systems currently fail to enhance many or most software developers' workflows," the authors noted. "However, our findings suggest that developers should not assume that AI coding tools will immediately accelerate their work."

The study also aligns with previous research indicating that AI coding tools can introduce errors and security vulnerabilities, adding to a growing body of evidence that developers should approach these tools with a critical eye and integrate them thoughtfully into their workflows.

In short, while AI coding tools hold promise, the study underscores the need for further research, and for caution, before assuming they will universally improve developer productivity, especially in complex environments.