OpenAI's ChatGPT Agent Seeks Full Computer Control to Assist Users
OpenAI has launched ChatGPT Agent, a new tool designed to go beyond simple chatbot interactions and carry out complex, multi-step tasks on a user's behalf. Announced on Thursday, ChatGPT Agent operates within its own virtual environment, acting independently to navigate websites, open apps, and manage tasks from start to finish. The company showcases several potential use cases, such as reviewing a user's calendar for upcoming meetings and briefing them on relevant news, planning a meal and purchasing the ingredients, and analyzing competitors to create detailed slide decks.
ChatGPT Agent runs on a model developed specifically for this purpose. It combines the capabilities of OpenAI's Operator, which can navigate web browsers, and Deep Research, which handles multi-step research and analysis. The model was trained using reinforcement learning, a process in which it learns and improves through trial and error, and was developed by a unified team of between 20 and 35 people drawn from the Operator and Deep Research projects.
During a demo for The Verge, ChatGPT Agent planned a date night by checking Google Calendar for free evenings and cross-referencing OpenTable for available restaurant reservations. The demo also showed that users can add or modify parameters mid-task, such as including another type of restaurant. In another example, the tool generated a research report comparing the rise of Labubus with that of Beanie Babies.
Early impressions indicate that ChatGPT Agent can be slow. Tasks can take around 15 to 30 minutes to complete, which can be longer than a person would need to do the same work. However, OpenAI's product lead, Yash Kumar, and research lead, Isa Fulford, emphasized that users can start a task and return to it later, so the tool can still save them time.
Fulford noted that the tool can automate tedious recurring tasks, such as renewing office parking every Thursday, a use she said she has found helpful. As a safeguard, ChatGPT Agent asks for permission before taking irreversible actions, such as sending emails or making reservations. Financial transactions are currently restricted, and a feature called Watch Mode requires users to stay on the tab where the agent is operating when it works on sensitive sites such as financial platforms.
ChatGPT Agent will initially be available to subscribers of the ChatGPT Pro, Plus, and Team plans, who can access it by selecting "agent mode" or typing "/agent." Later this summer, the tool will roll out to ChatGPT Enterprise and Education users. There is no timeline yet for availability in the European Economic Area and Switzerland.
The launch fits a growing trend in the AI industry. Tech companies such as Google, Meta, and Amazon have been vocal about their ambitions to build AI agents that can perform tasks autonomously. Anthropic, for example, launched a similar tool called "Computer Use" in October 2024, designed to interact with computers as a human would. In February 2024, fintech company Klarna announced that its AI agent had handled two-thirds of its customer service chats in a single month, equivalent to the work of 700 full-time human employees. Results like this have spurred other major tech firms to focus on AI agents.
However, the effectiveness and reliability of these agents are still being closely evaluated. Klarna ultimately brought back human operators because of the subpar quality of the AI's work. Industry insiders note that while AI agents hold significant promise, they currently face limitations, particularly with sensitive or nuanced tasks. The hype around AI agents is also evident in hiring, with companies like Google bringing in key talent to advance their agent projects.
OpenAI's focus on safeguards and an incremental rollout underscores its cautious approach to integrating advanced AI capabilities into everyday tasks. As the technology evolves, the potential for AI agents to change how people manage their digital lives remains a compelling prospect, though challenges in speed and accuracy must be addressed.
OpenAI, founded in 2015 by Sam Altman, Elon Musk, and others, is a leading AI research laboratory known for its work in natural language processing. The company describes ChatGPT Agent as another step in its mission to create safe and useful artificial intelligence.
Industry experts are optimistic about the future of AI agents but caution that the technology is still in its early stages. Delegating complex tasks to AI could significantly improve productivity and convenience, but more development is needed before these tools are reliable and fast enough for widespread adoption.