HyperAI

OpenAI has introduced GPT-5.3-Codex, the most advanced agentic coding model to date, combining enhanced coding performance, improved reasoning, and expanded professional knowledge capabilities in a single, faster system. This model represents a significant leap forward, being 25% faster than its predecessor and capable of handling complex, long-running tasks that involve research, tool use, and end-to-end execution.

GPT-5.3-Codex is the first model developed with substantial assistance from earlier versions of itself. The Codex team used early iterations to debug training processes, manage deployment workflows, and analyze test results, demonstrating how the model accelerated its own development. This self-improvement capability marks a pivotal shift in AI evolution.

The model sets new benchmarks across multiple domains. It achieves state-of-the-art results on SWE-Bench Pro, a rigorous evaluation of real-world software engineering that spans four programming languages and is more contamination-resistant and industry-relevant than previous versions. It also outperforms prior models on Terminal-Bench 2.0, which measures terminal command execution, and shows strong performance on OSWorld and GDPval, benchmarks that assess agentic behavior, computer use, and professional knowledge work.

In practical applications, GPT-5.3-Codex built two full web games (version two of a racing game and a diving game) autonomously over millions of tokens, iterating on feedback like “fix the bug” or “improve the game.” The results are production-ready, showcasing the model’s ability to generate complex, functional applications from scratch.

The model also improves on user experience by better interpreting vague or simple prompts. For example, when asked to build a landing page, it automatically presents a discounted yearly plan as a monthly price, and creates a dynamic testimonial carousel instead of a single static quote, resulting in a more polished, professional-looking outcome.
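The pricing detail in the landing page example comes down to simple arithmetic: a yearly plan's price divided by twelve, shown next to the saving over month-to-month billing. The sketch below is only an illustration of that arithmetic, not output from the model; the prices, the function names `monthly_equivalent` and `savings_percent`, and the label wording are all hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def monthly_equivalent(yearly_price: Decimal) -> Decimal:
    """Per-month cost of a plan billed once a year, rounded to cents."""
    return (yearly_price / Decimal(12)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


def savings_percent(monthly_price: Decimal, yearly_price: Decimal) -> int:
    """Whole-percent saving of the yearly plan versus 12 monthly payments."""
    full_year = monthly_price * 12
    ratio = (full_year - yearly_price) / full_year * 100
    return int(ratio.to_integral_value(rounding=ROUND_HALF_UP))


# Hypothetical prices, chosen only for illustration.
MONTHLY = Decimal("10.00")  # month-to-month plan
YEARLY = Decimal("96.00")   # discounted plan billed once a year

label = (
    f"${monthly_equivalent(YEARLY)}/mo billed yearly "
    f"(save {savings_percent(MONTHLY, YEARLY)}%)"
)
print(label)
```

`Decimal` is used instead of floats so that the displayed price rounds predictably to cents.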
Beyond coding, GPT-5.3-Codex supports a wide range of tasks across the software lifecycle, including debugging, deployment, monitoring, writing PRDs, editing copy, user research, testing, and metrics analysis. It also excels in non-coding professional work such as creating presentations, spreadsheets, and reports, matching the performance of GPT-5.2 on GDPval, a benchmark measuring performance across 44 occupations.

The model demonstrates superior computer use capabilities in OSWorld, where it completes visual desktop tasks with greater accuracy and autonomy than previous models. This reflects its ability to interact with systems, navigate interfaces, and perform multi-step workflows.

GPT-5.3-Codex is designed to be highly interactive. Users can guide it in real time, ask questions, suggest changes, and receive regular progress updates, making it feel like a collaborative colleague. This interactivity, combined with faster inference, enhances productivity and reduces the need for back-and-forth clarification.

The model’s development has already transformed internal workflows at OpenAI. Researchers and engineers used it to monitor training runs, identify bugs, analyze behavior patterns, and optimize system performance. It helped detect context rendering issues, diagnose low cache hit rates, and dynamically scale GPU resources during traffic spikes.

In cybersecurity, GPT-5.3-Codex is the first model classified as High Capability under OpenAI’s Preparedness Framework. It is also the first that OpenAI has directly trained to identify software vulnerabilities. Although the model has not yet been proven able to automate full cyberattacks, OpenAI is deploying its most comprehensive safety measures to date, including safety training, automated monitoring, trusted access, and threat intelligence pipelines.

To support defenders, OpenAI is launching Trusted Access for Cyber, a pilot program for security researchers. It is also expanding the private beta of Aardvark, its security research agent, and partnering with open-source projects like Next.js to offer free codebase scanning. Additionally, OpenAI is committing $10 million in API credits through its Cybersecurity Grant Program to support good-faith research in open source and critical infrastructure.

GPT-5.3-Codex is available to users on paid ChatGPT plans via the Codex app, CLI, IDE extension, and web. API access is being prepared with safety in mind. The model runs on NVIDIA GB200 NVL72 systems, reflecting a deep partnership with NVIDIA.

With GPT-5.3-Codex, Codex evolves from a code generator into a full-fledged digital collaborator, capable of reasoning, building, and executing complex tasks across technical and professional domains. It marks a major step toward general-purpose AI agents that can work alongside humans to expand what’s possible.

GPT-5.3-Codex Launches as Most Advanced Agentic Coding Model, Boosting Productivity and Real-World Task Performance
