Introduction
In a rapid-fire release of new models, OpenAI has introduced its latest programming model, GPT-5.3-Codex, just 15 minutes after Claude Opus 4.6 was launched.

The new model exhibits a notable aesthetic improvement, as demonstrated through two stylish demos: a racing game and a diving game.

GPT-5.3-Codex has reportedly iterated on these games with minimal human intervention, consuming millions of tokens in the process.
In web development, the model not only boasts a more appealing UI but also demonstrates a stronger understanding of user intent. Even when prompts are unclear, it can automatically complete logic to generate a fully functional website.

The model’s computer use capabilities have also improved: it can now build presentations directly for finance professionals.

It also handles a range of workplace tasks, particularly in knowledge-intensive roles, such as writing documents and creating spreadsheets.

Key Features
The official highlights of GPT-5.3-Codex include:
- Smarter: Scores 57% on SWE-Bench Pro, 76% on Terminal-Bench 2.0, and 64% on OSWorld.
- More controllable: Supports real-time guidance during tasks, allowing for adjustments and updates.
- Faster: Uses less than half the tokens of GPT-5.2-Codex for the same tasks, and runs more than 25% faster per token.
- More capable: Excels not only in coding but also in computer operations.
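The two "Faster" figures compound. As a rough sketch, assume that task time is simply the number of tokens divided by per-token throughput (a simplification not stated in the announcement, which ignores tool calls and network latency). Under that assumption, the function below, whose name and parameters are hypothetical, gives the new task time as a fraction of the old one.

```python
def relative_task_time(token_ratio: float, per_token_speedup: float) -> float:
    """Return new task time as a fraction of the old task time.

    Assumes time = tokens / throughput (an illustrative simplification).
    token_ratio: new token count divided by old token count (0.5 = half).
    per_token_speedup: new throughput divided by old throughput (1.25 = 25% faster).
    """
    if token_ratio <= 0 or per_token_speedup <= 0:
        raise ValueError("both ratios must be positive")
    return token_ratio / per_token_speedup


# Boundary case of the claims: exactly half the tokens, exactly 25% faster.
upper_bound = relative_task_time(0.5, 1.25)
print(f"Task time is at most {upper_bound:.0%} of the previous model's")
```

Because the claims are "less than half" and "over 25%", this boundary case is an upper bound: under the stated assumption, the same task would finish in under 40% of the previous time.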
The following comparison table illustrates the significant improvements across nearly every dimension compared to the previous generation.

The online community has reacted strongly, with users divided into pro-Anthropic and pro-OpenAI camps following these announcements.


Programming Capabilities
Programming remains the most anticipated capability. OpenAI claims that GPT-5.3-Codex achieves state-of-the-art results on SWE-Bench Pro, a benchmark for real-world software engineering that covers four programming languages, with higher overall difficulty and more varied tasks.

It also shows significant improvements on Terminal-Bench 2.0.

Crucially, GPT-5.3-Codex accomplishes these results with fewer tokens than any previous model.
Computer Use
Another focus of the new Codex is its computer use capabilities. OSWorld is a benchmark for agents in a visual desktop environment, requiring models to complete various productivity tasks. The results indicate that GPT-5.3-Codex significantly outperforms earlier GPT models in this area.

In summary, GPT-5.3-Codex is not just a breakthrough in one specific capability but a broad advance in agentic capabilities, spanning coding, front-end development, and computer operations.
Interestingly, GPT-5.3-Codex participated in its own training process, marking it as OpenAI’s first model involved in “self-acceleration.” The Codex team utilized its early versions to debug training processes, manage deployments, and evaluate test results.
During the training phase, the research team employed Codex to monitor and debug training tasks, tracking model behavior changes throughout the process and suggesting improvements.
In data analysis, a data scientist collaborated with GPT-5.3-Codex to build a new data pipeline, visualizing results in ways that far exceed traditional dashboard tools. The model extracted key insights from thousands of data points in under three minutes.
The engineering team also leveraged Codex to optimize and adapt the testing and operational framework for GPT-5.3-Codex. When anomalies affecting user experience arose, team members used Codex to identify context rendering defects and traced them back to low cache hit rates.
Additional Developments
In addition to the exciting showdown with Anthropic, OpenAI has two significant initiatives worth noting:
- Frontier: A platform designed to help businesses integrate “AI colleagues” into their workflows.

This initiative aims to facilitate the genuine incorporation of agents into company operations, featuring shared context, onboarding guides, feedback-driven learning, and clear permissions.
Notable companies such as HP, Intuit, Oracle, State Farm, Thermo Fisher, and Uber have already adopted Frontier.
- AI4S (AI for Science): A collaboration between OpenAI and Ginkgo that uses GPT-5 to reduce protein synthesis costs by 40%.

Ginkgo, a synthetic biology lab, has integrated GPT-5 into a self-operating lab, allowing the model to propose experimental designs, execute experiments at scale, learn from results, and determine subsequent steps, effectively completing a closed loop.
2026 could be a pivotal year for the evolution of AI4S.
While OpenAI competes with Anthropic, the online community continues to react to these developments, and some users are expressing nostalgia for previous models.

To date, OpenAI has not responded to questions about the discontinuation of the GPT-4o model, perhaps because of its focus on the ongoing competition with Anthropic.
