OpenAI’s GPT-5.4 focuses on real work like spreadsheets, documents, and coding

Two days after rolling out GPT-5.3 Instant, OpenAI has announced GPT-5.4, a new artificial intelligence model for ChatGPT, the company’s developer API, and Codex. The model combines recent advances in reasoning, coding, and automated computer use, with the goal of helping users work on complex professional tasks such as analyzing spreadsheets, writing software, or researching information more efficiently.

GPT-5.4 replaces GPT-5.2 Thinking in ChatGPT for paid users and is also available to developers through the API. A higher-performance version called GPT-5.4 Pro is also being offered for workloads that require deeper analysis or more demanding processing.

The new model combines the coding strengths introduced with GPT-5.3-Codex with wider abilities across research, document creation, and workflow automation. OpenAI says the system is better at carrying out real work tasks with fewer prompts or corrections from users.

One big change in ChatGPT is the addition of a visible planning stage during longer responses. GPT-5.4 Thinking can outline the steps it intends to take before finishing the task. This allows users to adjust instructions while the model is working instead of having to restart the request.

This approach is meant to keep results closer to what users actually want and to cut down on repeated back-and-forth messages. OpenAI says the system also maintains context more reliably in longer conversations.

GPT-5.4 for developers

GPT-5.4 introduces built-in computer-use capabilities for developers. It can interact with software environments, websites, and desktop applications through tools that let it issue keyboard and mouse commands or run automated scripts.

These capabilities let developers build agents capable of carrying out multi-step workflows across different programs. As an example, an agent could read incoming emails, extract attachments, upload files, analyze data, and update a spreadsheet.
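To make the idea of an agent issuing keyboard, mouse, and script commands more concrete, here is a minimal sketch of the dispatch loop a developer might write around such a model. It is not OpenAI's API: the `Action` and `ActionExecutor` names, the action kinds ("click", "type", "run_script"), and the file name `update_sheet.py` are all hypothetical, and the handlers only describe what they would do instead of touching a real desktop.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    """One step requested by the model (hypothetical shape, not a real API)."""
    kind: str  # assumed action name such as "click" or "type"
    args: Dict[str, object] = field(default_factory=dict)


class ActionExecutor:
    """Maps action kinds to handlers and keeps a log of what ran."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[..., str]] = {}
        self.log: List[str] = []

    def register(self, kind: str, handler: Callable[..., str]) -> None:
        self._handlers[kind] = handler

    def run(self, actions: List[Action]) -> List[str]:
        for action in actions:
            handler = self._handlers.get(action.kind)
            if handler is None:
                # Refuse unknown actions instead of guessing what they mean.
                raise ValueError(f"unsupported action: {action.kind}")
            self.log.append(handler(**action.args))
        return self.log


def build_demo_executor() -> ActionExecutor:
    """Executor with stub handlers that only describe the step."""
    executor = ActionExecutor()
    executor.register("click", lambda x, y: f"click at ({x}, {y})")
    executor.register("type", lambda text: f"type {text!r}")
    executor.register("run_script", lambda path: f"run script {path}")
    return executor


if __name__ == "__main__":
    # A made-up plan standing in for the steps a model might request.
    plan = [
        Action("click", {"x": 120, "y": 48}),
        Action("type", {"text": "Q3 totals"}),
        Action("run_script", {"path": "update_sheet.py"}),
    ]
    for line in build_demo_executor().run(plan):
        print(line)
```

Rejecting unregistered action kinds is a deliberate choice in this sketch: an agent that drives real software should fail loudly on a step it does not recognize.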

The model also supports very large context windows, with up to 1M tokens in certain developer environments. That allows it to track longer conversations and larger datasets while completing complex tasks.

OpenAI says the system is also more efficient when solving problems. GPT-5.4 reportedly uses fewer tokens to reach answers compared with GPT-5.2, which can reduce costs and response times for developers using the API.
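The link between token usage and cost is simple arithmetic, sketched below. The per-million-token prices and the token counts are made-up placeholders chosen for illustration, not OpenAI's published pricing or measured usage for either model.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost of one request when tokens are billed per million."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


def relative_saving(old_cost: float, new_cost: float) -> float:
    """Fraction of the old cost saved by the new request."""
    if old_cost <= 0:
        raise ValueError("old_cost must be positive")
    return 1 - new_cost / old_cost


if __name__ == "__main__":
    # Hypothetical prices (per million tokens) and token counts.
    in_price, out_price = 2.0, 8.0
    older = request_cost(1_000, 4_000, in_price, out_price)
    newer = request_cost(1_000, 3_000, in_price, out_price)
    print(f"older: {older:.4f}, newer: {newer:.4f}")
    print(f"saving: {relative_saving(older, newer):.1%}")
```

Under these assumed numbers, a model that reaches the same answer with a quarter fewer output tokens costs proportionally less per request, and the shorter output also tends to arrive sooner.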

OpenAI’s benchmark tests show improvements across several categories including coding, document analysis, computer navigation, and tool usage. On the GDPval benchmark, which evaluates professional knowledge tasks across 44 occupations, GPT-5.4 matched or exceeded human professionals in 83 percent of comparisons.

The company also tested spreadsheet modeling tasks similar to those performed by junior investment banking analysts. GPT-5.4 scored 87.3 percent, compared with 68.4 percent for GPT-5.2.

Accuracy improvements were also reported in factual reliability tests. OpenAI says GPT-5.4 produces individual factual claims that are 33 percent less likely to be incorrect than GPT-5.2 and reduces overall response errors by 18 percent.
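Figures like "33 percent less likely" are relative reductions, so what they mean in absolute terms depends on the starting error rate, which the article does not give. The short sketch below shows the conversion using a purely hypothetical baseline of 10 percent; it is not a number reported by OpenAI.

```python
def reduced_error_rate(baseline_rate: float, relative_reduction: float) -> float:
    """Apply a relative reduction (0.33 means 33 percent) to an error rate."""
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    if not 0.0 <= relative_reduction <= 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")
    return baseline_rate * (1.0 - relative_reduction)


if __name__ == "__main__":
    hypothetical_baseline = 0.10  # assumed for illustration only
    claim_level = reduced_error_rate(hypothetical_baseline, 0.33)
    response_level = reduced_error_rate(hypothetical_baseline, 0.18)
    print(f"claims: {hypothetical_baseline:.1%} -> {claim_level:.1%}")
    print(f"responses: {hypothetical_baseline:.1%} -> {response_level:.1%}")
```

The same relative improvement translates into a smaller absolute change when the baseline error rate is already low.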

Tool search

The model introduces a system called tool search that helps agents work more efficiently with large libraries of external tools. Instead of loading every tool definition into the prompt, the system retrieves the definition only when needed by the model.
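The general pattern of fetching a tool definition only when it is needed can be sketched as a small registry. This illustrates the idea, not OpenAI's implementation: the `ToolRegistry` class, the word-overlap ranking, and the example tools are all assumptions made for the sketch.

```python
from typing import Dict, List


class ToolRegistry:
    """Keeps short summaries searchable and full definitions out of the prompt."""

    def __init__(self) -> None:
        self._summaries: Dict[str, str] = {}
        self._definitions: Dict[str, dict] = {}
        self.loaded: List[str] = []  # names whose full definition was fetched

    def add(self, name: str, summary: str, definition: dict) -> None:
        self._summaries[name] = summary
        self._definitions[name] = definition

    def search(self, query: str, limit: int = 3) -> List[str]:
        """Rank tools by how many query words appear in their summary."""
        words = set(query.lower().split())
        scored = []
        for name, summary in self._summaries.items():
            overlap = len(words & set(summary.lower().split()))
            if overlap:
                scored.append((overlap, name))
        # Highest overlap first; ties broken alphabetically by name.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [name for _, name in scored[:limit]]

    def load(self, name: str) -> dict:
        """Return the full definition, recording that it was needed."""
        definition = self._definitions[name]
        self.loaded.append(name)
        return definition


def build_demo_registry() -> ToolRegistry:
    """Registry with three made-up tools."""
    registry = ToolRegistry()
    registry.add("send_email", "send an email message", {"params": ["to", "body"]})
    registry.add("read_sheet", "read rows from a spreadsheet", {"params": ["sheet_id"]})
    registry.add("update_sheet", "update cells in a spreadsheet", {"params": ["sheet_id", "cells"]})
    return registry


if __name__ == "__main__":
    registry = build_demo_registry()
    matches = registry.search("update the spreadsheet", limit=1)
    print(matches, registry.load(matches[0]))
```

Only the definition that `load` returns would need to enter the prompt, so the prompt size depends on the tools a task actually uses instead of on the size of the whole library.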

OpenAI says GPT-5.4 includes additional security safeguards and monitoring systems to reduce misuse, particularly in cybersecurity-related contexts.

GPT-5.4 is now rolling out across ChatGPT, the API, and Codex. Paid ChatGPT users can start using GPT-5.4 Thinking, while developers can access the model through the API under the name gpt-5.4.

