OpenAI releases Codex app for Windows and retires GPT-5.1 model family
OpenAI expands Codex to Windows with doubled rate limits while discontinuing the entire GPT-5.1 model family, signaling an aggressive push toward its next generation.
TL;DR: OpenAI has launched the Codex app on Windows as of March 4, 2026, doubling rate limits across all paid plans and giving free-tier users access for the first time. Simultaneously, on March 11, 2026, OpenAI discontinued the entire GPT-5.1 model family — including GPT-5.1 Instant, Thinking, and Pro variants — pushing all users to newer alternatives. The two moves together signal a company accelerating its developer tooling roadmap and aggressively clearing the old model catalog.
On March 4, 2026, OpenAI shipped the Codex app to Windows users, completing the cross-platform rollout that began with macOS. The Codex app is a dedicated desktop interface for AI-assisted coding, built differently from ChatGPT's web interface. Where ChatGPT is a general-purpose chat tool, Codex is purpose-built for developers who want an agent that can take on multi-step programming tasks autonomously.
The app is available through the OpenAI website as a direct download for Windows 10 and Windows 11. Installation is straightforward — no terminal commands, no configuration files. For enterprise IT teams that have been waiting to standardize on a single AI coding tool, Windows availability removes one of the last friction points.
The core feature set on Windows matches what macOS users have had access to: a file-aware code editor context, the ability to invoke Codex to write functions, debug errors, generate tests, and refactor code across multiple files simultaneously. The app integrates with local code repositories, reading directory structure and file contents to give the model grounding before generating output.
One notable detail from the release notes: the Windows release coincides with improvements to how Codex uses its context window, meaning larger codebases are handled more reliably than they were at the macOS launch. Early macOS users had reported context truncation issues when working with larger repositories. The Windows launch appears to have shipped after OpenAI addressed those complaints.
Codex on Windows also supports the full range of OpenAI model options available to a user's subscription tier. Pro subscribers access the most capable reasoning variants. Plus users get the mid-tier models. Free users can now use Codex for the first time, with standard (not doubled) rate limits.
Alongside the Windows release, OpenAI doubled rate limits for Codex across all paid plans. This is a significant change for heavy users who have been hitting hourly and daily ceilings on complex tasks.
| Plan | Previous Codex rate limit | New rate limit |
|---|---|---|
| Free | No access | Standard (first-time access) |
| Plus | Baseline | 2x baseline |
| Pro | Baseline | 2x baseline |
| Business | Baseline | 2x baseline |
| Enterprise | Baseline | 2x baseline |
| Edu | Baseline | 2x baseline |
The rate limit increase matters more than it might appear on paper. Codex tasks are not single API calls. A single "write me a REST endpoint with tests" instruction can trigger 10-30 sequential model calls as Codex iterates on code, runs internal validation, and generates documentation. Under the old rate limits, developers running Codex intensively during a coding session would exhaust their daily allocation well before end of business.
Doubling limits effectively doubles the practical utility of the tool for professional developers. For enterprise teams running Codex across dozens of engineers simultaneously, the Business and Enterprise tier increases are operationally meaningful. Previously, large team deployments required careful scheduling to avoid rate limit collisions. With 2x headroom, those constraints ease considerably.
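Even with doubled limits, heavy users will still occasionally hit a ceiling, and client code typically absorbs that with retry logic. Below is a minimal, generic sketch of exponential backoff with jitter; `RateLimitError` and `flaky_call` are hypothetical stand-ins, not part of any OpenAI SDK:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the 429-style error a client raises when limits are hit."""

def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Retry fn() on rate-limit errors, doubling the delay each attempt."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Sleep 1s, 2s, 4s, ... plus jitter so clients don't retry in lockstep.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))

# Example: a call that fails twice, then succeeds on the third attempt.
attempts = {"n": 0}

def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError()
    return "ok"

print(call_with_backoff(flaky_call, base_delay=0.01))  # -> ok
```

Because Codex fans one instruction out into many model calls, a single rate-limit error mid-task is common, which is why this kind of retry loop matters more here than for one-shot chat requests.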
Free-tier access is a strategic move. Giving students and developers without paid subscriptions their first taste of Codex creates a funnel. Developers who build workflows around Codex on free tier will upgrade when they hit the usage ceiling, which they will. OpenAI's historical pattern with ChatGPT free tier suggests this is the intended conversion path.
The most technically interesting aspect of Codex is its architecture for running multiple agents in parallel. This is not a marketing description of running multiple chat windows. It is a specific capability built into how Codex manages task execution.
When you give Codex a complex task — say, "refactor the authentication module, write integration tests, update the documentation, and open a pull request" — Codex does not execute these steps sequentially in a single session. Instead, it spawns sub-agents for each major workstream, runs them concurrently, and coordinates their outputs before presenting a unified result.
This design is relevant for long-running tasks where sequential execution would be too slow. A refactor that touches 50 files can be split across multiple agents working on different sections simultaneously, with a coordination layer merging outputs and resolving conflicts. OpenAI describes this as Codex being able to "manage multiple agents in parallel" and handle "long-running tasks" that exceed what a single model call can accomplish.
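OpenAI has not published Codex's internal orchestration code, but the fan-out-and-merge pattern it describes can be sketched with ordinary concurrency primitives. The workstream functions below are hypothetical stand-ins for model-driven sub-agents:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sub-agent workstreams for one high-level task. In the real
# product each would be an autonomous, model-driven agent; here they are
# plain functions standing in for that behavior.
def refactor_auth():
    return {"workstream": "refactor", "files_changed": 12}

def write_tests():
    return {"workstream": "tests", "files_changed": 4}

def update_docs():
    return {"workstream": "docs", "files_changed": 2}

def run_parallel_agents(workstreams):
    """Run each workstream concurrently, then merge the outputs into one
    unified result, mimicking the coordination layer described above."""
    with ThreadPoolExecutor(max_workers=len(workstreams)) as pool:
        results = list(pool.map(lambda fn: fn(), workstreams))
    return {
        "workstreams": [r["workstream"] for r in results],
        "total_files_changed": sum(r["files_changed"] for r in results),
    }

summary = run_parallel_agents([refactor_auth, write_tests, update_docs])
print(summary["total_files_changed"])  # -> 18
```

The hard part that this sketch omits is the conflict resolution in the merge step, which is exactly where early users report rough edges.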
In practice, this is still maturing. Tasks with complex interdependencies require careful orchestration, and early users have noted that the parallel agents occasionally make conflicting edits that require human review to reconcile. OpenAI's release documentation positions this as expected behavior at current capability levels, with the expectation that coordination quality will improve with future model updates.
The multi-agent architecture positions Codex as infrastructure, not just a productivity tool. The long-term vision is a coding agent that runs autonomously in the background while a developer focuses on higher-level design decisions. That vision is not fully realized yet, but the architecture is clearly being built to support it.
On March 11, 2026, exactly one week after the Codex Windows launch, OpenAI retired the entire GPT-5.1 model family. This includes GPT-5.1 Instant, GPT-5.1 Thinking, and GPT-5.1 Pro.
The GPT-5.1 generation was a transition family — models released between the GPT-5 launch and the upcoming GPT-6 (or equivalent) architecture. They offered incremental improvements over base GPT-5 in specific areas: faster inference for Instant, stronger multi-step reasoning for Thinking, and extended context performance for Pro. None represented a fundamental architectural leap.
According to the ChatGPT release notes, the deprecation was positioned as part of routine model lifecycle management. OpenAI's stated rationale is that maintaining multiple model variants creates infrastructure overhead and user confusion. Consolidating around fewer, more capable models reduces that friction.
The timing against the Codex launch is not coincidental. OpenAI appears to be standardizing the model stack that Codex runs on, and the GPT-5.1 variants were not the target architecture. Deprecating them clears the catalog and pushes developers toward models that align with Codex's internal requirements.
There is also a competitive dimension. GPT-5.1 Instant competed in the same latency tier as Anthropic's Claude Haiku and Google's Gemini Flash. OpenAI's positioning in that tier has since shifted to a different offering, which suggests GPT-5.1 Instant was retired as redundant rather than for quality reasons.
If your application or workflow is using GPT-5.1 Instant, GPT-5.1 Thinking, or GPT-5.1 Pro via the API, those endpoints stopped responding on March 11, 2026. You need to migrate.
OpenAI's recommended migration paths, based on use case:
| GPT-5.1 variant | Recommended replacement | Key difference |
|---|---|---|
| GPT-5.1 Instant | GPT-4o mini or latest fast variant | Similar latency profile, updated training data |
| GPT-5.1 Thinking | o3 or o3-mini | Stronger reasoning benchmark performance |
| GPT-5.1 Pro | GPT-4o or latest GPT-5 | Better extended context handling |
Migration is primarily a model name swap in your API calls. The input/output format for the Chat Completions API has not changed. System prompts, function calling schemas, and response parsing logic should transfer without modification.
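A minimal sketch of that swap, using the recommended replacements from the table above as a lookup. The model identifiers are illustrative, not verified API strings, so confirm them against OpenAI's current model list before deploying:

```python
# Deprecated-to-replacement mapping based on the migration table above.
# These identifiers are assumptions for illustration; check the live
# model list for the exact names your account can use.
MODEL_MIGRATIONS = {
    "gpt-5.1-instant": "gpt-4o-mini",
    "gpt-5.1-thinking": "o3-mini",
    "gpt-5.1-pro": "gpt-4o",
}

def resolve_model(requested: str) -> str:
    """Swap a deprecated model name for its recommended replacement,
    passing through any name that is not deprecated."""
    return MODEL_MIGRATIONS.get(requested, requested)

print(resolve_model("gpt-5.1-thinking"))  # -> o3-mini
print(resolve_model("gpt-4o"))            # -> gpt-4o
```

Routing every API call through a resolver like this means the next deprecation is a one-line table edit rather than a codebase-wide search and replace.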
The more significant work is prompt validation. Different model generations respond differently to identical prompts, particularly for structured output tasks. Any production workflow using GPT-5.1 should run regression tests on the replacement model before switching over in production. Edge cases — especially long inputs, multi-turn conversations, and tool-calling chains — are where behavioral differences between model generations are most likely to surface.
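For structured-output workflows, a regression check can be as small as the sketch below. `fake_model_call` is a stub standing in for a real Chat Completions request; the required-keys check is the part worth keeping:

```python
import json

# Stubbed model call. In a real regression run this would hit the API
# with the candidate replacement model and the production prompt.
def fake_model_call(model: str, prompt: str) -> str:
    return json.dumps({"status": "ok", "items": [1, 2, 3]})

def passes_regression(model: str, prompt: str, required_keys: set) -> bool:
    """Check that the replacement model still emits parseable JSON
    containing the keys downstream code depends on."""
    try:
        payload = json.loads(fake_model_call(model, prompt))
    except json.JSONDecodeError:
        return False
    return required_keys.issubset(payload)

print(passes_regression("o3-mini", "List items as JSON.", {"status", "items"}))  # -> True
```

Running checks like this over a corpus of real production prompts, including the long-input and multi-turn edge cases, is what surfaces generational behavior differences before customers do.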
Users on ChatGPT's consumer product who selected GPT-5.1 variants as their preferred model will find those options removed from the model selector. OpenAI has automatically migrated these users to the current default model.
With the Windows launch, Codex enters more direct competition with GitHub Copilot and Cursor, which are the two most widely adopted AI coding tools among Windows-first developer populations.
| Feature | OpenAI Codex | GitHub Copilot | Cursor |
|---|---|---|---|
| Multi-file editing | Yes (agent) | Limited (Copilot Workspace) | Yes |
| Multi-agent parallel tasks | Yes | No | No |
| Long-running background tasks | Yes | No | No |
| IDE integration | Standalone app | VS Code, JetBrains, Neovim | VS Code fork |
| Repository-aware context | Yes | Yes (with repo indexing) | Yes |
| Model selection | OpenAI models only | GPT-4o + Claude (via GitHub) | GPT-4o, Claude, Gemini |
| Free tier | Yes (standard limits) | Limited (60 requests/month) | Yes |
| Enterprise SSO/audit logs | Yes | Yes | Yes |
The table reveals Codex's genuine differentiation: parallel multi-agent task execution and long-running background tasks. No other consumer AI coding tool currently ships this. GitHub Copilot Workspace offers a multi-step task runner, but it is sequential and does not spawn parallel sub-agents. Cursor is powerful for in-editor code generation but does not have an autonomous background task architecture.
Codex's weakness is equally clear: it is a standalone app, not an IDE integration. Developers spend most of their time inside VS Code, JetBrains, or similar editors. Leaving that environment to use Codex in a separate window creates workflow friction. Until OpenAI ships a VS Code extension or IDE plugin that matches the standalone app's capabilities, Codex will remain a secondary tool for most developers rather than a primary one.
For enterprise engineering teams evaluating AI coding tools, the Codex Windows release changes the calculus in a few specific ways.
First, cross-platform standardization is now possible. Teams with mixed Mac/Windows developer populations previously could not standardize on Codex. That barrier is gone. IT and security teams can now evaluate Codex for company-wide deployment without platform exclusions.
Second, the doubled rate limits for Business and Enterprise plans address the most common objection from team leads who trialed Codex and found it impractical at scale. The previous rate ceilings meant that a 20-person engineering team using Codex simultaneously could exhaust shared limits within hours. The new limits make continuous use during a full workday feasible.
Third, the multi-agent capability maps directly to enterprise use cases that are not well-served by current tools. Sprint-level tasks — implementing a feature from requirements to code to tests to documentation — are precisely the kind of long-horizon, multi-step work that Codex's architecture targets. Enterprise developers who currently break these tasks into separate AI-assisted steps could potentially hand them to Codex as unified jobs.
The security audit trail is still an open question. Enterprise security teams want visibility into what the AI agent is reading, what code it is generating, and what network calls it is making. OpenAI has not published detailed enterprise security documentation for Codex comparable to what GitHub Copilot Enterprise provides. That gap will need to close before risk-averse organizations in financial services, healthcare, or government can approve deployment.
The simultaneous Codex expansion and GPT-5.1 deprecation, both shipping within a week of each other, reflects an OpenAI that is moving faster through its model catalog than at any prior point.
In 2023, models remained available for 12-18 months after successors launched. In 2024, that window compressed to 6-9 months. The GPT-5.1 family, which launched in mid-2025, was deprecated in March 2026, a lifecycle of roughly nine months.
This acceleration is partly competitive pressure. Anthropic, Google DeepMind, and Meta are all releasing new models on faster cadences than before. OpenAI cannot maintain pricing power or developer mindshare by keeping an aging model catalog active alongside newer options. Cleaning up the catalog forces developers to use current models, which perform better on benchmarks and are cheaper for OpenAI to serve at scale.
The other driver is internal coherence. As Codex becomes a more important product, OpenAI needs its agent infrastructure to run on a consistent, maintained model stack. The GPT-5.1 variants were not part of that stack. Deprecating them eliminates maintenance overhead and lets the infrastructure team focus on the models that matter to the roadmap.
Read alongside the doubled rate limits, the picture is an OpenAI that is investing heavily in making Codex a tool developers actually rely on rather than just experiment with. That requires a clean model catalog, capable infrastructure, and limits generous enough to support real workflows. All three moved in the same week.
Developer reactions to the dual announcement split roughly along usage lines.
Heavy Codex users on macOS welcomed the Windows expansion and rate limit increases without reservation. On developer forums and the OpenAI developer Discord, the dominant response was positive: rate limits had been a genuine friction point, and the doubled limits drew a chorus of "finally" from frequent users.
The GPT-5.1 deprecation generated more friction, specifically from API developers who had tuned production workflows to GPT-5.1 Instant. The Instant variant had a latency profile that some developers described as difficult to replicate exactly with current alternatives. A thread on Hacker News included multiple engineers reporting that their latency-sensitive applications (voice interfaces, real-time autocomplete) required several rounds of prompt tuning to achieve equivalent performance on replacement models.
The specific concern with GPT-5.1 Thinking was its reasoning behavior on mathematical and logical tasks. Several developers building tools for technical domains had calibrated around Thinking's specific failure modes. Migrating to o3-mini requires recalibrating around a different set of strengths and weaknesses, which is engineering work that was not planned into their roadmaps.
OpenAI's support channels saw increased traffic around the deprecation date. The company had sent deprecation notices via email and posted the timeline in the release notes approximately 30 days in advance — a window some developers described as insufficient for large production deployments with formal change management processes.
The GPT-5.1 retirement fits into a visible pattern that has implications for every developer building on OpenAI's API.
OpenAI is compressing model lifecycles. The practical consequence for developers is that any production system built on a specific OpenAI model version should be architecturally designed to swap models with minimal friction. Hardcoding model names into application logic is increasingly risky. Environment variables, configuration files, or an abstraction layer that centralizes model selection have become best practices, not optional improvements.
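A minimal sketch of that pattern, using a hypothetical `OPENAI_MODEL` environment variable and default:

```python
import os

# Hypothetical default for illustration. The point is that the model
# name lives in configuration, not in application logic.
DEFAULT_MODEL = "gpt-4o-mini"

def get_model() -> str:
    """Read the model name from the environment so a deprecation means
    one configuration change instead of a code release."""
    return os.environ.get("OPENAI_MODEL", DEFAULT_MODEL)

os.environ["OPENAI_MODEL"] = "o3-mini"
print(get_model())  # -> o3-mini
```

Every call site then asks `get_model()` for the name, and a deprecation notice becomes a deployment-time variable change rather than a code change that must pass review and release.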
The broader signal is that OpenAI views its model catalog as a product to manage, not a permanent infrastructure commitment. Unlike cloud infrastructure providers that maintain deprecated APIs for years to support enterprise customers, OpenAI is prioritizing catalog cleanliness and infrastructure efficiency over backward compatibility at the model level.
For enterprises evaluating OpenAI as a long-term infrastructure partner, this is a genuine consideration. The model that powers your application today may be deprecated 6-9 months from now. Organizations with formal compliance requirements, lengthy change management processes, or regulated validation workflows need to factor model migration effort into their total cost of ownership calculations.
The counterargument is that newer models consistently outperform deprecated ones, and migration friction is the cost of staying on the performance frontier. For consumer products and startups that can move fast, this is an acceptable tradeoff. For enterprise customers with complex validation requirements, it remains a tension that OpenAI has not fully resolved with its current lifecycle management policies.
The Codex expansion and GPT-5.1 deprecation, read together, describe an OpenAI that is consolidating around a narrower, more capable set of models while building agent infrastructure designed to run on that consolidated stack. Developers who align their workflows with that direction — current models, agent-first patterns, flexibility on model versions — will find OpenAI's tools increasingly powerful. Those who resist the pace of change will find the platform increasingly disruptive.
**Is the Codex app available on Windows?**
Yes. OpenAI released the Codex app for Windows on March 4, 2026. It supports Windows 10 and Windows 11 and is available as a direct download from the OpenAI website.
**Which plans include Codex?**
Codex is available on all plans including Free (standard limits), Plus, Pro, Business, Enterprise, and Edu. Paid plans received doubled rate limits as of the March 2026 update.
**Is GPT-5.1 still available?**
No. OpenAI deprecated the entire GPT-5.1 model family on March 11, 2026. GPT-5.1 Instant, GPT-5.1 Thinking, and GPT-5.1 Pro are all discontinued and no longer accessible via the API or ChatGPT interface.
**What replaces GPT-5.1 Instant?**
OpenAI recommends the latest fast variant in the GPT-4o mini family for latency-sensitive applications that previously used GPT-5.1 Instant.
**What replaces GPT-5.1 Thinking?**
o3 or o3-mini are the recommended replacements for reasoning-intensive tasks previously handled by GPT-5.1 Thinking.
**How does Codex run multiple agents in parallel?**
Codex can spawn multiple sub-agents to work on different parts of a task in parallel, then coordinate and merge their outputs. This is designed for complex, multi-step programming tasks that benefit from concurrent execution rather than sequential handling.
**How does Codex compare to GitHub Copilot?**
Codex's key differentiators are parallel multi-agent task execution and long-running background tasks. GitHub Copilot offers deeper IDE integration and supports multiple model providers. Copilot is better for in-editor suggestions; Codex is better for autonomous multi-step tasks.
**Is there a VS Code extension for Codex?**
OpenAI has not announced a VS Code extension for Codex as of March 2026. The current product is a standalone desktop app for macOS and Windows.
**How much notice does OpenAI give before deprecating a model?**
OpenAI typically provides approximately 30 days notice via email and release notes before model deprecation. The GPT-5.1 family followed this pattern, with a notice posted roughly 30 days before the March 11, 2026 retirement date.
**How should developers prepare for future deprecations?**
OpenAI is compressing model lifecycle timelines. Best practice is to design applications with model selection as a configuration variable rather than a hardcoded value, making future migrations easier when deprecations occur.