OpenAI releases Codex app for Windows and retires GPT-5.1 model family
OpenAI expands Codex to Windows with doubled rate limits while discontinuing the entire GPT-5.1 model family, signaling an aggressive push toward its next generation.
TL;DR: OpenAI has launched the Codex app on Windows as of March 4, 2026, doubling rate limits across all paid plans and giving free-tier users access for the first time. Simultaneously, on March 11, 2026, OpenAI discontinued the entire GPT-5.1 model family — including GPT-5.1 Instant, Thinking, and Pro variants — pushing all users to newer alternatives. The two moves together signal a company accelerating its developer tooling roadmap and aggressively clearing the old model catalog.
On March 4, 2026, OpenAI shipped the Codex app to Windows users, completing the cross-platform rollout that began with macOS. The Codex app is a dedicated desktop interface for AI-assisted coding, built differently from ChatGPT's web interface. Where ChatGPT is a general-purpose chat tool, Codex is purpose-built for developers who want an agent that can take on multi-step programming tasks autonomously.
The app is available through the OpenAI website as a direct download for Windows 10 and Windows 11. Installation is straightforward — no terminal commands, no configuration files. For enterprise IT teams that have been waiting to standardize on a single AI coding tool, Windows availability removes one of the last friction points.
The core feature set on Windows matches what macOS users have had access to: a file-aware code editor context, the ability to invoke Codex to write functions, debug errors, generate tests, and refactor code across multiple files simultaneously. The app integrates with local code repositories, reading directory structure and file contents to give the model grounding before generating output.
One notable detail from the release notes: the Windows release coincides with improvements to how Codex uses its context window, meaning larger codebases are handled more reliably than they were at the macOS launch. Early macOS users had reported context truncation issues when working with larger repositories. The Windows launch appears to have shipped after OpenAI addressed those complaints.
Codex on Windows also supports the full range of OpenAI model options available to a user's subscription tier. Pro subscribers access the most capable reasoning variants. Plus users get the mid-tier models. Free users can now use Codex for the first time, with standard (not doubled) rate limits.
Alongside the Windows release, OpenAI doubled rate limits for Codex across all paid plans. This is a significant change for heavy users who have been hitting hourly and daily ceilings on complex tasks.
| Plan | Previous Codex rate limit | New rate limit |
|---|---|---|
| Free | No access | Standard (first-time access) |
| Plus | Baseline | 2x baseline |
| Pro | Baseline | 2x baseline |
| Business | Baseline | 2x baseline |
| Enterprise | Baseline | 2x baseline |
| Edu | Baseline | 2x baseline |
The rate limit increase matters more than it might appear on paper. Codex tasks are not single API calls. A single "write me a REST endpoint with tests" instruction can trigger 10-30 sequential model calls as Codex iterates on code, runs internal validation, and generates documentation. Under the old rate limits, developers running Codex intensively during a coding session would exhaust their daily allocation well before end of business.
Doubling limits effectively doubles the practical utility of the tool for professional developers. For enterprise teams running Codex across dozens of engineers simultaneously, the Business and Enterprise tier increases are operationally meaningful. Previously, large team deployments required careful scheduling to avoid rate limit collisions. With 2x headroom, those constraints ease considerably.
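Even with doubled limits, heavy users will still occasionally hit a ceiling, and client code typically absorbs that with retry logic. Below is a minimal, generic sketch of exponential backoff with jitter; `RateLimitError` and `flaky_call` are hypothetical stand-ins, not part of any OpenAI SDK:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the 429-style error a client raises when limits are hit."""

def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Retry fn() on rate-limit errors, doubling the delay each attempt."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Sleep 1s, 2s, 4s, ... plus jitter so clients don't retry in lockstep.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))

# Example: a call that fails twice, then succeeds on the third attempt.
attempts = {"n": 0}

def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError()
    return "ok"

print(call_with_backoff(flaky_call, base_delay=0.01))  # -> ok
```

Because Codex fans one instruction out into many model calls, a single rate-limit error mid-task is common, which is why this kind of retry loop matters more here than for one-shot chat requests.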
Free-tier access is a strategic move. Giving students and developers without paid subscriptions their first taste of Codex creates a funnel. Developers who build workflows around Codex on free tier will upgrade when they hit the usage ceiling, which they will. OpenAI's historical pattern with ChatGPT free tier suggests this is the intended conversion path.
The most technically interesting aspect of Codex is its architecture for running multiple agents in parallel. This is not a marketing description of running multiple chat windows. It is a specific capability built into how Codex manages task execution.
When you give Codex a complex task — say, "refactor the authentication module, write integration tests, update the documentation, and open a pull request" — Codex does not execute these steps sequentially in a single session. Instead, it spawns sub-agents for each major workstream, runs them concurrently, and coordinates their outputs before presenting a unified result.
This design is relevant for long-running tasks where sequential execution would be too slow. A refactor that touches 50 files can be split across multiple agents working on different sections simultaneously, with a coordination layer merging outputs and resolving conflicts. OpenAI describes this as Codex being able to "manage multiple agents in parallel" and handle "long-running tasks" that exceed what a single model call can accomplish.
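OpenAI has not published Codex's internal orchestration code, but the fan-out-and-merge pattern it describes can be sketched with ordinary concurrency primitives. The workstream functions below are hypothetical stand-ins for model-driven sub-agents:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sub-agent workstreams for one high-level task. In the real
# product each would be an autonomous, model-driven agent; here they are
# plain functions standing in for that behavior.
def refactor_auth():
    return {"workstream": "refactor", "files_changed": 12}

def write_tests():
    return {"workstream": "tests", "files_changed": 4}

def update_docs():
    return {"workstream": "docs", "files_changed": 2}

def run_parallel_agents(workstreams):
    """Run each workstream concurrently, then merge the outputs into one
    unified result, mimicking the coordination layer described above."""
    with ThreadPoolExecutor(max_workers=len(workstreams)) as pool:
        results = list(pool.map(lambda fn: fn(), workstreams))
    return {
        "workstreams": [r["workstream"] for r in results],
        "total_files_changed": sum(r["files_changed"] for r in results),
    }

summary = run_parallel_agents([refactor_auth, write_tests, update_docs])
print(summary["total_files_changed"])  # -> 18
```

The hard part that this sketch omits is the conflict resolution in the merge step, which is exactly where early users report rough edges.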
In practice, this is still maturing. Tasks with complex interdependencies require careful orchestration, and early users have noted that the parallel agents occasionally make conflicting edits that require human review to reconcile. OpenAI's release documentation positions this as expected behavior at current capability levels, with the expectation that coordination quality will improve with future model updates.
The multi-agent architecture positions Codex as infrastructure, not just a productivity tool. The long-term vision is a coding agent that runs autonomously in the background while a developer focuses on higher-level design decisions. That vision is not fully realized yet, but the architecture is clearly being built to support it.
On March 11, 2026, exactly one week after the Codex Windows launch, OpenAI retired the entire GPT-5.1 model family. This includes GPT-5.1 Instant, GPT-5.1 Thinking, and GPT-5.1 Pro.
The GPT-5.1 generation was a transition family — models released between the GPT-5 launch and the upcoming GPT-6 (or equivalent) architecture. They offered incremental improvements over base GPT-5 in specific areas: faster inference for Instant, stronger multi-step reasoning for Thinking, and extended context performance for Pro. None represented a fundamental architectural leap.
According to the ChatGPT release notes, the deprecation was positioned as part of routine model lifecycle management. OpenAI's stated rationale is that maintaining multiple model variants creates infrastructure overhead and user confusion. Consolidating around fewer, more capable models reduces that friction.
The timing against the Codex launch is not coincidental. OpenAI appears to be standardizing the model stack that Codex runs on, and the GPT-5.1 variants were not the target architecture. Deprecating them clears the catalog and pushes developers toward models that align with Codex's internal requirements.
There is also a competitive dimension. GPT-5.1 Instant competed in the same latency tier as Anthropic's Claude Haiku and Google's Gemini Flash. OpenAI's positioning in that tier has since shifted to a different offering, which suggests GPT-5.1 Instant was retired as redundant rather than for quality reasons.
If your application or workflow is using GPT-5.1 Instant, GPT-5.1 Thinking, or GPT-5.1 Pro via the API, those endpoints stopped responding on March 11, 2026. You need to migrate.
OpenAI's recommended migration paths, based on use case:
| GPT-5.1 variant | Recommended replacement | Key difference |
|---|---|---|
| GPT-5.1 Instant | GPT-4o mini or latest fast variant | Similar latency profile, updated training data |
| GPT-5.1 Thinking | o3 or o3-mini | Stronger reasoning benchmark performance |
| GPT-5.1 Pro | GPT-4o or latest GPT-5 | Better extended context handling |
Migration is primarily a model name swap in your API calls. The input/output format for the Chat Completions API has not changed. System prompts, function calling schemas, and response parsing logic should transfer without modification.
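A minimal sketch of that swap, using the recommended replacements from the table above as a lookup. The model identifiers are illustrative, not verified API strings, so confirm them against OpenAI's current model list before deploying:

```python
# Deprecated-to-replacement mapping based on the migration table above.
# These identifiers are assumptions for illustration; check the live
# model list for the exact names your account can use.
MODEL_MIGRATIONS = {
    "gpt-5.1-instant": "gpt-4o-mini",
    "gpt-5.1-thinking": "o3-mini",
    "gpt-5.1-pro": "gpt-4o",
}

def resolve_model(requested: str) -> str:
    """Swap a deprecated model name for its recommended replacement,
    passing through any name that is not deprecated."""
    return MODEL_MIGRATIONS.get(requested, requested)

print(resolve_model("gpt-5.1-thinking"))  # -> o3-mini
print(resolve_model("gpt-4o"))            # -> gpt-4o
```

Routing every API call through a resolver like this means the next deprecation is a one-line table edit rather than a codebase-wide search and replace.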
The more significant work is prompt validation. Different model generations respond differently to identical prompts, particularly for structured output tasks. Any production workflow using GPT-5.1 should run regression tests on the replacement model before switching over in production. Edge cases — especially long inputs, multi-turn conversations, and tool-calling chains — are where behavioral differences between model generations are most likely to surface.
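For structured-output workflows, a regression check can be as small as the sketch below. `fake_model_call` is a stub standing in for a real Chat Completions request; the required-keys check is the part worth keeping:

```python
import json

# Stubbed model call. In a real regression run this would hit the API
# with the candidate replacement model and the production prompt.
def fake_model_call(model: str, prompt: str) -> str:
    return json.dumps({"status": "ok", "items": [1, 2, 3]})

def passes_regression(model: str, prompt: str, required_keys: set) -> bool:
    """Check that the replacement model still emits parseable JSON
    containing the keys downstream code depends on."""
    try:
        payload = json.loads(fake_model_call(model, prompt))
    except json.JSONDecodeError:
        return False
    return required_keys.issubset(payload)

print(passes_regression("o3-mini", "List items as JSON.", {"status", "items"}))  # -> True
```

Running checks like this over a corpus of real production prompts, including the long-input and multi-turn edge cases, is what surfaces generational behavior differences before customers do.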
Users on ChatGPT's consumer product who selected GPT-5.1 variants as their preferred model will find those options removed from the model selector. OpenAI has automatically migrated these users to the current default model.
With the Windows launch, Codex enters more direct competition with GitHub Copilot and Cursor, which are the two most widely adopted AI coding tools among Windows-first developer populations.
| Feature | OpenAI Codex | GitHub Copilot | Cursor |
|---|---|---|---|
| Multi-file editing | Yes (agent) | Limited (Copilot Workspace) | Yes |
| Multi-agent parallel tasks | Yes | No | No |
| Long-running background tasks | Yes | No | No |
| IDE integration | Standalone app | VS Code, JetBrains, Neovim | VS Code fork |
| Repository-aware context | Yes | Yes (with repo indexing) | Yes |
| Model selection | OpenAI models only | GPT-4o + Claude (via GitHub) | GPT-4o, Claude, Gemini |
| Free tier | Yes (standard limits) | Limited (60 requests/month) | Yes |
| Enterprise SSO/audit logs | Yes | Yes | Yes |
The table reveals Codex's genuine differentiation: parallel multi-agent task execution and long-running background tasks. No other consumer AI coding tool currently ships this. GitHub Copilot Workspace offers a multi-step task runner, but it is sequential and does not spawn parallel sub-agents. Cursor is powerful for in-editor code generation but does not have an autonomous background task architecture.
Codex's weakness is equally clear: it is a standalone app, not an IDE integration. Developers spend most of their time inside VS Code, JetBrains, or similar editors. Leaving that environment to use Codex in a separate window creates workflow friction. Until OpenAI ships a VS Code extension or IDE plugin that matches the standalone app's capabilities, Codex will remain a secondary tool for most developers rather than a primary one.
For enterprise engineering teams evaluating AI coding tools, the Codex Windows release changes the calculus in a few specific ways.
First, cross-platform standardization is now possible. Teams with mixed Mac/Windows developer populations previously could not standardize on Codex. That barrier is gone. IT and security teams can now evaluate Codex for company-wide deployment without platform exclusions.
Second, the doubled rate limits for Business and Enterprise plans address the most common objection from team leads who trialed Codex and found it impractical at scale. The previous rate ceilings meant that a 20-person engineering team using Codex simultaneously could exhaust shared limits within hours. The new limits make continuous use during a full workday feasible.
Third, the multi-agent capability maps directly to enterprise use cases that are not well-served by current tools. Sprint-level tasks — implementing a feature from requirements to code to tests to documentation — are precisely the kind of long-horizon, multi-step work that Codex's architecture targets. Enterprise developers who currently break these tasks into separate AI-assisted steps could potentially hand them to Codex as unified jobs.
The security audit trail is still an open question. Enterprise security teams want visibility into what the AI agent is reading, what code it is generating, and what network calls it is making. OpenAI has not published detailed enterprise security documentation for Codex comparable to what GitHub Copilot Enterprise provides. That gap will need to close before risk-averse organizations in financial services, healthcare, or government can approve deployment.
The simultaneous Codex expansion and GPT-5.1 deprecation, both shipping within a week of each other, reflects an OpenAI that is moving faster through its model catalog than at any prior point.
In 2023, models remained available for 12-18 months after successors launched. In 2024, that window compressed to 6-9 months. The GPT-5.1 family, which launched in mid-2025, was deprecated in March 2026, a lifecycle of roughly nine months.
This acceleration is partly competitive pressure. Anthropic, Google DeepMind, and Meta are all releasing new models on faster cadences than before. OpenAI cannot maintain pricing power or developer mindshare by keeping an aging model catalog active alongside newer options. Cleaning up the catalog forces developers to use current models, which perform better on benchmarks and are cheaper for OpenAI to serve at scale.
The other driver is internal coherence. As Codex becomes a more important product, OpenAI needs its agent infrastructure to run on a consistent, maintained model stack. The GPT-5.1 variants were not part of that stack. Deprecating them eliminates maintenance overhead and lets the infrastructure team focus on the models that matter to the roadmap.
Read alongside the doubled rate limits, the picture is an OpenAI that is investing heavily in making Codex a tool developers actually rely on rather than just experiment with. That requires a clean model catalog, capable infrastructure, and limits generous enough to support real workflows. All three moved in the same week.
Developer reactions to the dual announcement split roughly along usage lines.
Heavy Codex users on macOS welcomed the Windows expansion and rate limit increases without reservation. On developer forums and the OpenAI developer Discord, the dominant response was positive: rate limits had been a genuine friction point, and the doubled limits drew a chorus of "finally" from frequent users.
The GPT-5.1 deprecation generated more friction, specifically from API developers who had tuned production workflows to GPT-5.1 Instant. The Instant variant had a latency profile that some developers described as difficult to replicate exactly with current alternatives. A thread on Hacker News included multiple engineers reporting that their latency-sensitive applications (voice interfaces, real-time autocomplete) required several rounds of prompt tuning to achieve equivalent performance on replacement models.
The specific concern with GPT-5.1 Thinking was its reasoning behavior on mathematical and logical tasks. Several developers building tools for technical domains had calibrated around Thinking's specific failure modes. Migrating to o3-mini requires recalibrating around a different set of strengths and weaknesses, which is engineering work that was not planned into their roadmaps.
OpenAI's support channels saw increased traffic around the deprecation date. The company had sent deprecation notices via email and posted the timeline in the release notes approximately 30 days in advance — a window some developers described as insufficient for large production deployments with formal change management processes.
The GPT-5.1 retirement fits into a visible pattern that has implications for every developer building on OpenAI's API.
OpenAI is compressing model lifecycles. The practical consequence for developers is that any production system built on a specific OpenAI model version should be architecturally designed to swap models with minimal friction. Hardcoding model names into application logic is increasingly risky. Environment variables, configuration files, or an abstraction layer that centralizes model selection have become best practices, not optional improvements.
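A minimal sketch of that pattern, using a hypothetical `OPENAI_MODEL` environment variable and default:

```python
import os

# Hypothetical default for illustration. The point is that the model
# name lives in configuration, not in application logic.
DEFAULT_MODEL = "gpt-4o-mini"

def get_model() -> str:
    """Read the model name from the environment so a deprecation means
    one configuration change instead of a code release."""
    return os.environ.get("OPENAI_MODEL", DEFAULT_MODEL)

os.environ["OPENAI_MODEL"] = "o3-mini"
print(get_model())  # -> o3-mini
```

Every call site then asks `get_model()` for the name, and a deprecation notice becomes a deployment-time variable change rather than a code change that must pass review and release.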
The broader signal is that OpenAI views its model catalog as a product to manage, not a permanent infrastructure commitment. Unlike cloud infrastructure providers that maintain deprecated APIs for years to support enterprise customers, OpenAI is prioritizing catalog cleanliness and infrastructure efficiency over backward compatibility at the model level.
For enterprises evaluating OpenAI as a long-term infrastructure partner, this is a genuine consideration. The model that powers your application today may be deprecated 6-9 months from now. Organizations with formal compliance requirements, lengthy change management processes, or regulated validation workflows need to factor model migration effort into their total cost of ownership calculations.
The counterargument is that newer models consistently outperform deprecated ones, and migration friction is the cost of staying on the performance frontier. For consumer products and startups that can move fast, this is an acceptable tradeoff. For enterprise customers with complex validation requirements, it remains a tension that OpenAI has not fully resolved with its current lifecycle management policies.
The Codex expansion and GPT-5.1 deprecation, read together, describe an OpenAI that is consolidating around a narrower, more capable set of models while building agent infrastructure designed to run on that consolidated stack. Developers who align their workflows with that direction — current models, agent-first patterns, flexibility on model versions — will find OpenAI's tools increasingly powerful. Those who resist the pace of change will find the platform increasingly disruptive.
**Is the Codex app available on Windows?**
Yes. OpenAI released the Codex app for Windows on March 4, 2026. It supports Windows 10 and Windows 11 and is available as a direct download from the OpenAI website.
**Which plans include Codex?**
Codex is available on all plans including Free (standard limits), Plus, Pro, Business, Enterprise, and Edu. Paid plans received doubled rate limits as of the March 2026 update.
**Is GPT-5.1 still available?**
No. OpenAI deprecated the entire GPT-5.1 model family on March 11, 2026. GPT-5.1 Instant, GPT-5.1 Thinking, and GPT-5.1 Pro are all discontinued and no longer accessible via the API or ChatGPT interface.
**What replaces GPT-5.1 Instant?**
OpenAI recommends the latest fast variant in the GPT-4o mini family for latency-sensitive applications that previously used GPT-5.1 Instant.
**What replaces GPT-5.1 Thinking?**
o3 or o3-mini are the recommended replacements for reasoning-intensive tasks previously handled by GPT-5.1 Thinking.
**How does Codex run multiple agents in parallel?**
Codex can spawn multiple sub-agents to work on different parts of a task in parallel, then coordinate and merge their outputs. This is designed for complex, multi-step programming tasks that benefit from concurrent execution rather than sequential handling.
**How does Codex compare to GitHub Copilot?**
Codex's key differentiators are parallel multi-agent task execution and long-running background tasks. GitHub Copilot offers deeper IDE integration and supports multiple model providers. Copilot is better for in-editor suggestions; Codex is better for autonomous multi-step tasks.
**Is there a VS Code extension for Codex?**
OpenAI has not announced a VS Code extension for Codex as of March 2026. The current product is a standalone desktop app for macOS and Windows.
**How much notice does OpenAI give before deprecating a model?**
OpenAI typically provides approximately 30 days notice via email and release notes before model deprecation. The GPT-5.1 family followed this pattern, with a notice posted roughly 30 days before the March 11, 2026 retirement date.
**How should developers prepare for future deprecations?**
OpenAI is compressing model lifecycle timelines. Best practice is to design applications with model selection as a configuration variable rather than a hardcoded value, making future migrations easier when deprecations occur.