TL;DR: Anthropic's Claude Opus 4.6 partnered with Mozilla and discovered 22 vulnerabilities in Firefox over just two weeks — 14 of them high-severity — representing nearly one-fifth of all high-severity Firefox vulnerabilities remediated across the entire previous year. Read the announcement on Anthropic's site.
Table of Contents
- What Happened
- The Technical Breakdown: How Claude Found the Bugs
- Task Verifiers: The Methodology Behind the Discovery
- Can Claude Exploit What It Finds?
- What Anthropic Published on Responsible Disclosure
- Competitive Landscape: AI Security Research Heats Up
- What This Means for Developers and Security Teams
- What Comes Next
- FAQ
- Conclusion
What Happened
On March 6, 2026, Anthropic and Mozilla jointly announced the results of an intensive two-week security collaboration: Claude Opus 4.6 had scanned Firefox's codebase and surfaced 22 genuine, previously undisclosed vulnerabilities. Fourteen of those were classified as high-severity by Mozilla's security team.
To put that number in context — those 14 high-severity findings represent almost a fifth of all high-severity Firefox vulnerabilities that were remediated across all of 2025. In fourteen days, one AI model with a pipeline of task verifiers matched a meaningful fraction of what the global security research community found over twelve months.
The partnership began in late 2025 when Anthropic's safety and capabilities teams started evaluating whether frontier models could accelerate vulnerability discovery. Mozilla became an early partner, attracted by the potential to find bugs at a speed and scale that manual code review simply cannot match. By February 2026, the two-week intensive phase was underway.
Claude submitted 112 unique reports in total after scanning nearly 6,000 C++ files across Firefox's codebase. Mozilla's engineers triaged those reports and confirmed 22 as genuine vulnerabilities. Most of the fixes shipped in Firefox 148.0, with remaining patches coming in future releases.
This is not a research paper or a lab demo. These are real vulnerabilities in one of the world's most widely used desktop browsers, now fixed because an AI found them first.
The Technical Breakdown: How Claude Found the Bugs
Claude's approach was not brute-force fuzzing or random mutation. Anthropic gave the model access to Firefox's full C++ codebase and structured the engagement around targeted code analysis — closer to how a senior security researcher reads unfamiliar source code than how traditional automated scanners operate.
The initial focus was Firefox's JavaScript engine, SpiderMonkey — historically one of the most complex and vulnerability-dense subsystems in any browser. Within twenty minutes of starting its analysis, Claude identified a use-after-free (UAF) vulnerability. UAF bugs are among the most dangerous classes of memory flaw: they occur when code continues to reference memory after it has been freed and potentially reallocated, allowing an attacker to control program execution by placing malicious data in the freed region.
From SpiderMonkey, Claude expanded its analysis to other components of the browser, methodically working through the codebase. Each potential vulnerability was submitted with three elements Mozilla had explicitly asked for to make AI-generated reports actionable:
- Minimal test cases — the smallest possible code snippet that reliably reproduces the bug, making triage fast
- Detailed proofs-of-concept — step-by-step descriptions of how the vulnerability could be triggered and what exploitation might look like
- Candidate patches — proposed code fixes, giving Mozilla engineers a starting point rather than just a problem
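A format like this can be enforced mechanically before any report reaches a human triager. The sketch below illustrates the idea; the class and field names are hypothetical, not Mozilla's actual intake schema:

```python
from dataclasses import dataclass


@dataclass
class VulnReport:
    """One AI-generated finding, mirroring the three elements Mozilla asked for.
    Field names here are illustrative, not Mozilla's actual intake schema."""
    title: str
    minimal_test_case: str   # smallest snippet that reliably reproduces the bug
    proof_of_concept: str    # step-by-step description of how to trigger it
    candidate_patch: str     # proposed fix, a starting point for engineers

    def is_actionable(self) -> bool:
        # Reject any report missing one of the three required elements.
        required = (self.minimal_test_case, self.proof_of_concept, self.candidate_patch)
        return all(part.strip() for part in required)


submitted = [
    VulnReport("UAF in GC callback", "let a = {};", "1. Allocate; 2. Free; 3. Use.",
               "guard the pointer before reuse"),
    VulnReport("Suspicious pattern", "", "", ""),  # noise: no supporting artifacts
]
# Only complete reports survive to human triage.
actionable = [r for r in submitted if r.is_actionable()]
```

A gate this simple already screens out the "pattern looks suspicious" class of report that buries real findings in noise.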
This submission format matters enormously. One of the failure modes of automated security tools is generating thousands of low-quality reports that overwhelm human reviewers and bury real issues in noise. By requiring structured, high-quality output, Anthropic ensured that Claude's 112 reports were genuinely useful — not an inbox flood. The 22 confirmed vulnerabilities represent a roughly 20% signal rate on submitted reports, which is substantially higher than typical automated scanning tools.
The sheer code coverage is also notable. Scanning 6,000 C++ files in two weeks is not unprecedented for automated tools, but doing so while producing readable, contextual reports on potential flaws — rather than just flagging suspicious patterns — represents a qualitatively different kind of output.
Task Verifiers: The Methodology Behind the Discovery
Anthropic highlighted "task verifiers" as a key methodological contribution, and the concept deserves attention because it is likely to define how serious AI security research operates going forward.
Task verifiers are automated tools that give an AI agent real-time feedback during its work. In this context, they served two purposes:
- Vulnerability verification: After Claude identified a potential bug, a verifier would attempt to confirm that the code path actually existed and was reachable, reducing false positives before submission
- Regression detection: When Claude proposed a candidate patch, a verifier would run the existing test suite to confirm that the fix did not break existing Firefox functionality
This feedback loop is what distinguishes an agentic security research pipeline from a simple code review tool. The AI is not just flagging suspicious patterns and handing them to a human — it is iterating on its findings, filtering its own output, and self-correcting. The result is that humans receive a curated set of high-confidence reports rather than raw, unfiltered output from a statistical model.
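The shape of that filtering stage can be sketched with both verifier stages stubbed out — the real tooling is custom Anthropic infrastructure and unpublished, so everything below is an assumed illustration:

```python
from typing import Callable


# Hypothetical stand-ins for the two verifier stages described above.
def code_path_reachable(finding: dict) -> bool:
    """Vulnerability verification: does the flagged code path exist and execute?"""
    return bool(finding.get("reachable", False))


def patch_passes_tests(finding: dict) -> bool:
    """Regression detection: does the candidate patch keep the test suite green?"""
    return bool(finding.get("tests_pass", False))


def filter_findings(raw: list[dict],
                    verifiers: list[Callable[[dict], bool]]) -> list[dict]:
    """Only findings that survive every verifier stage reach human reviewers."""
    return [f for f in raw if all(check(f) for check in verifiers)]


raw_findings = [
    {"id": 1, "reachable": True,  "tests_pass": True},   # survives both stages
    {"id": 2, "reachable": False, "tests_pass": True},   # unreachable: false positive
    {"id": 3, "reachable": True,  "tests_pass": False},  # candidate patch regresses
]
curated = filter_findings(raw_findings, [code_path_reachable, patch_passes_tests])
```

In a real pipeline each stub would shell out to a build-and-run harness or the Firefox test suite, but the control flow — generate, verify, discard, resubmit — is the essence of the approach.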
The task verifier concept also has broader implications. As AI agents are deployed in more agentic workflows — writing code, running tests, iterating on results — the ability to build domain-specific verifiers becomes a core engineering capability. Security is a particularly clear use case because the success criterion is close to binary: a reported code path is either reachable and reproducible, or it is not. That clarity makes it easier to build effective verifiers than in more subjective domains.
Anthropic has not open-sourced the specific verifier tooling used in this engagement, but the announcement signals that the company views this methodology as a template for future collaborations.
Can Claude Exploit What It Finds?
This is the question that will concern security professionals most, and Anthropic addressed it directly with empirical data from the engagement.
Researchers tested whether Claude could not just identify vulnerabilities but develop working exploits from them — the step that transforms a bug report into a weaponizable attack. The results: Claude succeeded in developing rudimentary exploits in approximately 2 out of several hundred attempts, at a cost of roughly $4,000 in API credits across the full testing program.
Critically, even those 2 successful exploit attempts only functioned in stripped-down test environments — research setups with browser security features like sandboxing and address space layout randomization (ASLR) disabled. In production Firefox, with its full complement of exploit mitigations active, those exploits would not have worked.
Anthropic's researchers summarized the finding directly: "Claude is much better at finding these bugs than it is at exploiting them." This asymmetry is significant. It means that in the current state of the technology, AI models are effective offensive security tools at the discovery layer — finding the surface area of attack — but not yet capable of reliably converting discoveries into working exploits against hardened targets.
This is the defender's window. Security teams can use the same capability that found these bugs to find bugs in their own software, at a time when the probability of AI-generated exploits reaching production systems is still low. That window will not stay open forever.
It is also worth noting the economics. Spending $4,000 in API credits to discover 22 high-severity vulnerabilities in one of the most widely used browsers on earth is, by any measure, a favorable cost-benefit ratio. Bug bounty programs pay far more per finding for this severity class. If this methodology generalizes — and there is every reason to believe it does — the economics of offensive security research are about to shift dramatically.
What Anthropic Published on Responsible Disclosure
Alongside the technical announcement, Anthropic published a set of operating principles for Coordinated Vulnerability Disclosure (CVD) in the context of AI-assisted security research. This was not an afterthought — it reflects genuine tension in how the industry should handle AI models that can find vulnerabilities at scale.
The core challenge is timing. Traditional CVD norms give software vendors a disclosure window — typically 90 days — to patch a vulnerability before researchers publish details publicly. Those norms developed in a world where vulnerability discovery was slow and labor-intensive. If AI models can find dozens of high-severity bugs in two weeks, the pipeline of unremediated vulnerabilities could grow faster than vendors can patch.
Anthropic's published principles commit the company to following standard coordinated disclosure timelines, with a noted caveat: those timelines may need adjustment as model capabilities evolve. The acknowledgment that current norms may not be adequate for AI-assisted research is an unusually candid admission for a company announcement — and it is correct.
Mozilla's approach to working with AI-generated reports also sets a useful precedent. By specifying the submission format they wanted (minimal test cases, proofs-of-concept, candidate patches), Mozilla effectively created a protocol for AI-generated security research that could become an industry standard. If other major software vendors adopt similar intake formats, the barrier to conducting responsible AI-assisted security research drops significantly.
The full CVD principles are available on Anthropic's news page.
Competitive Landscape: AI Security Research Heats Up
Anthropic is not alone in pursuing AI-assisted security research, but the Mozilla engagement is the most publicly detailed demonstration of the methodology to date.
Google's Project Zero has experimented with LLM-assisted vulnerability research for the past two years, primarily using internal tooling built on Gemini. Their published findings have been more guarded than Anthropic's — focused on demonstrating that models can flag interesting code patterns rather than claiming end-to-end vulnerability discovery.
OpenAI has disclosed that GPT-5.x can solve a meaningful fraction of competitive security CTF (Capture the Flag) challenges, which tests exploit development skills in controlled environments. But CTF environments are specifically constructed to be solvable, whereas production codebases like Firefox are not. The gap between CTF performance and production vulnerability discovery has historically been large.
What distinguishes Anthropic's approach in the Mozilla engagement is the emphasis on production applicability — real code, real vulnerabilities, real fixes shipping in real software. The 22 confirmed findings are not benchmark scores; they are CVEs.
The competitive implication for Anthropic is meaningful. Claude Opus 4.6's demonstrated capability in a high-stakes, real-world security task is exactly the kind of evidence enterprise security teams and government agencies need to justify procurement. As we covered in our analysis of Anthropic's $30 billion funding round, the company is making aggressive investments in enterprise capabilities — and security is one of the highest-value enterprise use cases.
It is also relevant context that this announcement comes amid ongoing tensions between Anthropic and the Department of Defense, as covered extensively in our piece on Claude Opus 4 and hybrid thinking capabilities. Demonstrating that Claude can serve critical national infrastructure security needs — browsers used by hundreds of millions of people, including government employees — is a not-so-subtle argument for the value Anthropic provides to national security even without direct weapons system integration.
What This Means for Developers and Security Teams
If you run a software product with a significant C/C++ codebase, the Firefox partnership is a proof of concept that should be on your radar. Here is what the practical implications look like today:
Bug bounty economics are changing. The cost to discover a high-severity vulnerability using an AI pipeline is now demonstrably lower than paying a human security researcher to find the same bug. Bug bounty programs that have not considered AI-assisted submissions should update their policies — both to capture AI-found vulnerabilities and to decide whether AI-assisted submissions qualify for bounties.
Security review cycles can compress. A two-week engagement that covers 6,000 files is not something a human security team can replicate at that pace without a very large headcount. AI-assisted security review does not replace human judgment in triage and exploitation assessment, but it dramatically expands the surface area that can be reviewed in a fixed time window.
The submission format matters. Mozilla's requirement for minimal test cases, proofs-of-concept, and candidate patches is a model other organizations should adopt. The difference between a useful AI security report and a noise report is often the structure of the output, not the capability of the underlying model. Teams deploying Claude or similar models for internal security review should define their submission format before running the analysis.
The task verifier pattern generalizes. Anthropic's emphasis on task verifiers as a methodology for agentic reliability applies well beyond security. Any domain where an AI agent can self-validate its work through automated testing — code generation, data pipeline validation, infrastructure configuration — benefits from the same pattern.
Compliance and audit use cases are opening up. If AI models can reliably find security vulnerabilities in production software, they can also find compliance gaps, insecure configurations, and access control flaws. Security teams at regulated enterprises — banking, healthcare, critical infrastructure — should be evaluating AI-assisted compliance auditing alongside traditional methods.
For developers building on the Claude API, the methodology Anthropic used here is accessible through Claude Opus 4.6 today. The task verifier tooling is custom, but the underlying capability — systematic code analysis with structured output — is available to anyone with API access.
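A minimal sketch of what that looks like with the official Python SDK, assuming hypothetical helper names and prompt wording (the three-part report format comes from the article; the model identifier is taken from the article's naming and may not match the API's actual id):

```python
# Hypothetical prompt wording -- the article does not publish Anthropic's
# actual pipeline; only the three-part report format is from Mozilla.
REQUIRED_SECTIONS = ("Minimal test case", "Proof of concept", "Candidate patch")


def build_review_prompt(file_path: str, source: str) -> str:
    """Assemble a review prompt that demands the three-part structured report."""
    sections = "\n".join(f"- {s}" for s in REQUIRED_SECTIONS)
    return (
        "Review the following C++ file for memory-safety vulnerabilities.\n"
        f"For each finding, report exactly these sections:\n{sections}\n\n"
        f"File: {file_path}\n{source}"
    )


def review_file(client, file_path: str, source: str):
    """Send one file for review via the Anthropic Messages API (network call).
    `client` is an `anthropic.Anthropic()` instance; model id is assumed."""
    return client.messages.create(
        model="claude-opus-4-6",
        max_tokens=4096,
        messages=[{"role": "user", "content": build_review_prompt(file_path, source)}],
    )
```

The structured-output requirement lives entirely in the prompt here; teams with more engineering resources would add verifier stages between the model's response and human triage.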
What Comes Next
Anthropic indicated that the Mozilla partnership is a template for future collaborations, not a one-time event. Several next steps seem likely:
More software partnerships. Browsers are one of the most security-critical pieces of software on any device, but the methodology applies equally to operating system components, cryptographic libraries, network infrastructure software, and cloud platform SDKs. Any widely deployed software with a significant C/C++ or memory-unsafe language codebase is a candidate for this kind of engagement.
Evolving CVD norms. Anthropic's explicit acknowledgment that current coordinated disclosure timelines may need adjustment is likely to trigger industry-wide conversation. Organizations like CERT/CC and project-specific security teams will need to develop AI-assisted vulnerability intake processes. The 90-day disclosure window may need to compress or expand depending on how AI affects both the discovery rate and the patching rate.
Capability progression on exploitation. The current finding that Claude succeeds in exploitation in roughly 2 of hundreds of attempts is a baseline, not a ceiling. As model capabilities improve — and as agentic pipelines for exploit development become more sophisticated — the exploitation success rate will rise. The responsible disclosure framework Anthropic published today needs to anticipate that future state, not just the current one.
Enterprise security products. The methodology demonstrated in the Mozilla engagement has obvious commercial application as an enterprise security product — essentially an AI-powered code audit service. Whether Anthropic pursues this directly or enables partners to build it on the API is an open question, but the commercial interest is clear.
Policy implications. If AI models can find vulnerabilities at this scale and speed, governments will take notice. We already covered Anthropic's system prompt guidelines for Claude 4 Opus, which included provisions for sensitive security use cases. Expect regulatory attention on both the responsible use of AI in security research and the liability questions that arise when AI-found vulnerabilities are not disclosed in time.
FAQ
How does Claude's vulnerability discovery compare to traditional automated scanners like Coverity or CodeQL?
Traditional static analysis tools work by pattern-matching against known vulnerability signatures and data-flow analysis. They are fast, deterministic, and good at finding well-understood bug classes in code that matches their rule sets. Claude's approach is fundamentally different: it reasons about code semantically, can understand programmer intent, and can identify vulnerability classes that don't match any predefined pattern. The tradeoff is that Claude's output is probabilistic and requires more structured prompting to produce actionable results. The Mozilla engagement's roughly 20% confirmation rate on submitted reports is strong for an AI system, though well-tuned static analysis typically achieves higher precision on the narrow bug classes its rules cover. For novel vulnerability classes and complex semantic bugs, Claude is likely to outperform traditional scanners. For known vulnerability patterns at scale, traditional tools remain faster and cheaper.
Does this mean AI will replace human security researchers?
Not in the near term, and probably not entirely in the long term either. The Mozilla engagement showed that Claude is excellent at discovery — systematically reviewing large codebases and flagging potential issues with structured reports. But human judgment remains essential for triage (deciding which findings actually matter in context), exploitation research (converting a bug report into a working attack that accounts for real-world defenses), and strategic prioritization (deciding which codebases to audit first). The more likely near-term outcome is that human security researchers become dramatically more productive: one researcher managing an AI pipeline can cover the surface area that previously required a team of several.
What does Anthropic's CVD framework say about AI-found zero-days?
Anthropic's published principles commit to following standard coordinated disclosure timelines — typically notifying the vendor and giving them time to patch before public disclosure. The framework acknowledges that these timelines may need adjustment as AI capabilities evolve, which is an implicit acknowledgment that AI could eventually discover vulnerabilities faster than vendors can patch them. For now, the principles follow established norms from organizations like CERT/CC. The novel addition is explicit guidance on AI-generated submissions: requiring structured report formats (test cases, proofs-of-concept, candidate patches) as a condition of responsible disclosure, which reduces the chance that a vendor receives a low-quality AI report and deprioritizes a real vulnerability.
Could malicious actors use the same approach to find and exploit Firefox vulnerabilities?
Yes — and this is the core dual-use tension in AI-assisted security research. The methodology Anthropic used is not secret; any well-resourced attacker with API access to a frontier model and the engineering capacity to build task verifiers could attempt a similar approach. The current finding that exploitation succeeds in only 2 of hundreds of attempts provides some comfort, but it is not a permanent barrier. The practical implication is that the window for defenders to use AI-assisted security research to find and fix their own bugs — before attackers use similar tools against them — is open now but will not stay open indefinitely. Firefox users benefit from the 22 vulnerabilities being fixed; the methodology that found them will eventually be accessible to adversaries as well.
Is this capability available through the Claude API today?
The underlying model capability — systematic code analysis with structured output — is available through Claude Opus 4.6 via the Anthropic API today. The task verifier tooling that enabled self-validation of findings is custom infrastructure built by Anthropic's engineering team and not directly available as an off-the-shelf product. However, security teams with engineering resources could build similar verification pipelines using Claude's API. The submission format Mozilla requested (minimal test cases, proofs-of-concept, candidate patches) can be specified as structured output requirements in the system prompt.
Conclusion
The Mozilla-Anthropic Firefox security partnership is a landmark event in applied AI research — not because the numbers are unprecedented, but because they are real. Twenty-two confirmed vulnerabilities, 14 high-severity, discovered in two weeks by an AI model scanning 6,000 C++ files. Those are not benchmark scores on a curated dataset; they are CVEs that are now patched in the browser on hundreds of millions of devices.
The partnership reveals several things simultaneously. It shows that Claude Opus 4.6 has reached a level of code understanding that is genuinely useful for production security work, not just research demonstrations. It shows that task verifiers — automated feedback loops that let AI agents validate their own work — are a key enabling methodology for reliable agentic pipelines. And it shows that Anthropic is thinking carefully about the responsible disclosure implications of AI capabilities that can discover vulnerabilities at scale.
For security teams, the practical message is clear: AI-assisted security review is no longer a future capability. It is available, it is cost-effective, and the methodology is documented. The question is not whether to evaluate it, but how quickly.
For the broader AI industry, the Firefox engagement sets a useful template. Real-world software. Real vulnerabilities. Real fixes. Real coordinated disclosure. This is what responsible AI capability deployment looks like when it intersects with security research — and it is the kind of collaboration that should become standard practice as frontier models continue to improve.
Anthropic has committed to more engagements like this. Watch for the next announcement.
Sources: Anthropic's official announcement | TechCrunch coverage | Mozilla Firefox security advisories