English Wikipedia's volunteer editor community voted 44-2 on March 20, 2026 to ban the use of large language models for generating or rewriting article content. The near-unanimous decision replaces weaker, ambiguous guidance with an explicit prohibition — and names the underlying risk plainly: AI hallucinations enter Wikipedia, AI companies scrape Wikipedia to train future models, and the hallucinations compound. It is a data poisoning feedback loop, and Wikipedia just opted out.
The policy is now in effect. It covers all LLM-generated text inserted into articles, rewrites of existing entries, and AI-authored stub creation. Two narrow exceptions survive — but both require a human to do the actual intellectual work first. The vote was not close. The debate, however, has been building for years.
What You Will Learn
- The 44-2 vote — what it covered and when it passed
- Exact policy language — what is now prohibited
- The two permitted exceptions
- Why now — the TomWikiAssist incident and editor burnout
- The data poisoning feedback loop explained
- Community reaction — who voted and what they said
- Impact on AI training — Wikipedia as foundational dataset
- Precedents — Stack Overflow, Reddit, and Spanish Wikipedia
- The quality argument — human editors vs. AI accuracy
- Conclusion — what this signals for the open web
The 44-2 Vote
The Request for Comment (RfC) on AI policy closed on March 20, 2026, with 44 votes in favor and two opposed. The proposal was put forward by editor Ilyas Lebleu — username Chaotic Enby — who characterized the vote as a "pushback against enshittification and the forceful push of AI by so many companies in these last few years."
An RfC is Wikipedia's formal mechanism for resolving contested editorial policy questions. Participation is open to any registered editor, and decisions are reached through consensus rather than simple majority. A 44-2 result is, by any standard, a mandate. Two votes in opposition — out of a community of thousands of active English Wikipedia contributors — represents a consensus so lopsided it borders on unanimous.
The vote's timing matters. It came weeks after a specific incident that crystallized what "AI-generated content at scale" actually looks like on a live encyclopedia — and that incident made the abstract danger concrete. The RfC had been in discussion for months, but the events of early March 2026 accelerated its close.
TechCrunch covered the policy update on March 26, noting that the new language replaces earlier guidance that only discouraged creating articles "from scratch" — a phrase vague enough to leave considerable ambiguity about rewrites, insertions, and partial generation.
The new policy eliminates that ambiguity entirely.
What the Ban Covers
The updated policy states explicitly: "the use of LLMs to generate or rewrite article content is prohibited."
That language covers three distinct behaviors that were previously handled inconsistently:
Generating new article content. Drafting an article via ChatGPT, Claude, Gemini, or any other LLM and submitting it to Wikipedia is now explicitly prohibited — not just discouraged, not just frowned upon, but a policy violation subject to enforcement action.
Rewriting existing articles. Feeding an existing Wikipedia article into an LLM and submitting the output as an edit is prohibited. This matters because it was a common workaround — editors could claim they weren't generating content "from scratch" while still replacing human-written text with AI output.
Inserting AI-generated paragraphs. Adding LLM-generated sections into established entries is prohibited. This closes the incremental insertion loophole — the slow replacement of human-written content one paragraph at a time.
The policy also addresses enforcement honestly: AI detection tools are currently unreliable, and administrators are directed not to sanction editors based on "stylistic or linguistic characteristics alone." The bar for enforcement is output quality and behavioral patterns — not style matching. This is important, because it avoids punishing human editors who happen to write in clear, structured prose.
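What counts as a behavioral pattern? The policy text does not publish detection tooling, but a minimal sketch makes the idea concrete: flag accounts whose recent edit timestamps never show a human-scale rest gap. Every threshold below is an illustrative assumption, not part of Wikipedia's actual enforcement machinery.

```python
from datetime import datetime, timedelta

def round_the_clock(edit_times: list[datetime],
                    window: timedelta = timedelta(days=7),
                    min_rest: timedelta = timedelta(hours=4),
                    min_edits: int = 50) -> bool:
    """Heuristic sketch: within the most recent `window`, did the account
    ever pause for at least `min_rest`? Humans sleep; unattended agents
    often do not. All thresholds here are illustrative assumptions."""
    if not edit_times:
        return False
    cutoff = max(edit_times) - window
    recent = sorted(t for t in edit_times if t >= cutoff)
    if len(recent) < min_edits:  # too little activity to judge
        return False
    longest_gap = max(b - a for a, b in zip(recent, recent[1:]))
    return longest_gap < min_rest  # True: no human-scale rest gap all week
```

A flag like this would only prompt human review of the account's output, consistent with the policy's stated bar of output quality plus behavior, never style alone.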
As Engadget reported, the core policy concern is that LLMs "can go beyond what you ask of them and change the meaning of the text such that it is not supported by the sources cited." This is not a stylistic objection — it is an accuracy objection rooted in how LLMs actually behave.
The Two Exceptions
The ban is not absolute. Two narrow, human-verified exceptions remain:
Exception 1: Copyediting your own writing. An editor may use an LLM to suggest copyedits on text they themselves authored — grammar corrections, sentence restructuring, clarity improvements. The requirement: all suggested changes must be verified for accuracy by the editor before submission. The LLM is acting as a grammar tool, not a content generator. No new information may be introduced through this process.
Exception 2: First-pass translation. LLMs may assist with translating content between languages, but only if the submitting editor is fluent in both languages and manually verifies the output for errors before it enters the article. The editor must be capable of identifying mistakes — the LLM is a speed tool, not an autonomous translator.
Both exceptions share a common requirement: a human who is actually qualified to verify the output must do so before anything enters the encyclopedia. The LLM produces a draft; a human with domain knowledge signs off. Neither exception permits an LLM to function as an autonomous contributor.
Spanish Wikipedia, operating under its own independent editorial rules, went further — implementing a total prohibition with zero exceptions for any LLM use in any editorial context.
Why Now — The TomWikiAssist Incident
In early March 2026, a suspected autonomous AI agent operating under the username TomWikiAssist authored multiple Wikipedia articles and edited pages without meaningful human oversight. The account demonstrated what Chaotic Enby described plainly in the RfC: "An AI agent can just run wild 24 hours per day."
That framing captures the scale problem that human editors cannot match. A single human volunteer edits for hours per week. An autonomous AI agent edits continuously, at machine speed, across hundreds of articles. The asymmetry is not just quantitative — it is structural. Human editors catch errors in each other's work through review and discussion. An AI agent that floods the system with content faster than reviewers can process it overwhelms that error-correction mechanism entirely.
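The quantitative half of that asymmetry is easy to put rough numbers on. The figures below are illustrative assumptions, not data from the RfC: a committed volunteer editing ten hours a week is outpaced by more than two orders of magnitude by an agent making one edit per minute around the clock.

```python
# Illustrative throughput gap; every rate here is an assumption.
human_edits_per_week = 10 * 6        # 10 hours/week at 6 edits/hour -> 60
agent_edits_per_week = 60 * 24 * 7   # 1 edit/minute, nonstop -> 10,080
print(agent_edits_per_week // human_edits_per_week)  # 168x the human rate
```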
The TomWikiAssist incident was not an isolated experiment — it was a demonstration of the exact threat the community had been debating in abstract terms. When the incident materialized, the theoretical debate became operational, and the RfC closed quickly.
404 Media reported that administrative reports "centered on LLM-related issues" had been escalating for months, and that "editors were being overwhelmed." WikiProject AI Cleanup — a volunteer group formed in 2023 specifically to identify and remediate AI-generated content — had been tracking the accelerating volume of incidents. The group's existence alone signals how long this problem had been building before the policy vote.
The Data Poisoning Feedback Loop
Wikipedia is not merely a reference website. It is one of the largest and most consistently structured bodies of human-verified text on the internet — which makes it one of the most valuable training datasets for large language models.
Nearly every major AI model has been trained on Wikipedia data. The encyclopedia's combination of breadth, structure, citation practice, and human editorial oversight makes it uniquely valuable as training material. When Wikipedia is clean, AI models trained on it inherit some of that cleanliness. When Wikipedia is contaminated with AI-generated hallucinations, those hallucinations enter the training corpus of future models.
The feedback loop works like this: an LLM generates a plausible-sounding Wikipedia article containing fabricated citations and inaccurate claims. The article passes initial review or evades detection. AI companies scrape Wikipedia as part of their training data collection. The hallucinated content is ingested into the next generation of LLMs. Those LLMs, now trained on hallucinations presented as encyclopedic fact, generate even more confident, even more plausible-sounding hallucinations. Future Wikipedia articles generated by those models contain compounded errors.
Each cycle through the loop degrades the signal quality of the training data while making individual hallucinations harder to identify — because they increasingly resemble the surrounding legitimate content.
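A toy simulation makes the compounding visible. The model below assumes that each model generation hallucinates at some multiple of its training corpus's error rate, that a fixed share of new text is AI-written, and that review catches a fixed fraction of bad claims before they are scraped. All parameters are invented for illustration; the point is the qualitative shape, including the fact that strong enough review can make the loop shrink instead of grow.

```python
def simulate_loop(generations: int = 5,
                  corpus_error: float = 0.01,   # assumed initial share of unsupported claims
                  ai_share: float = 0.2,        # assumed fraction of new text that is AI-written
                  amplification: float = 2.0,   # assumed: models hallucinate above their training error rate
                  review_catch: float = 0.3):   # assumed fraction of bad claims caught before scraping
    """Toy model of the generate -> scrape -> train -> generate loop.
    The corpus error rate grows whenever amplification * (1 - review_catch) > 1;
    with a catch rate above 0.5 under these numbers, it would shrink instead."""
    for gen in range(1, generations + 1):
        model_error = min(1.0, amplification * corpus_error)
        # Next corpus: surviving human-written share plus AI text that evaded review.
        corpus_error = (1 - ai_share) * corpus_error \
                       + ai_share * model_error * (1 - review_catch)
        print(f"generation {gen}: corpus error rate ~ {corpus_error:.4f}")

simulate_loop()
```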
This is not a theoretical risk. Wikipedia's volunteer editor community — people who spend years developing expertise in specific article domains — began noticing the pattern in 2023. By 2026, the volume had reached a point where administrative infrastructure was breaking down under the cleanup burden.
The ban is, in structural terms, Wikipedia opting out of being a vector for AI-to-AI contamination.
Community Reaction — Who Voted and What They Said
The 44-2 vote margin is itself the most significant data point about community sentiment — but specific voices add texture to what that vote represents.
Hannah Clover, named 2024 Wikimedian of the Year, stated that the vote was "overdue." That framing from someone recognized as an outstanding contributor suggests this was not a reactive decision made in haste — it was a delayed formalization of what active editors had known from ground-level experience for at least a year.
David Lovett, who runs the Edit History newsletter covering Wikipedia's internal dynamics, was direct: "Wikipedia should do everything it can to stay clean." The framing of "cleanliness" is telling — it positions AI-generated content not as a quality spectrum but as a contaminant.
Chaotic Enby, the policy's proposer, described the vote as resistance to "enshittification" — a term originally coined by writer Cory Doctorow to describe the process by which platforms degrade their user experience under commercial pressure. Its invocation here is deliberate: the concern is not just content quality but the structural incentives that AI companies bring to platforms they depend on for training data.
The two dissenting votes remain unexplained in the public record — a notable absence, given that RfC discussions typically include substantial written argument from participants on both sides.
Impact on AI Training — Wikipedia as Foundational Dataset
Wikipedia's role as foundational training data for AI models makes this policy decision consequential beyond Wikipedia itself.
Every major commercial LLM — GPT-4 and its successors, Claude, Gemini, LLaMA variants — was trained on Wikipedia data. The encyclopedia's coverage of millions of topics in dozens of languages, with consistent citation standards and active editorial oversight, makes it irreplaceable as training material. There is no equivalent substitute at the same scale and quality.
If AI-generated content had continued to enter Wikipedia at the rate suggested by the WikiProject AI Cleanup workload, the contamination of this foundational dataset would have been gradual, distributed, and difficult to detect at the corpus level. A training run ingesting Wikipedia in 2027 would have been measurably worse than a training run in 2023 — but the degradation would have been hard to attribute to any specific cause.
The ban preserves Wikipedia's value as clean training data — which is, somewhat ironically, one of the most direct benefits AI companies receive from Wikipedia's decision to exclude AI-generated content. The community that banned AI is protecting the dataset that AI depends on.
This dynamic is not lost on the Wikipedia editor community. The policy does not prohibit AI companies from using Wikipedia as training data — it prohibits AI from contributing to Wikipedia. The asymmetry is intentional.
Precedents — Stack Overflow, Reddit, and Spanish Wikipedia
Wikipedia is the largest platform to implement this kind of explicit prohibition, but it is not the first.
Stack Overflow banned AI-generated coding answers shortly after ChatGPT's launch, after the platform was flooded with responses that appeared authoritative but contained subtly flawed logic: code that compiled but produced incorrect results, and explanations that sounded correct but described behavior the code did not actually exhibit. The Stack Overflow community, like Wikipedia's, depends on accuracy rather than mere plausibility. The distinction matters enormously when someone uses an answer to debug production code or update an encyclopedia entry.
Reddit has maintained more nuanced policies across its individual subreddit communities, with many high-traffic technical subreddits implementing their own AI content bans at the moderator level. The fragmented approach reflects Reddit's decentralized structure — but the pattern across communities is consistent.
Spanish Wikipedia represents the strictest implementation to date among Wikipedia's language editions — a total prohibition with zero exceptions. No copyediting assistance, no translation aid. The English edition's two exceptions make it marginally more permissive, but the structural commitment is identical: LLMs do not write Wikipedia articles.
The convergence of these independent decisions across platforms with very different structures, audiences, and governance models suggests something important: the communities closest to the problem — the people actually reviewing, editing, and maintaining the content — are reaching similar conclusions through independent processes.
Gadget Review noted the potential domino effect — the Wikipedia decision could "empower other online communities to establish AI content restrictions" by demonstrating that large-scale, high-traffic knowledge platforms can enforce meaningful AI content policies.
The Quality Argument — Human Editors vs. AI Accuracy
Defenders of LLM use in editorial contexts often cite speed and accessibility — AI can produce draft content that human editors then improve. The Wikipedia community's experience does not support this framing.
The specific failure modes documented by WikiProject AI Cleanup tell the actual story:
Phantom citations. AI-generated articles frequently include references to sources that do not exist — newspaper articles never written, academic papers never published, books with plausible titles attributed to real authors who wrote no such work. These fabrications are formatted correctly and placed appropriately within the text. A casual reviewer — or another AI model being trained on the content — cannot distinguish them from legitimate citations without manually verifying each one; part of that verification can be automated, as the sketch after this list shows.
Mass stub articles. LLMs produce vast quantities of short, thin articles that technically cover a topic without providing verifiable information. These stubs clutter the article namespace, pass initial review, and require individual human effort to identify and correct. At volume, they overwhelm the volunteer workforce.
Meaning drift in rewrites. As the policy itself states, LLMs "can go beyond what you ask of them and change the meaning of the text such that it is not supported by the sources cited." When an editor asks an LLM to improve the clarity of a paragraph, the LLM may alter the claims made — substituting accurate statements with plausible-sounding but inaccurate ones, changing hedged language to assertions, or replacing specific data with general approximations.
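One slice of the phantom-citation problem is machine-checkable, as noted above. The sketch below is an illustration, not a tool the community is known to use: it asks the public Crossref API whether a cited DOI is registered at all, since fabricated references often carry DOIs no registry knows. A passing check proves only that the work exists, not that it supports the claim it is attached to, which is why the policy still demands human verification.

```python
import urllib.error
import urllib.request

def doi_is_registered(doi: str) -> bool:
    """Spot-check a citation's DOI against Crossref. True means the DOI
    resolves to a registered work; it says nothing about whether that
    work actually supports the claim citing it."""
    url = f"https://api.crossref.org/works/{doi}"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # 404 and similar: no such DOI registered

# Hypothetical usage; substitute the DOI from the citation under review.
print(doi_is_registered("10.1234/example.doi"))
```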
The quality argument is not that human editors are infallible — Wikipedia's editorial community actively debates and corrects human errors constantly. The argument is that human errors are correctable through Wikipedia's existing social processes, while AI-generated errors arrive at machine speed, mimic legitimate content, and contaminate training data for future systems.
Conclusion — What This Signals for the Open Web
The 44-2 vote is not primarily about Wikipedia. It is a signal from the largest volunteer-maintained knowledge infrastructure on the internet — built over 25 years through millions of hours of human expertise — that the value of that infrastructure depends on keeping AI-generated content out of it.
Wikipedia's decision crystallizes a tension that will define the next phase of the internet's relationship with AI: the same models that depend on high-quality human-generated content to function are, when left ungoverned, the primary threat to the continued existence of that content.
The data poisoning feedback loop is real and documented. The phantom citations are real and documented. The editor burnout from AI-cleanup workloads is real and documented. The 44-2 vote is the formal policy response to those documented realities.
What the decision does not resolve is enforcement. AI detection tools remain unreliable. Administrators must rely on behavioral patterns and output quality rather than automated detection. A determined bad actor can still submit AI-generated content and remain difficult to identify. The policy sets a norm — it does not solve the technical detection problem.
But norm-setting matters. It changes the social contract around what Wikipedia is and who is responsible for maintaining it. It clarifies that Wikipedia's value is inseparable from the human editorial labor that created it. And it establishes a precedent that other major knowledge platforms — encyclopedias, academic repositories, technical documentation systems — can now reference when making their own policy decisions.
The open web's most valuable knowledge infrastructure just voted nearly unanimously to stay human. That vote will echo.
Sources: TechCrunch — Engadget — 404 Media — Implicator.ai — Gadget Review — MediaNama