A group of 34 newspaper publishers, spearheaded by WEHCO Newspapers Inc., filed a federal lawsuit in New York on June 24, 2026, accusing OpenAI and Microsoft of illegally using copyrighted news articles to train the Copilot artificial intelligence system. The complaint, lodged in the Southern District of New York, marks the largest joint action by local news organizations against the tech giants to date, and it ratchets up the legal pressure over how AI models are built from publicly available content.
WEHCO, which owns the Chattanooga Times Free Press and the Arkansas Democrat-Gazette, says the companies stripped articles of bylines, publication dates, and copyright notices—a direct violation of the Digital Millennium Copyright Act (DMCA)—and then reproduced substantial portions of those works through Copilot without permission or compensation. The suit seeks statutory damages, which could run into the millions of dollars, and a court order blocking further use of the protected material.
The Allegations at the Heart of the Case
The 68-page complaint details what the publishers call a “systematic, industrial-scale theft” of journalistic content. According to the filing, OpenAI’s GPT-series models and Microsoft’s Copilot assistant were trained on vast datasets that included articles scraped from the web, many of which were originally published by the plaintiffs. The publishers argue that by ingesting entire news stories and later spitting out paraphrased or verbatim answers, Copilot essentially acts as a replacement for the original sources, undercutting subscription and advertising revenue.
Central to the lawsuit is a DMCA claim for removal of copyright management information (CMI). The newspapers assert that when their articles were fed into training pipelines, any metadata identifying the author, title, or copyright holder was deliberately excised—a practice the DMCA forbids. The plaintiffs also bring direct copyright infringement claims, arguing that even intermediate copying during training constitutes an unauthorized reproduction.
“These companies didn’t just borrow our work,” a WEHCO representative said in a statement accompanying the filing. “They erased our identity and used our reporting to build a product that competes directly with us, all without a dime in compensation.” The suit highlights specific instances in which Copilot allegedly generated news summaries nearly identical to the original articles, with no hyperlink or attribution, effectively making the underlying journalism invisible to the user.
A Rapidly Growing Coalition
What began as a handful of individual lawsuits has mushroomed into a coordinated offensive. The 34 plaintiffs collectively represent small to mid-sized daily and weekly newspapers across more than a dozen states. Alongside WEHCO’s flagship titles, the coalition includes the Denton Record-Chronicle, The Joplin Globe, the Omaha World-Herald, and several other publications owned by various regional chains. Many of these outlets have already seen print and digital advertising revenues dwindle over the past decade, and they view unauthorized AI training as an existential threat.
Legal observers note the strategic choice to sue in New York, where the New York Times’ similar case against OpenAI and Microsoft is still working its way through the courts. Filing in the same district lets the smaller publishers piggyback on judicial reasoning that emerges from the higher-profile action.
Microsoft’s Copilot: A Windows Integration Complicates Matters
Unlike standalone AI chatbots, Microsoft’s Copilot is deeply woven into the Windows 11 operating system, the Edge browser, and the company’s Office suite. This integration makes the assistant a near-constant presence for hundreds of millions of users. When Copilot summarizes a news event, it often pulls from multiple sources, but the lawsuit argues that it presents the information as if it were original, effectively cannibalizing traffic that would otherwise flow to the publishers’ websites.
Microsoft has previously defended its AI training practices by invoking the fair use doctrine, arguing that machine learning on publicly accessible data is transformative and does not harm the market for the original works. The company also points to its support for the “robots.txt” protocol and other opt-out mechanisms, though the publishers counter that such measures are ineffective for content already ingested.
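For readers unfamiliar with the mechanism, the following is a minimal sketch of how a robots.txt opt-out is evaluated, using Python's standard `urllib.robotparser` module. The robots.txt contents, the `example.com` URL, the `ExampleBrowserBot` name, and the `may_crawl` helper are illustrative assumptions, not taken from the complaint; `GPTBot` is the user-agent name OpenAI publishes for its crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve (illustrative only):
# block OpenAI's crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/news/story.html"  # placeholder URL
    print("GPTBot allowed:", may_crawl(ROBOTS_TXT, "GPTBot", url))
    print("Other bot allowed:", may_crawl(ROBOTS_TXT, "ExampleBrowserBot", url))
```

A rule like this only tells a compliant crawler what it may fetch from now on. It has no effect on copies collected before the rule was published, which is the gap the publishers point to.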
OpenAI’s Stance and Industry-Wide Reckoning
OpenAI has faced similar accusations from major news organizations, including the New York Times and Reuters, as well as a class action brought by authors. In those cases, the company has maintained that training on internet data is a lawful, society-benefiting use that falls under fair use. OpenAI’s chief legal officer previously stated that “the AI models learn patterns, not facts, and the occasional regurgitation of training data is a bug we’re actively working to fix.”
The WEHCO suit, however, argues that the regurgitation is not a rare glitch but a fundamental feature of how the models operate when prompted to recall news. The complaint includes exhibits showing side-by-side comparisons of original articles and Copilot outputs, with alleged examples of verbatim passages exceeding 200 words.
The DMCA Wrinkle
The DMCA’s provisions against CMI removal have rarely been tested in the context of AI training. If the court finds that stripping metadata like bylines and copyright notices is a violation, the damages could be substantial: up to $25,000 per violation. For a corpus containing hundreds of thousands of articles, the cumulative exposure could run into the billions, providing an enormous incentive for a settlement or for Microsoft and OpenAI to adopt stricter licensing regimes.
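The arithmetic behind the "billions" figure can be sketched as follows. This is a rough upper-bound calculation, not a damages estimate: it assumes one violation per article and applies the $25,000 ceiling to every one. The article counts are hypothetical round numbers chosen to match "hundreds of thousands", and `max_exposure` is an illustrative helper.

```python
# Ceiling on statutory damages cited above, in US dollars.
MAX_PER_VIOLATION = 25_000


def max_exposure(articles: int, per_violation: int = MAX_PER_VIOLATION) -> int:
    """Upper bound on exposure, assuming one violation per article."""
    return articles * per_violation


if __name__ == "__main__":
    # Hypothetical corpus sizes; the complaint's actual count is not given here.
    for articles in (100_000, 300_000, 500_000):
        print(f"{articles:,} articles -> ${max_exposure(articles):,}")
```

Even at the low end of that range, 100,000 articles at the ceiling comes to $2.5 billion. A court is free to award far less per violation, so the real figure would depend on how violations are counted and where in the statutory range any award lands.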
Technology analyst Raj Patel said, “The DMCA claim is a creative legal theory that sidesteps the fair use debate entirely. Even if a court later decides training is fair use, the act of intentionally deleting copyright management information could still be illegal. It’s a potent weapon for the publishers.”
Local Journalism on the Brink
The lawsuit paints a grim picture of the state of local news. Many of the plaintiff newspapers have closed bureaus, laid off reporters, and reduced print frequency over the past decade, in part because of the migration of advertising to tech platforms. According to the complaint, the emergence of AI-generated news summaries threatens to accelerate that decline, as users increasingly rely on Copilot to answer “what happened today” without ever visiting a newspaper site.
“This isn’t just about money,” said media economist Dr. Lena Forsythe. “It’s about the public’s access to original, verified reporting. If these outlets can’t pay their journalists, the only thing left will be AI-generated content that has no direct connection to truth or accountability.”
Previous Legal Skirmishes
The current suit doesn’t exist in a vacuum. In late 2023, the New York Times sued OpenAI and Microsoft, alleging the companies had used millions of its articles without permission. That case survived early motions to dismiss and is currently in discovery. The Authors Guild filed a class action representing thousands of writers, and Getty Images has sued Stability AI over image training. Courts have so far been reluctant to issue sweeping rulings, preferring to let cases develop factually.
A key turning point came in February 2026, when a federal judge in California ruled that the scraping of publicly available news articles for AI training was not transformative per se and must be analyzed on a use-by-use basis. That ruling opened the door for more granular challenges like the one WEHCO and its peers are now pursuing.
Microsoft Windows and the Copilot Ecosystem
For Windows enthusiasts, the legal battle has a direct impact on how Copilot operates within the ecosystem. Currently, Windows 11 ships with Copilot pinned to the taskbar, offering instant access to news summaries, productivity assistance, and web search. If the publishers prevail, Microsoft might be forced to reconfigure Copilot to exclude certain sources, limit summarization capabilities, or implement a licensing scheme that could alter the free, ad-supported model users have come to expect.
Inside sources suggest Microsoft is already exploring content licensing agreements with larger publishers, but the fragmented nature of local news makes individual deals impractical. A court-ordered remedy could force the creation of a collective licensing body akin to what exists for music royalties.
What Comes Next
The case is expected to proceed through initial motions, with Microsoft and OpenAI likely to file a motion to dismiss before the end of summer 2026. Legal experts anticipate a vigorous fair-use defense, coupled with arguments that the DMCA’s CMI provisions were not intended to cover machine-learning processes. The court may also consider whether the plaintiffs have standing if the articles in question were not behind a paywall.
In the meantime, more newspaper chains are expected to join the litigation. The News/Media Alliance, formerly the Newspaper Association of America, has signaled its support, and several state press associations are discussing amicus briefs. A ruling in favor of the publishers could fundamentally alter the AI landscape, forcing tech companies to license training data or rely exclusively on public domain and synthetic content.
For users, the immediate effects are subtle but growing. Some Copilot queries now return responses that say “information unavailable” when asked about certain news events, a possible sign that Microsoft is proactively restricting the assistant’s dataset in response to legal risk. Whether that becomes the norm will depend largely on the outcome of this and similar cases.
As the legal drama unfolds, one thing is clear: the collision between AI technology and intellectual property law is no longer a hypothetical debate. It is a courtroom fight with billions of dollars and the future of journalism hanging in the balance.