Copilot Overconfidence Costs Trust: Preston Gralla Switches to Gemini

Preston Gralla, a Computerworld columnist, has ditched Microsoft Copilot for Google Gemini after a frustrating troubleshooting session revealed Copilot's overconfidence and inaccuracy. The incident highlights growing concerns about AI hallucination and trust in digital assistants, particularly for Windows users relying on Copilot for system support. As AI integration deepens in Windows, Microsoft faces mounting pressure to ensure reliability or risk losing users to more accurate alternatives.

Preston Gralla, a veteran technology journalist and longtime Computerworld columnist, announced in June 2026 that he is abandoning Microsoft Copilot in favor of Google Gemini. The decision came after a maddening experience trying to troubleshoot an iPhone texting problem, where Copilot’s assistance proved both dead wrong and disastrously confident. For Gralla, the incident was the final straw in a long-running frustration with Microsoft’s AI assistant, exposing what he calls a fundamental flaw: Copilot’s tendency to deliver incorrect information with unearned authority.

Gralla’s defection might seem like just one power user’s preference, but it underscores a growing trust crisis around AI tools that millions of Windows users rely on daily. Microsoft has aggressively integrated Copilot into Windows 11, Edge, and the Microsoft 365 suite, positioning it as an indispensable digital companion. Yet, as Gralla’s experience illustrates, overconfidence in a faulty answer can be more damaging than no answer at all—especially when users act on that advice without verification.

The incident itself, described by Gralla in his Computerworld column, involved an iPhone that had stopped sending text messages to Android devices. Turning to Copilot for a quick fix, Gralla received a detailed, step-by-step solution that he later described as “plausible but completely incorrect.” Copilot’s response included specific menu paths and settings to adjust, all presented with the crisp certainty of an expert technician. Gralla followed the instructions, only to find the problem persisted. Further investigation revealed that the recommended settings didn’t exist on his version of iOS, and the underlying diagnosis was flawed.

What rankled most was Copilot’s tone. “It wasn’t just wrong; it was confidently wrong,” Gralla wrote. “There was no hedging, no suggestion that I might double-check, just a direct, authoritative set of instructions that wasted my time and could have caused real problems.” For a journalist who covers technology for a living, the experience was a wake-up call. If an expert can be misled, how many ordinary users are following similarly bad advice without realizing it?

The Overconfidence Problem in AI Assistants

This is not an isolated incident. AI hallucination—where models generate plausible but false information—remains a well-documented limitation of large language models (LLMs) like those powering Copilot and Gemini. However, the problem is exacerbated when the AI is presented as an all-knowing assistant integrated at the operating system level. Microsoft has boasted that Copilot can adjust settings, troubleshoot, and even offer emotional support, raising the stakes for accuracy.

Copilot’s failure in Gralla’s case points to a deeper issue: contextual understanding. Troubleshooting a cross-platform messaging bug requires real-time knowledge of both iOS and Android, which can change rapidly with updates. Copilot, like many LLMs, has a knowledge cutoff and may not fully grasp the current state of software versions. Moreover, it lacks the ability to test solutions or observe results, so it cannot course-correct. Instead, it stitches together patterns from its training data, sometimes producing a Frankenstein’s monster of a solution that sounds right but falls apart in practice.

Google Gemini, by contrast, benefits from Google’s vast real-time data and search integration. Gralla noted that when he posed the same problem to Gemini, it provided a more cautious, context-aware answer that acknowledged the possible version variations and suggested multiple pathways to a fix. That humility, he argued, is precisely what Copilot lacks. “Gemini admitted it might not have the exact answer and offered alternatives. That honesty builds trust,” he said.

Why Windows Users Should Care

For Windows users, the implications are significant. Copilot is not just a web app; it’s woven into the fabric of Windows 11. It can change system settings, manage files, and even interact with other applications. Microsoft’s vision of a “Copilot+ PC” suggests an even deeper integration, where the AI becomes the primary interface for many tasks. If users cannot trust the assistant to give accurate technical support, they may hesitate to let it control critical system functions.

The trust deficit could bleed into other areas. Microsoft is banking on Copilot to drive adoption of its cloud services, from coding in GitHub Copilot to data analysis in Excel. Enterprise customers, in particular, demand reliability. A Salesforce survey found that 73% of IT leaders worry about AI delivering inaccurate or biased results. Gralla’s experience, amplified by his platform at Computerworld, could give those concerns a human face.

Some might argue that users should always verify AI-generated advice, but that’s impractical when the assistant is marketed as a time-saving, authoritative tool. Microsoft itself has promoted Copilot as a way to “get things done faster” and “find the answers you need.” Subtly shifting blame to the user for not fact-checking would undermine that value proposition.

Copilot’s Technical Shortcomings

Microsoft has acknowledged the hallucination problem. In a blog post earlier this year, the company outlined its efforts to improve Copilot’s accuracy through grounding techniques, retrieval-augmented generation (RAG), and fine-tuning on curated knowledge bases. Yet Gralla’s case suggests those measures are insufficient when facing the messy reality of consumer tech support. A query about iPhone-to-Android texting might pull from a jumble of outdated forum posts, Apple documentation, and carrier-specific guides, with no way to prioritize or validate the information.
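To make the prioritization gap concrete, here is a minimal Python sketch of how a retrieval step could rank passages by source authority and recency before they reach the model. It is an illustration under stated assumptions, not a description of how Copilot or any Microsoft system works: the `Passage` type, the `AUTHORITY` weights, the source labels, and the one-year freshness cutoff are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Passage:
    text: str
    source_type: str  # hypothetical labels: "vendor_docs", "carrier_guide", "forum"
    published: date


# Assumed authority weights; a real system would need to tune or learn these.
AUTHORITY = {"vendor_docs": 3, "carrier_guide": 2, "forum": 1}


def rank(passages, today, max_age_days=365):
    """Drop stale passages, then sort by authority and, within a tier, by recency."""
    fresh = [p for p in passages if (today - p.published).days <= max_age_days]
    return sorted(
        fresh,
        key=lambda p: (AUTHORITY.get(p.source_type, 0), p.published),
        reverse=True,
    )


passages = [
    Passage("Try toggling a setting.", "forum", date(2026, 5, 1)),
    Passage("Current vendor guidance.", "vendor_docs", date(2026, 1, 10)),
    Passage("Old vendor guidance.", "vendor_docs", date(2023, 1, 1)),
    Passage("Carrier messaging notes.", "carrier_guide", date(2026, 3, 1)),
]
ranked = rank(passages, today=date(2026, 6, 1))
```

In this sketch the three-year-old vendor page is filtered out and the recent forum post ranks last. Filtering and ordering of this kind only narrows what the model sees; it does not validate that the surviving passages are correct.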

Google, meanwhile, has its own accuracy challenges. Its AI products have made embarrassing public mistakes, such as the AI-generated search result in 2024 that recommended putting glue on pizza. But Gralla’s switch highlights that, in a head-to-head comparison, Gemini handled uncertainty more gracefully. Perhaps Google’s long experience with search and knowledge graphs gives it an edge in contextualizing ambiguous queries.

Other competitors are not standing still. ChatGPT, powered by OpenAI’s continually updated models, offers web browsing and plug-ins that can access live data. Apple’s forthcoming AI features, rumored to be tightly integrated with Siri and on-device processing, promise privacy-focused assistance, although running a model on the device does not by itself prevent hallucinations. In this landscape, Copilot’s misstep could accelerate a diversification of AI assistants among even dedicated Windows users.

Community Reaction and Historical Echoes

Gralla’s column has sparked lively debate on Windows forums and social media. Many users shared similar stories of Copilot’s confidently wrong answers, from incorrect troubleshooting steps for printer issues to invented command-line switches that break PowerShell scripts. Some defended the tool, noting that it works well for common, well-documented tasks like summarization or boilerplate code generation. But the consensus is clear: for technical support requiring current, nuanced knowledge, Copilot is a gamble.

This pattern mirrors early criticism of Microsoft’s Clippy and other animated assistants, which offered unhelpful advice with relentless cheer. History seems to be repeating itself, albeit with a far more capable AI. The lesson from Clippy was that an assistant that overpromises and underdelivers becomes a target of ridicule. With Copilot, the stakes are higher because the potential for harm is greater. Following bad advice to delete a registry key or misconfigure a network setting could brick a system or expose it to security risks.

What Microsoft Must Do Next

First, Microsoft must build more robust guardrails that detect when Copilot enters low-confidence territory and clearly communicate uncertainty. If an answer relies on unverified sources or could be outdated, the AI should flag that prominently. This is a user experience challenge: striking a balance between helpful decisiveness and responsible caution.
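The flagging behavior described above could look roughly like the following Python sketch. It assumes the assistant has access to a calibrated confidence score and to flags for source verification and version sensitivity, which is a significant assumption since language models do not reliably expose such signals. The `DraftAnswer` type, the `present` function, and the 0.8 threshold are hypothetical and do not correspond to any Copilot API.

```python
from dataclasses import dataclass


@dataclass
class DraftAnswer:
    text: str
    confidence: float       # hypothetical calibrated score in [0, 1]
    sources_verified: bool  # True if every source comes from a vetted list
    may_be_outdated: bool   # True if the answer depends on version-specific details


def present(draft: DraftAnswer, threshold: float = 0.8) -> str:
    """Prepend plain-language warnings when any uncertainty signal is raised."""
    warnings = []
    if draft.confidence < threshold:
        warnings.append("I am not certain about this answer.")
    if not draft.sources_verified:
        warnings.append("Some of this comes from sources I could not verify.")
    if draft.may_be_outdated:
        warnings.append("Menu names and settings may differ on your software version.")
    if not warnings:
        return draft.text
    return "Caution: " + " ".join(warnings) + "\n\n" + draft.text


confident = present(DraftAnswer("Open Settings.", 0.95, True, False))
shaky = present(DraftAnswer("Open Settings.", 0.40, False, True))
```

Putting the warning before the instructions rather than after them is a deliberate choice in this sketch: a user who starts following steps immediately will still see the caveat first.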

Second, integration with authoritative support databases could reduce hallucination. Microsoft owns vast repositories of documentation for Windows, Office, and Azure. Tapping those as verified sources—perhaps with a “Microsoft Verified” badge on answers—would give users a baseline of trust. Google’s Gemini already does something similar by linking to sources and showing confidence levels.

Third, a feedback loop that learns from corrections is essential. Gralla could have told Copilot it was wrong, but there was no mechanism for him to easily report the error and have it corrected in real time. Competitors like Perplexity AI emphasize user feedback as a core part of their refinement process.

The Bigger Picture: AI Trust and the Windows Platform

Looking ahead, the AI assistant wars are only beginning. Microsoft’s Copilot enjoys a massive distribution advantage through Windows, but distribution is not destiny. If users lose faith, they will seek alternatives—even on Windows. The Gralla incident may be a portent of a future where users deliberately bypass the built-in assistant in favor of third-party tools they find more reliable. Already, browser extensions and standalone apps for ChatGPT and Gemini are popular among power users.

The consumerization of IT means that user sentiment, not corporate mandate, often drives technology choices. If Windows users start swapping Copilot for Gemini en masse, Microsoft could see its AI ambitions undermined within its own ecosystem. Satya Nadella’s vision of an “AI-first” Windows relies on Copilot being not just present, but trusted.

Preston Gralla’s switch is one data point, but it resonates because it comes from a credible, technically savvy source. When an expert says, “I can’t trust this tool anymore,” the ripple effects can be significant. It doesn’t mean Copilot is doomed, but it should set off alarm bells in Redmond.

As Windows AI features continue to roll out—with rumored updates bringing deeper Copilot integration into File Explorer, Settings, and even the lock screen—Microsoft must prove that its assistant is more than a chatty interface. It must demonstrate that trust is the product, and overconfidence is a bug that must be squashed with extreme prejudice.

For now, Gralla’s recommendation to his readers is clear: “Gemini gave me a humble, useful answer. Copilot gave me a confident, useless one. I know which I’ll choose.” Windows enthusiasts watching this space would do well to test the assistants side-by-side before ceding control to either.
