Microsoft's Copilot Actions Almost Booked Me Dinner — But the Web's Defenses Stopped It Cold

Microsoft’s Copilot Actions nearly booked a dinner reservation for me, until it hit a phone verification wall. That’s the state of AI agents in mid-2025: impressively capable but still stymied by the web’s anti-bot defenses. In a hands-on test, Copilot navigated OpenTable, found a Japanese restaurant, selected a time for two people, and filled in the details, only to pause and ask for a phone number and SMS code. A second test, buying a book from Barnes & Noble, followed the same arc: smooth sailing until payment credentials were required.

These experiences come from a detailed review by PCMag, which put Copilot Actions through real-world tasks. The feature, part of Microsoft’s expanding AI toolkit, promises to let users offload web chores to an AI agent that can click, type, and navigate sites on their behalf. It’s the consumer-facing side of the agent hype that dominated Microsoft Build 2024. And while it delivers genuine automation, the cracks show just how far we are from a truly hands-off assistant.

What Are Copilot Actions?

Copilot Actions is Microsoft’s implementation of an AI agent: software that can perform multi-step tasks on the web instead of just answering questions. Microsoft positions it as a way to “book event tickets, grab dinner reservations or send a thoughtful gift,” with launch partners like OpenTable, Booking.com, Expedia, and others. In practice, though, it works on any public website that isn’t explicitly blocked for harmful content.

The feature lives inside the Copilot web interface (copilot.microsoft.com). Once signed in with a Microsoft account, you select the “Action” option from the prompt box. Free accounts get a limited number of sessions; Copilot Pro subscribers get more. In testing, PCMag was cut off after four free sessions, though Microsoft hasn’t published official quotas.

The Cloud Browser Under the Hood

Unlike local automation tools, Copilot Actions doesn’t run on your PC. Microsoft provisions a disposable virtual machine in the cloud, launches a browser there, and drives it programmatically. The AI “sees” the page by capturing screenshots and analyzing them to locate buttons, text fields, and other interactive elements. It then executes clicks and keypresses in that remote browser, while a split-screen view shows you the remote browser pane alongside a Copilot sidebar.

This architecture has immediate implications:
- Low local resource usage: The heavy lifting happens in Microsoft’s data centers, so your CPU and GPU stay cool. PCMag noted no significant spike in local resource usage.
- Sandboxed isolation: Because the agent runs in a cloud VM, it can’t directly read your local files, cookies, or device state—a deliberate privacy safeguard. But it also means the agent doesn’t know your saved passwords or location unless you provide them.
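
The screenshot-driven loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Microsoft’s implementation: the `Action` type, the `run_agent` function, and the `capture`, `decide`, and `execute` callbacks are all hypothetical names, and a scripted lookup table stands in for the vision model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str          # "click", "type", "ask_user", or "done" (hypothetical set)
    target: str = ""   # label of the on-screen element
    text: str = ""     # text to type, or a message for the user


def run_agent(
    capture: Callable[[], str],
    decide: Callable[[str, str], Action],
    execute: Callable[[Action], None],
    goal: str,
    max_steps: int = 20,
) -> Action:
    """Screenshot -> decide -> act, until the task finishes or needs a human."""
    for _ in range(max_steps):
        screenshot = capture()              # what the remote browser shows now
        action = decide(goal, screenshot)   # the model picks the next step
        if action.kind in ("done", "ask_user"):
            return action                   # finished, or hand control back
        execute(action)                     # click or keypress in the remote browser
    return Action(kind="ask_user", text="step budget exhausted")


# Toy stand-ins so the sketch runs without a real browser or model.
pages = iter(["search box", "results list", "reservation form", "sms prompt"])
script = {
    "search box": Action("type", "search", "Japanese restaurant"),
    "results list": Action("click", "first result"),
    "reservation form": Action("click", "8:00 PM"),
    "sms prompt": Action("ask_user", text="phone number and SMS code needed"),
}
performed = []
result = run_agent(
    capture=lambda: next(pages),
    decide=lambda goal, shot: script[shot],
    execute=performed.append,
    goal="dinner for two at 8 p.m.",
)
print(result.kind, len(performed))
```

In this toy run the agent performs three actions on its own and then returns an `ask_user` action at the SMS prompt, mirroring the OpenTable test described in this article.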

Hands-On: Dinner, Books, and the Unyielding Web

The PCMag review tested two classic e-commerce scenarios.

Dinner Reservation on OpenTable

The prompt: “Make a dinner reservation for two for 8 p.m. at a good Japanese restaurant nearby using OpenTable.” Copilot launched its cloud browser, searched OpenTable (via Bing, since the prompt specified the site), navigated the reservation widget, and successfully selected the details. Then it paused. The site demanded a phone number and an SMS verification code. Copilot can’t receive text messages, so the user had to enter both manually. The reservation was completed, but only after human intervention.

A curious quirk: the agent thought it was in Chicago, even though the user wasn’t. It’s likely the cloud VM’s geolocation overrode the user’s actual location, a disconnect Microsoft will need to fix.

Book Purchase on Barnes & Noble

Here, Copilot asked for a genre clarification (“literary”), then navigated search results, selected Chris Whitaker’s All the Colors of the Dark, and reached the checkout page. Again, it stopped when payment details and account login were needed. The user had to step in to complete the purchase.

These tests highlight a pattern: Copilot Actions excels at discovery and form-filling across diverse sites, but real-world obstacles such as CAPTCHAs, SMS multi-factor authentication (MFA), and sign-ins still force a human to step in.

Where Copilot Actions Shines

Despite the friction, the feature demonstrates tangible progress:

- Genuine web automation: Copilot handled heterogeneous page layouts by analyzing screenshots of each page, a non-trivial technical achievement.
- Low resource footprint: Because everything runs in the cloud, even low-powered laptops and tablets can use Actions without slowdown.
- Unified conversational flow: You can search, clarify intent, and trigger an action all within a single chat interface, a glimpse of a future where browsing is more about intent than clicking.

In community discussions on forums like Windows Forum, early adopters noted that Actions effectively reduces the tedium of complex searches and initial form-filling, even if it can’t finish the last mile.

The Friction Points That Keep It Earthbound

CAPTCHAs and MFA

Websites use CAPTCHAs and SMS/authenticator codes precisely to thwart bots. Copilot respects these guards by design, so any task requiring such verification will stop for human input. That’s a security feature, not a bug—but it limits how autonomous the agent can be.
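
One way to picture this stop-for-human rule is a small check that runs before each agent step. The sketch below is a deliberately naive illustration: the `Handoff` enum, the `check_handoff` function, and the marker strings are hypothetical, and a real agent would rely on far more robust signals than keyword matching.

```python
from enum import Enum


class Handoff(Enum):
    NONE = "agent may continue"
    HUMAN = "pause and ask the user"


# Hypothetical markers an agent might look for in the visible page text.
HUMAN_ONLY_MARKERS = ("captcha", "verification code", "sign in", "card number")


def check_handoff(page_text: str) -> Handoff:
    """Return HUMAN when the page shows a verification or credential prompt."""
    lowered = page_text.lower()
    if any(marker in lowered for marker in HUMAN_ONLY_MARKERS):
        return Handoff.HUMAN
    return Handoff.NONE


print(check_handoff("Enter the verification code we sent by SMS"))
print(check_handoff("Select a time for 2 people"))
```

The design choice worth noting is that the check fails toward the human: anything that looks like a verification or credential prompt pauses the task rather than letting the agent guess.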

Speed and Latency

In PCMag’s test, booking a reservation manually was faster than letting Copilot do it. The VM spin-up, remote page rendering, screenshot analysis, and deliberate pacing (to avoid mistakes) add overhead. Commenters in the forum thread confirmed this: “It feels slower than doing it yourself.” The advantage only materializes if the task runs asynchronously or if you’re delegating multiple steps you’d rather not think about.
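
The point about asynchronous delegation can be illustrated with a short sketch. Everything here is hypothetical: `run_agent_task` stands in for a slow remote agent session, `delegate_all` is an invented helper, and the delays are arbitrary placeholders rather than measured timings.

```python
import asyncio


async def run_agent_task(name: str, delay_s: float) -> str:
    """Hypothetical stand-in for a slow remote agent session."""
    await asyncio.sleep(delay_s)  # placeholder for VM spin-up and page analysis
    return f"{name}: ready for your review"


async def delegate_all(tasks: dict[str, float]) -> list[str]:
    """Start every task at once and collect results in submission order."""
    jobs = [asyncio.create_task(run_agent_task(n, d)) for n, d in tasks.items()]
    # The user is free to do other things here while the jobs run.
    return await asyncio.gather(*jobs)


results = asyncio.run(delegate_all({"reservation": 0.02, "book order": 0.01}))
print(results)
```

Because the tasks run concurrently, the total wait is roughly that of the slowest task rather than the sum of all of them, which is where a slow agent could still come out ahead of doing each chore by hand.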

Location Blindness

As seen with the Chicago mix-up, Copilot Actions doesn’t reliably inherit your device’s location. That’s a consequence of running in a cloud VM rather than locally. A hybrid approach—using local cookies or a consent-based location share—could solve this, but it also complicates the privacy model.
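
A consent-based location share could follow a simple precedence rule, sketched below. The `Location` type and `resolve_location` function are hypothetical, as is the example user city; nothing here reflects how Microsoft actually handles location.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Location:
    city: str
    source: str  # "user" (explicitly shared) or "vm" (cloud session default)


def resolve_location(vm_city: str, user_city: Optional[str], consent: bool) -> Location:
    """Prefer the user's shared city, but only when consent was given."""
    if consent and user_city:
        return Location(city=user_city, source="user")
    return Location(city=vm_city, source="vm")


# With consent, the shared city wins; without it, the VM's guess is used.
print(resolve_location("Chicago", "Seattle", consent=True))
print(resolve_location("Chicago", "Seattle", consent=False))
```

Recording the `source` alongside the city lets the agent tell the user when it is falling back to the cloud VM’s location, which would have made the Chicago mix-up visible up front.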

Session Limits and Paywalls

Free accounts get a handful of sessions before hitting an undisclosed rate limit. Copilot Pro ($20/month) raises that cap, but the exact numbers remain opaque. Users planning to rely on Actions for daily chores may need a subscription, and even then, the tool isn’t yet a time-saver.

Privacy: The Screenshot Question

Copilot Actions sandboxes its browser in the cloud, so local data stays safe by default. But the agent captures screenshots of the pages it visits to analyze the layout and decide where to click. Those screenshots are processed in Microsoft’s cloud, and the forum analysis rightly flags this as a privacy concern. Microsoft states that sessions are ephemeral, but it has not published a clear data-retention policy for those screenshots—whether they are stored, used for training, or deleted immediately.

For now, sensitive information (credit card numbers, passwords) shouldn’t be typed into remote browser windows until Microsoft provides transparent telemetry controls. The PCMag piece notes that privacy “isn’t much of an issue (yet)” precisely because the agent can’t log into accounts or store credentials—it always asks you to type them. If Microsoft later adds a credential vault to enable full automation, the privacy calculus changes dramatically. Forum posters urge Microsoft to offer an optional secure vault with customer-controllable retention and auditing.

Business Model and Regional Restrictions

Copilot Actions is part of Microsoft’s broader AI monetization play. The free tier is essentially a trial, nudging users toward Copilot Pro for more sessions and capabilities. Enterprise customers get Microsoft 365 Copilot with governance tools, but the consumer version remains deliberately limited.

Notably, Actions is unavailable in the European Union due to stricter privacy regulations. That regional lockout reflects the tension between automated data processing and GDPR compliance. Users in other regulated markets should check availability before depending on the feature.

What the Community Says: A Practical Checklist

Based on PCMag’s tests and forum discussions, a practical set of guidelines emerges for anyone trying Copilot Actions today:

- Use it for low-risk, repetitive tasks: Searching for products, pre-filling forms, or exploring options across multiple sites.
- Avoid financial transactions that require saving credentials until Microsoft delivers an auditable, secure credential vault.
- Expect to intervene for OTPs, CAPTCHAs, sign-ins, and payment entry.
- Confirm regional availability: if you’re in the EU or a similarly regulated region, Actions won’t appear.
- Monitor Microsoft’s retention policies as they evolve; don’t enter sensitive info into the remote browser without clarity.

These recommendations align with the “treat it as a smart assistant, not a life manager” ethos shared across early testers.

The Road Ahead: What Microsoft Needs to Do

For Copilot Actions to move from intriguing demo to dependable tool, Microsoft must address several critical gaps:

- Publish a data-retention whitepaper: Users need to know exactly how long screenshots and session logs are kept, who can access them, and whether they’re used for AI training.
- Offer a secure credential vault: An opt-in system with transparent auditing, customer-controlled retention, and potentially local-only storage for the most sensitive entries.
- Reduce latency and add asynchronous modes: Faster VM warm-up, incremental rendering, and a background mode that notifies you when a task completes would make Actions more practical.
- Broker industry standards for agent access: Work with major sites to create authenticated agent APIs, meaning verifiable, auditable pathways that let trusted AI assistants bypass anti-bot measures without compromising security.
- Fix location handling: Allow users to optionally share their actual location with the cloud session, or let the agent use locally stored preferences.
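
To make the “authenticated agent API” idea concrete, here is one possible shape for it: the agent signs each request with a shared secret, and the site verifies the signature before skipping its anti-bot checks. No such standard exists today as far as this article knows; `sign_request`, `verify_request`, the field names, and the secret are all hypothetical, and a real scheme would also need key distribution and replay protection.

```python
import hashlib
import hmac
import json


def sign_request(secret: bytes, agent_id: str, payload: dict) -> dict:
    """Agent side: serialize the payload and attach an HMAC-SHA256 signature."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    message = f"{agent_id}\n{body}".encode("utf-8")
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return {"agent_id": agent_id, "body": body, "signature": signature}


def verify_request(secret: bytes, request: dict) -> bool:
    """Site side: recompute the signature and compare in constant time."""
    message = f"{request['agent_id']}\n{request['body']}".encode("utf-8")
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, request["signature"])


secret = b"demo-shared-secret"  # placeholder; never hard-code real keys
request = sign_request(secret, "example-agent", {"party": 2, "time": "20:00"})
print(verify_request(secret, request))

tampered = dict(request, body=request["body"].replace("2", "8", 1))
print(verify_request(secret, tampered))
```

The untouched request verifies, while the tampered copy does not, which is the auditable property the bullet above asks for: a site can tell which agent sent a request and that it was not altered in transit.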

A 6–18-Month Outlook

The agent model isn’t exclusive to Microsoft. Google’s Project Mariner, Perplexity’s Comet browser, and efforts from a number of startups are racing toward similar goals. Over the next year and a half, expect incremental improvements: better credential handling for enterprise accounts, performance boosts, and possibly a limited set of partner sites that allow deeper agent integration. But a fully unattended agent that can book flights, pay bills, and manage a calendar while you sleep? That requires not just technical fixes but trust, and trust takes time.

For now, Copilot Actions is a credible first step. It automates the drudgery of web navigation better than any previous consumer tool, and its cloud-based approach avoids the local resource drain that plagues on-device AI. Yet the web’s immune system—CAPTCHAs, MFA, location checks—remains formidable, and Microsoft’s own privacy disclosures need to catch up with the technology.

Conclusion: Not Your Digital Butler Yet

Copilot Actions works. It finds restaurants, fills forms, and marches through checkout processes with a mix of AI sight and scripted logic. But it can’t finish the job in most real-world scenarios without a human stepping in. That’s partly by design: security measures that protect users from automated attacks also shield them from well-intentioned bots.

The feature is best understood as an early preview of agentic browsing—a concept that will likely become mainstream within a few years. For Windows enthusiasts and curious tinkerers, it’s worth trying with low-stakes tasks to get a feel for the future. But don’t hand over your passwords or credit card just yet. As one forum poster put it, “Copilot Actions is an impressive milestone, but it’s more of a guided demo than a life manager.” Until Microsoft solves the privacy, speed, and authentication puzzles, that assessment will hold.