How Montréal Built a 24/7 Virtual Agent with 95% Accuracy Using Microsoft Copilot Studio

Montréal has turned a classic municipal pain point, citizens struggling to find timely information about services, schedules, and rules, into a 24/7 conversational service. The city deployed a virtual agent built with Microsoft Copilot Studio that answers natural-language questions directly, integrates with live backend systems, and reportedly achieves 95% accuracy in internal testing. In its first four months, the agent earned high satisfaction scores while handling over 85% of conversations with generative responses grounded in the city’s own website content.

This isn’t a flashy demo; it’s a production-level public service bot already serving Canada’s second-largest city, a dense bilingual metropolis of more than 1.7 million residents. The agent tackles everything from waste collection schedules and library hours to tax payments and permit procedures, reducing the load on human call centers and delivering faster, more precise answers than the previous system of links and separate web apps.

What Montréal built—and how it works

Montréal’s IT department embedded the assistant directly on its public website. Residents can type questions in plain French or English and receive answers drawn from more than 40,000 indexed pages of official city content. The agent doesn’t just return a list of links; it summarizes the relevant information with citations that let users verify the source on the original site. This shifts the citizen experience from hunting through menus or dialing 311 for routine inquiries to getting an immediate, conversational reply.

But the real power lies beneath the surface. The city connected the agent to two critical backend systems via existing APIs:

Waste collection: When a user asks about pickup schedules, the agent asks for an address and then returns a personalized calendar—no need to open a separate app and search again.
City facilities: Queries about libraries, pools, or community centers can include a location and time preference. The assistant returns the exact opening hours and address without forcing the user to scroll through a full schedule page.

This direct connectivity represents a leap from the old model, where a query would only serve a link to an external web app, forcing the citizen to run a second search. The agent now delivers the answer inline, instantly.
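The facilities lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the city's implementation: the `FACILITIES` table, the facility name, the address, and the hours are all hypothetical stand-ins for whatever the real facilities API returns.

```python
from datetime import time

# Hypothetical facility records; real data would come from the city's API.
FACILITIES = {
    "example library": {
        "address": "123 Example Street",
        "hours": {
            "mon": (time(10, 0), time(18, 0)),
            "sat": (time(10, 0), time(17, 0)),
        },
    },
}


def opening_answer(name: str, day: str, at: time) -> str:
    """Return an inline answer instead of a link to a schedule page."""
    facility = FACILITIES.get(name.lower())
    if facility is None:
        return "I could not find that facility."
    slot = facility["hours"].get(day.lower())
    if slot is None:
        return f"{name} ({facility['address']}) is closed on that day."
    opens, closes = slot
    status = "open" if opens <= at < closes else "closed"
    return (
        f"{name} ({facility['address']}) is {status} at {at:%H:%M}; "
        f"hours are {opens:%H:%M} to {closes:%H:%M}."
    )
```

Calling `opening_answer("Example Library", "mon", time(11, 0))` returns a single sentence containing the status, address, and hours, which is the inline behavior the article describes.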

Architecture and the hybrid response model

The assistant was designed entirely in-house using Microsoft Copilot Studio, a low-code platform that allowed city developers to build and iterate without outside consultants or new custom APIs. Mohamed Arhab, Solution Architect in the city’s IT department, emphasized the value of reusing existing system connectors: “With Copilot Studio, we didn’t have to develop any new custom APIs. This saved significant development time and resources.”

The architecture hinges on a hybrid response strategy that mixes three types of handling:

Curated, deterministic responses for frequently asked or sensitive topics (e.g., tax deadlines, emergency procedures).
Generative answers grounded in website content for open-ended informational queries.
API-driven responses when up-to-date, user-specific data is required (e.g., waste schedules based on address).

This hybrid pattern is a core capability of Copilot Studio and was a decisive factor in Montréal’s decision. “One key reason we chose Copilot Studio was the option to easily combine classic, pre-built responses of a chatbot with the AI-generated responses of an agent,” Arhab explained. “This hybrid option enabled us to achieve a higher level of accuracy than just using generative AI alone.”
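Conceptually, the three-way split works like a dispatcher that tries the safest handler first. The Python sketch below illustrates that idea only; Copilot Studio agents are authored in the platform's own low-code tooling, not in Python, and every name here (`CURATED`, `lookup_waste_schedule`, `generate_grounded_answer`, the keyword triggers) is a hypothetical placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    kind: str  # "curated", "api", or "generative"
    text: str


# Hypothetical curated answers for sensitive topics (placeholder text only).
CURATED = {
    "tax deadline": "(placeholder) curated text about tax deadlines",
    "emergency": "(placeholder) curated text about emergency procedures",
}


def lookup_waste_schedule(address: str) -> str:
    # Stand-in for a call to the existing waste collection API.
    return f"(placeholder) collection calendar for {address}"


def generate_grounded_answer(question: str) -> str:
    # Stand-in for a generative answer grounded in indexed website content.
    return f"(placeholder) cited summary for: {question}"


def route(question: str, address: Optional[str] = None) -> Reply:
    q = question.lower()
    # 1. Deterministic answers win for sensitive or frequent topics.
    for trigger, answer in CURATED.items():
        if trigger in q:
            return Reply("curated", answer)
    # 2. Live, user-specific data goes through the backend API.
    if "pickup" in q or "collection" in q:
        if address is None:
            return Reply("api", "What is your address?")
        return Reply("api", lookup_waste_schedule(address))
    # 3. Everything else falls back to a grounded generative answer.
    return Reply("generative", generate_grounded_answer(question))
```

Putting the curated check first means a generative answer can never override a reviewed response on a high-stakes topic, which is the accuracy argument Arhab makes.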

To further boost accuracy, the team added custom entities—like Canadian postal codes and a list of Montréal’s boroughs—to improve intent recognition and reduce misunderstandings. On the backend, the city initially used a customized Power BI dashboard for telemetry and plans to transition to Copilot Studio’s own analytics capabilities as they mature. A full security and governance review by the city’s cybersecurity group preceded the production rollout, a step that municipal IT leaders should consider mandatory.
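As an illustration of what such entities capture, the Python sketch below extracts a Canadian postal code and a borough name from free text. It is an assumption-laden stand-in, not how Copilot Studio defines entities: the regular expression follows the standard letter-digit-letter, digit-letter-digit format (the letters D, F, I, O, Q, and U are never used, and W and Z never appear in the first position), and `BOROUGHS` is a deliberately partial list.

```python
import re
from typing import Optional

# Canadian postal code: A1A 1A1, with an optional space or hyphen in the middle.
POSTAL_CODE = re.compile(
    r"\b([ABCEGHJ-NPRSTVXY]\d[ABCEGHJ-NPRSTV-Z])[ -]?(\d[ABCEGHJ-NPRSTV-Z]\d)\b",
    re.IGNORECASE,
)

# Partial, illustrative list; the real entity would hold every borough.
BOROUGHS = ("Ville-Marie", "Le Plateau-Mont-Royal", "Verdun", "Outremont", "Lachine")


def extract_postal_code(text: str) -> Optional[str]:
    """Return the first postal code in normalized 'A1A 1A1' form, if any."""
    match = POSTAL_CODE.search(text)
    if match is None:
        return None
    return f"{match.group(1)} {match.group(2)}".upper()


def extract_borough(text: str) -> Optional[str]:
    """Return the canonical borough name mentioned in the text, if any."""
    lowered = text.casefold()
    for borough in BOROUGHS:
        if borough.casefold() in lowered:
            return borough
    return None
```

Normalizing the extracted value (uppercase, single space) before passing it to a backend API avoids failed lookups caused by formatting differences in what users type.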

Impressive early metrics—with a grounded view

Montréal reports that internal testing delivered a 95% accuracy rate after customization. Over 85% of live conversations are handled by generative responses, with the remainder falling to deterministic topics or API calls. Customer satisfaction scores in the first four months were described as “extremely positive.” These numbers suggest a well-tuned system that could measurably ease pressure on human operators, especially during seasonal spikes in calls.

However, these figures come with an important caveat: they are based on the city’s own internal testing and early production data, not independent third-party validation. Municipal deployments, with their diverse user populations and unpredictable phrasing, can exhibit different accuracy characteristics over time, particularly for edge cases. The 95% figure is an encouraging signal, not a guarantee. Continuous monitoring, refinement, and real‑world validation are essential to sustain performance.
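One way to see why a single accuracy figure is a signal rather than a guarantee is to put a confidence interval around it. The Python sketch below computes a Wilson score interval for a proportion. The test-set size used in the example is an assumption for illustration only; the city has not published how many questions its internal test covered.

```python
import math
from typing import Tuple


def wilson_interval(correct: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95% confidence)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Hypothetical test set: 190 correct answers out of 200 questions (95%).
low, high = wilson_interval(190, 200)
```

With these assumed numbers the interval runs from roughly 91% to 97%, so the same measured 95% is compatible with a noticeably lower true accuracy. A larger test set narrows the interval, which is one argument for continuous evaluation on real traffic.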

Why Copilot Studio?

Montréal’s team points to three principal reasons for their platform choice:

Rapid development: Low-code authoring let the city build the agent internally, avoiding the cost and delay of external vendors.
Hybrid flexibility: Out-of-the-box support for grounding generative answers on web content while blending curated chatbot flows promised higher accuracy and predictability.
Easy API connectivity: Existing API tooling in the Power Platform meant no new custom integrations were needed for the waste and facilities systems.

Microsoft also provides a “Citizen Services” template designed specifically for public-sector scenarios. The template documents both capabilities and limitations, including explicit warnings that AI-generated content can still contain mistakes and must be governed. This transparency is critical: city IT teams adopting such tools cannot simply flip a switch and trust the output.

Strengths that other municipalities can replicate

For other local governments eyeing similar modernizations, Montréal’s deployment surfaces several repeatable strengths:

Speed to value: With low-code authoring and prebuilt templates, a municipal team can prototype and publish a useful agent in weeks, not months.
Hybrid accuracy: Combining deterministic answers for high-stakes topics with generative AI for broader queries reduces hallucination risk and boosts user trust.
Actionable answers: Direct API integrations turn the agent from a search overlay into an assistant that returns the schedule or opening hours itself, directly improving citizen experience.
Multichannel potential: Agents authored in Copilot Studio can later be extended to Microsoft Teams, internal portals, or other channels without a full rebuild.
Governance and telemetry: Built-in analytics (or adjacent Power BI dashboards) combined with Power Platform governance controls allow cities to set quotas, maintain audit trails, and enforce approval workflows. Montréal’s planned migration to Copilot Studio analytics is a practical example of maturing operational rigor.

The operational risks no city should ignore

Generative AI in public services carries real risks. Montréal’s experience and the accompanying Microsoft documentation highlight several pitfalls that demand careful planning:

Data exposure and permissions
Agents that connect to Dataverse tables or internal data stores can inadvertently expose fields unless access is scoped tightly. Microsoft warns that default table permissions may reveal more columns than intended. Municipalities must audit entity and field-level permissions to enforce least privilege.
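Least privilege can also be reinforced in any code that sits between a data store and the agent. The Python sketch below shows a column allowlist applied to a row before it is handed on. It is a defense-in-depth illustration, not a substitute for the platform's own table and column permissions, and the column names are hypothetical.

```python
from typing import Any, Dict, Iterable

# Hypothetical allowlist: only these columns may ever reach the agent.
ALLOWED_COLUMNS = frozenset({"facility_name", "address", "opening_hours"})


def project_allowed(
    row: Dict[str, Any], allowed: Iterable[str] = ALLOWED_COLUMNS
) -> Dict[str, Any]:
    """Drop every column that is not explicitly allowlisted."""
    allowed_set = set(allowed)
    return {key: value for key, value in row.items() if key in allowed_set}


# Example row with an internal field that must not be exposed.
row = {
    "facility_name": "Example Pool",
    "address": "123 Example Street",
    "opening_hours": "09:00 to 20:00",
    "staff_contact_email": "internal@example.org",
}
safe_row = project_allowed(row)
```

An allowlist fails closed: a column added to the table later stays hidden until someone deliberately approves it, whereas a blocklist would expose it by default.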

Authentication misconfigurations
When using token passthrough on private sites, incorrect Entra ID settings can cause misleading behavior—such as reporting a successful record creation when the underlying operation failed. End-to-end authentication flows must be validated during staged testing before production.

Hallucination and stale content
Even when grounded in website content, generative models can paraphrase incorrectly or omit critical details. The Citizen Services template explicitly states that AI-generated answers can contain mistakes. For tax penalties, emergency notices, or other high-impact information, cities should provide deterministic responses, direct links to authoritative documents, and explicit validation steps.

Privacy and document handling
If agents accept file uploads or use OCR, document extraction becomes a privacy concern—especially if citizens inadvertently upload personally identifiable information. Clear data retention, consent, and DLP policies must be in place, with privacy notices posted on the public site.

Cost, quotas, and scaling
Copilot Studio agents consume messages and are subject to metered usage. Sudden spikes (e.g., a snow emergency or a tax deadline) can exhaust quotas and generate unexpected bills. Administrators should forecast consumption, set alerts, and implement fallback messages or throttling strategies.
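The alert-and-fallback logic can be sketched as a small budget tracker. This Python example is illustrative only: actual message metering and capacity limits are managed by the platform, and the `MessageBudget` class, the limit, and the 80% alert threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class MessageBudget:
    limit: int                # messages allowed in the billing period (assumed)
    alert_ratio: float = 0.8  # warn administrators at 80% consumption (assumed)
    used: int = 0

    def record(self, count: int = 1) -> str:
        """Return "ok", "alert", or "fallback" for the next message(s)."""
        if self.used + count > self.limit:
            # Budget exhausted: do not consume, serve a static fallback message.
            return "fallback"
        self.used += count
        if self.used >= self.alert_ratio * self.limit:
            return "alert"
        return "ok"


budget = MessageBudget(limit=10)
states = [budget.record() for _ in range(12)]
```

In this run the first seven messages return "ok", the next three return "alert", and the last two return "fallback" without consuming any budget. A fallback in production might point users to 311 or to a static page rather than leaving them with an error.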

A practical checklist for municipal IT teams

Drawing from Montréal’s deployment and Microsoft’s guidance, other cities can use this structured approach:

Define scope and success metrics – Start with high-volume, low-risk services (waste schedules, facility hours). Set CSAT and containment targets, plus a ceiling on the acceptable rate of incorrect answers.
Curate knowledge sources – Identify authoritative website sections and add curated Q&A pairs for high-stakes topics.
Build hybrid flows – Use deterministic topics for legal, billing, and emergency info; generative flows for broad queries; API calls for live data.
Implement identity and least privilege – Validate authentication end-to-end and review Dataverse table/field permissions.
Test rigorously – Run a controlled pilot with representative citizen queries; measure accuracy, hallucination, and edge-case failures.
Set governance and telemetry – Configure analytics (Power BI or Copilot Studio’s built-in) and approval workflows for content and model changes.
Plan for scale and cost control – Set message quotas, alerts, and fallback modes; model seasonal demand surges.
Communicate transparently – Clearly label the agent as automated, provide fallback to human agents (311), and post privacy notices.

These steps reflect what Montréal operationalized: a deliberate balance of generative AI with curated determinism, custom entities for local context, and formal cybersecurity sign‑offs.

What Montréal’s experience suggests about broader adoption

Montréal’s rollout demonstrates that a large, complex city can deliver tangible improvements in citizen experience using modern low‑code tools and a pragmatic hybrid architecture. The combination of content grounding at scale, real‑time API integrations, and internal ownership creates a template for peer agencies that want to move beyond static FAQs.

At the same time, the early metrics—while promising—should be interpreted with practical caution. The 85% generative share and 95% accuracy rate are meaningful operational signals, but they come from a specific early production period and controlled testing. Long‑term performance will depend on user diversity, seasonal surges, model updates, and sustained governance. No municipal AI agent is a “set it and forget it” asset.

Montréal’s announced plans to expand the assistant—adding more services, migrating to Copilot Studio’s native analytics, and building an internal version for community communications agents—follow a sensible incremental model. Start with high‑volume, low‑risk services, learn operational patterns, then broaden scope while strengthening governance and monitoring.

The bottom line: promise balanced with prudence

Montréal’s virtual agent is not a finished product so much as a live experiment in modern service delivery. It already delivers direct answers and faster access, and the early numbers suggest it can relieve pressure on human call handlers. Its early success offers a blueprint for other municipal IT teams, but it also underscores the operational responsibilities that come with placing generative AI at the front door of public services. Governance, privacy, permission boundaries, and cost management must lead, not follow, innovation. For cities ready to pair low-code speed with disciplined oversight, Montréal’s experience demonstrates that fast wins and durable public value can go hand in hand.