Microsoft just turned the agent hype into a production-ready platform. At Build 2026, staged at the Moscone Center in San Francisco, the company announced a radical expansion of Microsoft Foundry, its enterprise AI orchestration suite. The update introduces hosted agent runtimes, reusable toolboxes, managed memory, Foundry IQ grounding, new MAI models, and baked-in observability and governance. The message was clear: autonomous AI agents are no longer a prototype. They're infrastructure.
Satya Nadella, CEO of Microsoft, framed the release as the logical conclusion of a two-year push to make copilots inherently autonomous. "With Foundry, we're giving every enterprise a factory for AI agents," he said during the keynote. "They run, scale, and govern themselves. You just define the outcomes."
The announcements directly address the hardest problems that have kept AI agents confined to hackathons and demos: running them reliably at scale, integrating them with real business tools, keeping them grounded in truth, and auditing every action they take.
Hosted agent runtimes bring always-on, auto-scaling agents
The most significant piece of new infrastructure is the hosted agent runtime. Previously, developers had to build and maintain their own execution environments for AI agents—managing containers, scaling APIs, handling retries, and dealing with the frequent incompatibilities between agent frameworks and cloud plumbing. Foundry now absorbs that complexity entirely.
A hosted agent runtime is a fully managed, serverless compute environment purpose-built for long-running, tool-calling, multi-step AI workflows. Developers author an agent definition in natural language, specify tools, and set goals. Foundry compiles that into an auto-scaling service that persists state, queues tasks, and recovers from failures transparently. Microsoft says a single runtime can host hundreds of simultaneous agent sessions, each maintaining a distinct memory context.
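No implementation details are given here, so the following is only a minimal Python sketch of the queue, retry, and persisted-state pattern the paragraph describes. Every name in it (`run_session`, `flaky_handler`, the in-memory `state` dict) is hypothetical and is not part of any Foundry API; a real runtime would persist state durably rather than in a dictionary.

```python
from collections import deque
from typing import Callable


def run_session(tasks: list[str],
                handler: Callable[[str], str],
                max_attempts: int = 3) -> dict:
    """Drain a task queue, retrying failed tasks and recording results as it goes."""
    queue = deque((task, 1) for task in tasks)   # (task, attempt number)
    state = {"done": {}, "failed": []}           # stand-in for persisted session state
    while queue:
        task, attempt = queue.popleft()
        try:
            state["done"][task] = handler(task)
        except Exception:
            if attempt < max_attempts:
                queue.append((task, attempt + 1))  # requeue for another attempt
            else:
                state["failed"].append(task)       # give up after max_attempts
    return state


# Usage: a handler that fails once for one task, then succeeds on retry.
calls = {"n": 0}


def flaky_handler(task: str) -> str:
    if task == "lookup":
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("transient failure")
    return f"{task}: ok"


result = run_session(["lookup", "summarize"], flaky_handler)
```

The caller only supplies tasks and a handler; requeueing and failure bookkeeping stay inside the loop, which is the division of labor the hosted runtime is said to provide.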
The runtime supports multi-agent orchestration natively. Teams of agents can spawn sub-agents, hand off tasks, and negotiate outcomes, all within a secured, isolated sandbox. Billing follows a consumption model based on inference time and tool invocations, so cost scales with actual usage. For enterprises already running thousands of service hours on cognitive tasks, this addresses the "operationalization gap" that killed most pilot projects.
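To make the consumption model concrete, here is a small arithmetic sketch in Python. The formula (time charge plus per-call charge) is an assumption about how such billing could be structured, and the rates are invented for illustration; no actual pricing is stated in the announcement.

```python
def estimate_cost(inference_seconds: float, tool_calls: int,
                  rate_per_second: float, rate_per_call: float) -> float:
    """Consumption cost = inference time * time rate + tool invocations * call rate."""
    if inference_seconds < 0 or tool_calls < 0:
        raise ValueError("usage figures must be non-negative")
    return inference_seconds * rate_per_second + tool_calls * rate_per_call


# Hypothetical rates, invented for this example only.
cost = estimate_cost(inference_seconds=1200, tool_calls=300,
                     rate_per_second=0.002, rate_per_call=0.01)
# 1200 s of inference and 300 tool calls come to roughly 5.4 currency units here.
```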
Reusable toolboxes turn any API into a one-click agent action
No agent works without tools, and wiring tools into agents has been a manual, brittle process. Foundry's reusable toolboxes change that. Think of them as pre-validated, security-hardened, plug-and-play connectors for virtually any enterprise system.
Covering over 200 SaaS platforms, databases, and internal APIs, the toolboxes are built on a common abstraction layer that converts API contracts into agent-friendly function signatures. An agent can be instructed to "check inventory in SAP, create a ticket in ServiceNow, and email the customer," and Foundry resolves the correct endpoints, authentication, parameter mapping, and error handling automatically.
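The idea of turning an API contract into a function signature can be sketched as follows. This is an illustration under stated assumptions, not Foundry's abstraction layer: the input is a simplified, OpenAPI-like operation description, the output follows the JSON-Schema style commonly used for function calling, and `to_function_signature` and the `check_inventory` endpoint are hypothetical.

```python
def to_function_signature(operation: dict) -> dict:
    """Map a simplified API operation description to a function-calling schema."""
    properties = {}
    required = []
    for param in operation.get("parameters", []):
        properties[param["name"]] = {
            "type": param.get("type", "string"),
            "description": param.get("description", ""),
        }
        if param.get("required", False):
            required.append(param["name"])
    return {
        "name": operation["operationId"],
        "description": operation.get("summary", ""),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


# Hypothetical inventory endpoint, used only to exercise the mapping.
inventory_op = {
    "operationId": "check_inventory",
    "summary": "Return stock level for a SKU",
    "parameters": [
        {"name": "sku", "type": "string", "required": True, "description": "Stock keeping unit"},
        {"name": "warehouse", "type": "string", "description": "Optional warehouse code"},
    ],
}
signature = to_function_signature(inventory_op)
```

Authentication, endpoint resolution, and error handling, which the paragraph also attributes to the toolbox layer, are deliberately left out of this sketch.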
Crucially, toolboxes are reusable across agents. A team that validates a Salesforce connector once can publish it to an organizational toolbox catalog. DLP policies, role-based access controls, and rate-limit thresholds attach at the toolbox level, so governance is enforced before agents ever touch a production system. Microsoft also released a toolbox SDK that lets partners and internal teams create custom toolboxes without writing scaffolding code.
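Attaching governance at the toolbox level means the check runs before any call reaches the target system. A minimal Python sketch of that gate, combining a role check with a sliding-window rate limit, might look like this. `ToolboxPolicy`, the role names, and the 60-second window are all assumptions made for illustration; they are not drawn from the toolbox SDK.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolboxPolicy:
    allowed_roles: set
    max_calls_per_minute: int
    _calls: list = field(default_factory=list)  # timestamps of recent allowed calls

    def authorize(self, role: str, now: Optional[float] = None) -> bool:
        """Allow a call only if the role is permitted and the rate limit has room."""
        now = time.monotonic() if now is None else now
        if role not in self.allowed_roles:
            return False
        # Keep only calls made within the last 60 seconds (sliding window).
        self._calls = [t for t in self._calls if now - t < 60.0]
        if len(self._calls) >= self.max_calls_per_minute:
            return False
        self._calls.append(now)
        return True


policy = ToolboxPolicy(allowed_roles={"sales-agent"}, max_calls_per_minute=2)
decisions = [
    policy.authorize("sales-agent", now=0.0),    # allowed
    policy.authorize("sales-agent", now=1.0),    # allowed
    policy.authorize("sales-agent", now=2.0),    # blocked: rate limit reached
    policy.authorize("intern-agent", now=3.0),   # blocked: role not permitted
    policy.authorize("sales-agent", now=61.5),   # allowed: earlier calls aged out
]
```

Because the policy object belongs to the toolbox rather than to any one agent, every agent that reuses the connector inherits the same limits.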
Managed memory gives agents persistent context without vector databases
State management has been the Achilles' heel of LLM-based agents. Each turn resets, forcing developers to stuff entire conversation histories into context windows or wrangle vector databases just to recall that a meeting was postponed. Foundry's managed memory subsystem solves this with a graph-native, automated memory layer that tracks entities, relationships, and temporal events across agent sessions.
When an agent interacts with a user or another system, Foundry extracts structured facts (dates, decisions, open items, dependencies) and records them in a managed knowledge graph. Later interactions automatically include relevant remembered facts, without the developer embedding them in prompts. The memory engine uses retrieval-augmented graph traversal, so relevance degrades gracefully as context grows. Microsoft says the engine does not fabricate memories: if a fact is uncertain, the agent asks for clarification.
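A toy version of entity-based recall helps show what "structured facts in a graph" means in practice. The sketch below stores facts as timestamped subject-relation-object triples and returns the most recent ones that mention an entity. It is a one-hop lookup, far simpler than the retrieval-augmented graph traversal described above, and `Fact`, `MemoryGraph`, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str
    timestamp: int  # e.g. a session counter or epoch seconds


class MemoryGraph:
    def __init__(self) -> None:
        self._facts: list = []

    def record(self, fact: Fact) -> None:
        self._facts.append(fact)

    def recall(self, entity: str, limit: int = 3) -> list:
        """Return the most recent facts that mention the entity as subject or object."""
        related = [f for f in self._facts if entity in (f.subject, f.obj)]
        return sorted(related, key=lambda f: f.timestamp, reverse=True)[:limit]


memory = MemoryGraph()
memory.record(Fact("budget meeting", "scheduled_for", "Monday", timestamp=1))
memory.record(Fact("budget meeting", "postponed_to", "Thursday", timestamp=2))
memory.record(Fact("Dana", "owns", "budget meeting", timestamp=3))
memory.record(Fact("Dana", "owns", "hiring plan", timestamp=4))
recalled = memory.recall("budget meeting", limit=2)
```

Ordering by timestamp is what lets a later session see that the meeting was postponed rather than still scheduled for Monday.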
Memory is partitioned per workspace, tenant, and compliance boundary. Customers can set retention policies, control which memory layers agents can access, and audit exactly what an agent "remembered" during a transaction. This turns stateful agents from a science project into something an auditor can sign off on.
Foundry IQ grounding anchors agents in verified enterprise truth
Hallucination is the ultimate dealbreaker for enterprise AI. Foundry IQ grounding is Microsoft's most ambitious answer yet: a multi-layered grounding service that provides agents with a live, curated feed of enterprise facts, pulling from structured data, unstructured documents, real-time APIs, and authoritative knowledge bases.
It combines several techniques under one API. A retrieval-augmented generation (RAG) pipeline indexes a company's SharePoint, databases, and custom stores with fine-grained permissions. On top of that, a semantic knowledge graph models the conceptual relationships between data assets, so agents can reason about context, not just keyword match. Finally, a fact-checking module uses models trained on corporate ground-truth data to score every generated statement against verified sources before it's sent to a user or another system.
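The final step, scoring a statement against verified sources before releasing it, can be illustrated with a deliberately crude stand-in. The announcement describes trained fact-checking models; the sketch below swaps in simple word overlap purely to show the gating logic. `support_score`, `gate`, the 0.8 threshold, and the sample policy text are all assumptions.

```python
import re


def support_score(statement: str, sources: list) -> float:
    """Fraction of the statement's words found in the best-matching source (0.0 to 1.0)."""
    def words(text: str) -> set:
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    claim = words(statement)
    if not claim or not sources:
        return 0.0
    return max(len(claim & words(src)) / len(claim) for src in sources)


def gate(statement: str, sources: list, threshold: float = 0.8) -> bool:
    """Release a statement only if its support score meets the threshold."""
    return support_score(statement, sources) >= threshold


verified = ["Refunds are available within 30 days of purchase."]
supported = gate("Refunds are available within 30 days.", verified)
unsupported = gate("Refunds are available within 90 days for all members.", verified)
```

Word overlap cannot detect negation or paraphrase, which is why a production system would need a learned scorer; the point here is only that generation and release are separated by a check.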
The grounding service integrates with the managed memory layer, so factual memory and operational memory reinforce each other. A customer service agent won't just retrieve a policy document; it will know that the policy changed last Tuesday because the memory graph recorded the update event from the compliance team's email thread.
New MAI models suggest a deeper hardware-software co-design play
Microsoft quietly loaded a new model family into Foundry's model catalog: MAI-1, MAI-2, and MAI-Lite. While the company didn't share architectural details, it said the models are optimized for agentic workloads, especially tool-use, multi-step planning, and function calling. Given Microsoft's recent investment in custom silicon and partnerships with AMD and NVIDIA, the MAI designation—likely short for Microsoft AI—points to a vertically integrated stack.
Early benchmarks shown on stage put MAI-2 ahead of GPT-4o on several agent-specific evals, including SWE-bench Verified and a new agentic win-rate metric Microsoft developed. The model excels at parallel tool use, managing ten or more simultaneous API calls without drifting off-prompt. Microsoft plans to open-source some components of the MAI training framework under a permissive license, a move that could attract more agent builders to the Foundry ecosystem.
Observability and governance baked in, not bolted on
The final leg of the announcement was a unified observability and governance console. Every agent action—tool invocation, memory write, grounding call, sub-agent handoff—produces structured traces that flow into a centralized dashboard. Built on the same telemetry backbone as Azure Monitor, the console gives SREs and AI risk managers real-time visibility into agent throughput, token consumption, and anomalous behavior.
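As a rough picture of what a structured trace could contain, the following Python sketch emits one JSON record per agent action and aggregates token consumption per agent. The field names (`agent_id`, `action`, `tokens`, `details`) and the sample agents are assumptions; the actual trace schema is not described in the announcement.

```python
import json
from collections import defaultdict


def make_trace(agent_id: str, action: str, tokens: int, **details) -> str:
    """Serialize one agent action as a JSON trace record."""
    return json.dumps({"agent_id": agent_id, "action": action,
                       "tokens": tokens, "details": details}, sort_keys=True)


def tokens_by_agent(trace_lines: list) -> dict:
    """Sum token consumption per agent across a list of JSON trace records."""
    totals = defaultdict(int)
    for line in trace_lines:
        record = json.loads(line)
        totals[record["agent_id"]] += record["tokens"]
    return dict(totals)


traces = [
    make_trace("billing-agent", "tool_invocation", 120, tool="crm.lookup"),
    make_trace("billing-agent", "memory_write", 15, key="invoice_status"),
    make_trace("triage-agent", "grounding_call", 300, source="policy_docs"),
]
usage = tokens_by_agent(traces)
```

Keeping every action in one record shape is what makes dashboard-style rollups like `tokens_by_agent` a few lines of code.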
Policy authoring uses a declarative language that lets compliance officers define guardrails: "agents cannot send email to external domains without human approval," or "if an agent requests access to financial data, require MFA." These policies are enforced at the runtime level, so they can't be bypassed by prompt injection. Audit logs with chain-of-custody provide a cryptographic record of exactly what an agent did, why it did it, and which human approved any step that required it.
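The two guardrails quoted above, and the idea of a tamper-evident audit record, can be sketched in Python. The policy format is invented for this example and is not Foundry's policy language; likewise, the SHA-256 hash chain is one common way to build a cryptographic audit trail and is an assumption here, since the announcement does not say which mechanism is used.

```python
import hashlib
import json

# Hypothetical declarative rules: an action, a matching condition, and a required control.
POLICIES = [
    {"action": "send_email", "when": {"recipient_scope": "external"}, "require": "human_approval"},
    {"action": "read_data", "when": {"data_class": "financial"}, "require": "mfa"},
]


def required_controls(action: str, context: dict) -> list:
    """Return the controls an action must satisfy under the matching policies."""
    return [p["require"] for p in POLICIES
            if p["action"] == action
            and all(context.get(k) == v for k, v in p["when"].items())]


def _digest(prev_hash: str, event: dict) -> str:
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event whose hash also covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    log.append({"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)})


def verify_chain(log: list) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev_hash = "0" * 64
    for entry in log:
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
controls = required_controls("send_email", {"recipient_scope": "external"})
append_entry(audit_log, {"action": "send_email", "controls": controls, "approved_by": "j.doe"})
append_entry(audit_log, {"action": "read_data", "controls": [], "approved_by": None})
intact = verify_chain(audit_log)
audit_log[0]["event"]["approved_by"] = "someone.else"  # simulate tampering
tampered_ok = verify_chain(audit_log)
```

Because the rules are evaluated in code outside the model, nothing an attacker writes into a prompt changes which controls `required_controls` returns, which is the property the runtime-level enforcement claim depends on.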
What this means for Windows and enterprise developers
Though Foundry runs on Azure, the implications ripple directly to the Windows ecosystem. A new preview of the Foundry Agent SDK for .NET lets Windows developers embed agent runtimes directly into WinUI and WPF applications, hybridizing local model execution with cloud offload. That means a Windows desktop app can host a locally running MAI-Lite agent for latency-sensitive tasks while seamlessly escalating complex queries to cloud-hosted MAI-2.
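The local-first, cloud-escalation pattern can be shown in a few lines. This sketch is in Python rather than .NET, uses word count as a stand-in for query complexity, and calls placeholder functions instead of real models; `route`, `fake_local`, `fake_cloud`, and the 12-word cutoff are all hypothetical and unrelated to the Foundry Agent SDK.

```python
from typing import Callable, Tuple


def route(query: str,
          local_model: Callable[[str], str],
          cloud_model: Callable[[str], str],
          max_local_words: int = 12) -> Tuple[str, str]:
    """Send short queries to the local model; escalate longer ones to the cloud model."""
    if len(query.split()) <= max_local_words:
        return "local", local_model(query)
    return "cloud", cloud_model(query)


# Placeholder models so the example runs without any inference backend.
def fake_local(query: str) -> str:
    return f"[local] {query}"


def fake_cloud(query: str) -> str:
    return f"[cloud] {query}"


short_target, short_answer = route("Rename this file", fake_local, fake_cloud)
long_query = ("Compare the last four quarters of regional sales, explain the variance "
              "against forecast, and draft a summary for the finance team")
long_target, long_answer = route(long_query, fake_local, fake_cloud)
```

A real router would likely weigh latency budgets, privacy constraints, and model confidence rather than length alone.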
Microsoft also demonstrated agents built with the Foundry toolchain running inside Microsoft 365 apps—agents that take complex email threads, parse attached spreadsheets, cross-reference CRM data, and draft legally reviewed responses—all without leaving the Outlook canvas. For the millions of Windows-centric enterprises already entrenched in the Microsoft stack, this is the closest they'll get to a turnkey agent ecosystem.
The hardware angle is equally noteworthy. Microsoft signaled that future Windows AI PC specifications, likely branded as Copilot+ PCs, will include dedicated neural processing requirements optimized for local MAI-Lite inference. This brings Foundry's agent runtime directly to the edge, enabling offline-capable, privacy-preserving agents that can later sync with cloud-based memory and grounding services.
Competitive landscape: Microsoft builds the rails, others race
The announcements put Microsoft in a unique competitive position. Amazon Bedrock and Google Vertex AI offer agent-building frameworks, but neither has the end-to-end, hosted, governed execution environment that Foundry now provides. Salesforce recently launched Agentforce, but it remains tightly coupled to the Salesforce platform. OpenAI's own agent APIs are still in early access. Microsoft, by contrast, now ships a complete agent factory that spans the full lifecycle from tool authoring to deprecation.
Analysts on the ground in San Francisco noted that the governance layer could become the moat. "Enterprises won't put agents into production without bulletproof audit and policy enforcement," said one Gartner analyst during a hallway briefing. "If Microsoft gets that right, CIOs won't look anywhere else."
The agent era's infrastructure moment
Build 2026 will be remembered as the moment AI agents graduated from demo-ware to durable infrastructure. By absorbing the undifferentiated heavy lifting of runtime hosting, memory management, grounding, and governance, Foundry frees developers to think about what agents should do, not how to keep them running.
The road remains uncertain. Real-world agent reliability at scale is unproven, cost models for persistent agent sessions could surprise early adopters, and the governance tooling must withstand the creativity of adversarial prompts. But Microsoft's wager is clear: the next billion-dollar enterprise software cycle will be built not with code, but with composed agents. And Foundry just became the most credible factory floor for that work.