Microsoft and Nvidia today at Build 2026 unveiled a sweeping partnership that promises to unify agentic AI development and deployment across the entire computing spectrum—from Windows AI PCs and deskside Nvidia DGX systems to the massive scale of Azure cloud infrastructure. The new stack, spanning Microsoft Foundry and Nvidia’s modular agent framework, aims to let developers build autonomous AI agents once and deploy them anywhere, with consistent tooling and performance.
Microsoft CEO Satya Nadella took the stage to demonstrate an AI agent that began its lifecycle on a Copilot+ PC, used local NPU acceleration and a connected DGX Station for heavy reasoning, and then scaled out to Azure for enterprise-wide rollout, all managed through a single Foundry dashboard. “This is the year the agent becomes a first-class workload,” Nadella said. “From the edge to the cloud, we’re giving developers one platform, one set of APIs, one trust boundary.”
A Stack Without Gaps
The centerpiece is Microsoft Foundry, a rebranded and expanded evolution of Azure AI Studio, now natively integrating Nvidia’s AI Enterprise suite, NIM microservices, and the newly announced Nvidia AgentKit. Together, they form what the companies call the “unified agentic fabric.”
Under the hood, the stack addresses three persistent pain points for enterprise AI: heterogeneous hardware, fragmented toolchains, and trust. Windows AI PCs, which Microsoft first shipped in 2024 with dedicated NPUs, now account for over 40% of new enterprise laptop sales. Those devices can run compact agent models locally, slashing latency and keeping sensitive data on device. When a task demands heavier computation—say, a multi-step supply chain analysis involving thousands of variables—the agent can seamlessly offload to a nearby DGX Station, a deskside AI supercomputer that packs four Blackwell GPUs and up to 576 GB of unified memory. And when the workload needs hyper-scale inference or access to cloud data lakes, it can pivot to Azure’s AI-optimized instances, including those powered by Nvidia’s latest H200 and B200 GPUs.
“CIOs want the speed of local AI and the muscle of cloud AI without having to rewrite everything,” said Jensen Huang, founder and CEO of Nvidia, during a joint keynote. “The unified stack gives them that. You develop your agent once, and the runtime intelligently places workloads where they belong.”
Agentic AI for the Real World
The demonstrations focused on practical enterprise scenarios. A field-service agent for a global manufacturer ran on a technician’s Copilot+ tablet, using the camera to diagnose equipment faults. The agent’s small model (a quantized Llama-3-Nemotron variant with 7B parameters) performed image recognition directly on the device. For a tricky repair, the agent escalated to the company’s DGX Station, which hosted a full-size 70B model fine-tuned on engineering schematics. The heavier model provided step-by-step instructions in under two seconds. All interactions were logged and governed by Foundry’s central policy engine, ensuring compliance with data-residency rules.
Another demo showed a procurement agent that started on a Windows desktop, drafted a request for proposal by querying local documents, then negotiated terms with suppliers’ Azure-based agents. The entire workflow was orchestrated by Microsoft’s AutoGen framework, now deeply integrated into Foundry with support for Nvidia’s NeMo Guardrails for safety.
Developer Experience in Focus
A key announcement for developers was the preview of the “Foundry Agent SDK,” a unified kernel accessible from Visual Studio Code, GitHub Codespaces, and Windows Terminal. The SDK abstracts the underlying hardware: whether compute happens on an NPU, GPU, or CPU is decided at runtime based on policy and resource availability. Nvidia contributed its TensorRT-LLM optimizations and CUDA libraries to the Windows Subsystem for Linux (WSL), enabling the same NIM containers to run on a local DGX Station and in Azure Kubernetes Service without modification.
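The runtime placement decision described here can be illustrated with a minimal Python sketch. It is not the Foundry Agent SDK, whose API was not shown in detail; the `Target`, `Task`, and `place` names, the memory figures, and the "first allowed target with enough free memory" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Target:
    """A hypothetical compute target on the local machine."""
    name: str              # "npu", "gpu", or "cpu"
    free_memory_gb: float  # memory currently free on this target
    available: bool        # False if the device is absent or busy


@dataclass(frozen=True)
class Task:
    """A hypothetical unit of agent work."""
    model_memory_gb: float            # memory the model needs to run
    allowed_targets: Tuple[str, ...]  # policy: permitted targets, in preference order


def place(task: Task, targets: Dict[str, Target]) -> Optional[str]:
    """Return the first policy-allowed target that is available and large enough."""
    for name in task.allowed_targets:
        target = targets.get(name)
        if target is None or not target.available:
            continue
        if target.free_memory_gb >= task.model_memory_gb:
            return name
    return None  # nothing local fits; a real runtime might offload elsewhere


# Illustrative numbers only, not taken from any announced hardware.
TARGETS = {
    "npu": Target("npu", free_memory_gb=8.0, available=True),
    "gpu": Target("gpu", free_memory_gb=24.0, available=True),
    "cpu": Target("cpu", free_memory_gb=32.0, available=True),
}

small = Task(model_memory_gb=4.0, allowed_targets=("npu", "gpu", "cpu"))
large = Task(model_memory_gb=16.0, allowed_targets=("npu", "gpu", "cpu"))

print(place(small, TARGETS))  # npu
print(place(large, TARGETS))  # gpu
```

In this sketch the policy is just an ordered allow-list per task, and resource availability is a memory check. A production runtime would presumably weigh more signals, such as latency targets and power state.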
Microsoft also confirmed that the Copilot Runtime in Windows will expose new “Agent Graph” APIs, allowing ISVs to embed agentic flows directly into Win32 and UWP applications. This means a user could ask an Excel spreadsheet to analyze sales data, and the agent could call a cloud-based reasoning service without the user ever leaving the app.
“Our goal is to make agentic AI as invisible as a TCP/IP stack,” said Scott Guthrie, Microsoft’s EVP of Cloud and AI. “You shouldn’t have to know whether the answer came from a local SLM or a frontier model in Azure. The platform just works.”
The Deskside Supercomputer: Nvidia DGX Station
For enterprises that need air-gapped or low-latency AI, the official integration of Nvidia DGX Station with Microsoft Foundry is significant. The newest DGX Station, refreshed at GTC 2026, now comes pre-loaded with Windows Server 2025 and Azure Arc-enabled agents, allowing IT administrators to manage it as just another Azure resource. Microsoft announced a “Direct Connect” feature that establishes a private, encrypted tunnel between a Windows 11 AI PC and a DGX Station over a local network, making offload as simple as calling a REST API.
Alicia Frame, Microsoft’s Director of AI Platform Strategy, showed how a healthcare clinic could use this pairing: a patient-facing agent running on a Surface device collects symptoms, while a medical reasoning agent on the DGX Station cross-references clinical trials without patient data ever leaving the building. The DGX Station also runs Nvidia’s Riva for voice and BioNeMo for drug discovery, both surfaced through the same Foundry console.
Security and the Trust Boundary
Cross-device agentic AI introduces complex security challenges, and the partners addressed them head-on. The unified stack adopts a zero-trust architecture where every agent-to-agent or agent-to-service call must pass through an authentication and policy checkpoint. Microsoft’s new “Agent Identity Framework” integrates with Entra ID and Purview, while Nvidia’s Confidential Computing modules inside DGX and Azure GPU instances ensure data is encrypted even during processing.
A notable detail: locally generated embeddings and intermediate results stay on the origin device unless explicit user consent (or administrator policy) permits sharing. This addresses concerns that agentic workflows could inadvertently leak proprietary data when switching between environments.
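The consent rule above can be expressed as a small deny-by-default check. The following Python sketch is hypothetical: the `Artifact` type, the `may_leave_origin` function, and the device names are invented for illustration and do not reflect any published Microsoft or Nvidia API.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record for data an agent produced locally."""
    kind: str           # e.g. "embedding" or "intermediate_result"
    origin_device: str  # device on which the data was generated


def may_leave_origin(
    artifact: Artifact,
    destination: str,
    user_consented: bool,
    admin_allowed_kinds: FrozenSet[str],
) -> bool:
    """Deny by default: data moves only with user consent or an admin policy."""
    if destination == artifact.origin_device:
        return True  # the data is not leaving the device at all
    return user_consented or artifact.kind in admin_allowed_kinds


emb = Artifact(kind="embedding", origin_device="surface-01")

# No consent and no admin policy: the embedding stays put.
print(may_leave_origin(emb, "dgx-station", False, frozenset()))               # False
# Explicit user consent permits sharing.
print(may_leave_origin(emb, "dgx-station", True, frozenset()))                # True
# An administrator policy covering embeddings also permits sharing.
print(may_leave_origin(emb, "azure", False, frozenset({"embedding"})))        # True
```

Keeping the check as a pure function makes the default easy to audit: with no consent and an empty policy set, every cross-device transfer is refused.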
Competitive Landscape
The move places Microsoft and Nvidia in direct competition with Google’s Vertex AI Agent Builder and Amazon’s Bedrock Agents, both of which also pitch cross-environment deployment. However, Microsoft’s Windows endpoint footprint and Nvidia’s hardware portfolio set the unified stack apart. Apple, which has been quietly building on-device agent capabilities through its Neural Engine and its Apple Intelligence framework, lacks an equivalent cloud-to-edge narrative for enterprises tied to Microsoft 365.
Analyst firm Gartner noted in a brief reaction that “the combined Microsoft-Nvidia offering lowers the barrier for enterprise agentic AI, but successful adoption will depend on how quickly the ecosystem of ISVs and system integrators can retool around Foundry.”
What’s Available and When?
Several components are ready now, while others hit preview later this year. Key dates:
- Microsoft Foundry is generally available as of Build 2026, with the unified agentic experience in public preview.
- Nvidia AgentKit enters early access next month; it includes pre-built connectors for ServiceNow, SAP, and Salesforce.
- DGX Station Direct Connect will ship with the next DGX Station firmware update, expected in Q3 2026.
- Windows Agent Graph APIs will arrive in the Windows 11 24H2 update (preview build 26100.xxxx).
- AutoGen + NeMo Guardrails integration is now available in the Foundry agent builder.
Pricing for Foundry’s agentic tier has not yet been disclosed, but Microsoft indicated it will follow a consumption model with a free tier for development on local devices.
The Bigger Picture
The announcement at Build 2026 cements a relationship that has evolved from providing GPUs in Azure to co-engineering a full-stack AI operating system. For developers, it means spending less time on infrastructure and more on building intelligent agents. For IT, it promises a single pane of glass for governance. And for end users, it heralds a future where AI agents flit between devices so seamlessly that the hardware feels irrelevant—only the task matters.
As Nadella closed his keynote, he gestured toward the demo of an agent responding to a natural-language query about Q3 earnings. “This isn’t a chatbot. It’s a reasoning engine that understands context, permissions, and intent. And it works the same on a tablet, a workstation, or a cloud server. That’s the unification we promised three years ago. Now it’s real.”
Industry watchers will be looking for concrete adoption metrics at next year’s Build, but the direction is unmistakable: agentic AI has a new home, and it’s built on Windows, Nvidia, and Azure.