Anthropic’s Claude AI assistant family is now running on Microsoft Azure for the first time, powered by NVIDIA’s next-generation GB300 Blackwell Ultra GPUs and delivered through the Microsoft Foundry platform. The launch, confirmed on Monday, June 29, 2026, marks the first deployment of Claude on NVIDIA hardware and gives Azure customers direct access to one of the industry’s most safety-focused large language models, tightly coupled with agentic AI capabilities and enhanced security features.
The move comes as competition in cloud-based enterprise AI intensifies, with Microsoft balancing its deep ties to OpenAI against a growing demand for model diversity and specialized AI workloads. By onboarding Claude through Foundry, Microsoft is signaling that Azure is the platform for any frontier model, not just GPT.
What’s new: Claude on NVIDIA GB300
Anthropic’s previous cloud footprint relied on custom-designed inferencing infrastructure, often running on Google Cloud TPUs. This shift to NVIDIA’s upcoming GB300 Blackwell Ultra GPU introduces a massive leap in computational efficiency and model serving capacity. The GB300, sampling now and expected to ship in volume later in 2026, pairs a powerful GPU with 288 GB of HBM4 memory and a 2 TB/s NVLink domain, allowing Claude’s largest models to run inference with dramatically lower latency and higher throughput.
The integration with Microsoft Foundry—a unified MLOps and governance layer that sits on top of Azure AI—means organizations can fine-tune, evaluate, and deploy Claude models alongside other AI services like Azure OpenAI Service, Meta Llama, and open-weight options, all within corporate compliance boundaries. “Enterprises shouldn’t have to choose between models or clouds,” a Microsoft spokesperson said during a brief pre-brief. “Foundry is the answer to that fragmentation.”
Microsoft Foundry, which exited private preview in late 2025, offers a single pane for managing model catalogs, vector indexes, prompt flow, and responsible AI filters. Adding Claude is a significant milestone, as it brings Anthropic’s constitutional AI alignment approach into the same governance framework that already supports GPT-5o and other models.
Agentic AI and the security promise
Perhaps the most underappreciated aspect of the announcement is the emphasis on agentic AI security. Claude’s agentic capabilities—where the model can execute multi-step plans, interact with tools, and reason autonomously—require a different security posture than simple chat-based interactions. The GB300 hardware features NVIDIA Confidential Computing with full memory encryption and attestation, which Anthropic and Microsoft are leveraging to create an end-to-end trusted execution environment for agentic workflows.
This means that when a Claude agent is processing sensitive data—say, adjusting supply chain parameters for a manufacturing giant—the inference is run inside a hardware-isolated enclave that even Azure administrators cannot inspect. Microsoft Foundry then layers role-based access controls, prompt injection guards, and model behavioral boundaries that prevent the agent from taking dangerous actions or disclosing proprietary information.
“We’re seeing agentic AI as the next frontier, but it’s also the next attack surface,” said an Anthropic security engineer during a closed-door session at a recent developer conference. “Running on NVIDIA’s confidential compute with Foundry’s policy engine gives us a lockbox for autonomy that was simply not possible on older silicon.”
Why this matters for enterprises
For Azure-focused organizations, Claude on Foundry eliminates the friction of juggling separate cloud contracts and data egress fees. A wealth management firm using Azure for data warehousing can now call Claude for document analysis, have Claude orchestrate a data extractor, and then log the entire chain of actions in Azure Purview for compliance. That same firm can simultaneously use OpenAI o1 for quantitative scenarios and Llama 4 for document summarization, all billed through a single Azure commitment.
Pricing details were not fully disclosed, but an early price sheet seen by Windows News indicates pay-as-you-go tokens roughly 15% lower than the previous on-demand Claude 4 pricing, thanks to the efficiency gains of the GB300. Reserved capacity commitments offer even steeper discounts, and existing Microsoft volume licensing customers can use Azure Consumption Commitment to draw down on their Claude usage—a clear shot at AWS Bedrock and Google Vertex AI, which have longstanding Claude integrations but lack the same unified billing flexibility.
Developers will find Claude in the Foundry model catalog as “claude-opus-4-gb300” and “claude-sonnet-4-gb300,” with Haiku and a new “claude-vertex” multimodal model coming later in Q3. APIs are fully compatible with the standard Anthropic Messages API, but Foundry adds streaming via Azure Event Grid, private link endpoints, and out-of-the-box monitoring with Azure Monitor.
The multi-model reality bites
Microsoft’s relationship with OpenAI remains exclusive for model training on Azure, but the inference landscape has become decidedly multi-portfolio. By adding Claude to Foundry, Microsoft is acknowledging that no single model family will dominate every enterprise scenario. Anthropic’s models are particularly strong in long-context reasoning, legal analysis, and coding tasks where safety and hallucination reduction are paramount.
“GPT is brilliant for creativity and brainstorming, but Claude has an edge when you need 200k-token context windows with near-perfect fidelity,” said a CTO of a large insurance company who tested both models side by side. “Having both on the same cloud, with the same security wrappers, makes the decision a matter of pick the right tool, not pick the right vendor.”
This trend mirrors what happened in databases and programming languages: enterprises want a curated set of options that work together. Foundry’s prompt flow studio even lets you build applications that route tasks to different models based on intent detection—a query about a 10-K filing goes to Claude Opus, while a request to draft a marketing email goes to GPT, seamlessly.
Hardware partnership signals deepening NVIDIA-Azure ties
The choice of NVIDIA for Claude’s first non-custom silicon deployment is notable. Anthropic has been a close partner of Google for compute, but NVIDIA’s GB300 brings a generational leap in transformer engine acceleration and FP8 compute, plus native sparsity support that slashes the cost per token for mixture-of-experts models like Claude’s next-gen architecture. Microsoft, meanwhile, has been the lead cloud partner for GB300, with Azure already running internal workloads on the chips since early 2026.
This hardware foundation may accelerate Claude’s roadmap toward even more ambitious agentic systems. Anthropic recently published research on “constraint amplification,” a technique where an agent self-critiques and course-corrects during execution. That kind of online self-improvement demands substantial headroom in memory and compute—exactly what GB300 provides with its 2:1 sparsity advantage and larger HBM pool.
NVIDIA CEO Jensen Huang said in a statement, “The GB300 Blackwell Ultra GPU is built for the agentic AI era. Having Claude running on it in Azure Foundry shows what’s possible when the world’s safest AI meets the world’s most powerful compute.”
The integration also positions Microsoft as a distribution powerhouse for AI. Customers can now subscribe to Claude as a service, or even deploy dedicated GPU clusters using Azure’s ND GB300 v6 instances for fully isolated, single-tenant Claude inferencing—an option that will appeal to banks and defense contractors with stringent data sovereignty requirements.
Security at the silicon level
The mention of agentic AI security in the launch materials is not accidental. Agentic AI allows models to execute code, call APIs, and manipulate data—actions that could be catastrophic if prompted maliciously. The NVIDIA GB300’s confidential computing capabilities, combined with Microsoft’s recently released “Guardian” framework for agent oversight, give enterprises a hard boundary. Guardian logs every tool call, applies real-time policy checks, and can terminate an agent session mid-execution if it detects anomalous behavior.
“We ran a red-teaming exercise where a Claude agent was asked to transfer funds between accounts,” said a lead engineer at a Fortune 500 bank that participated in the Foundry early access program. “The GB300 enclave isolated the inference, Guardian flagged the request because it violated a governance rule, and the whole thing was shut down in 18 milliseconds. That’s the kind of safety we need.”
This approach could become a template for the rest of the industry. OpenAI, Google, and startups like Cohere are all pushing agentic frameworks, but the NVIDIA-Microsoft-Anthropic triad is the first to offer a full hardware-to-software-to-policy stack that’s commercially available.
What’s next
Starting today, Azure subscribers in East US, West Europe, and Southeast Asia regions can access Claude models via Foundry. General availability in all 60+ Azure regions is expected by end of Q3 2026. An introductory tier gives developers 10,000 free inference tokens per day for the first month, and Microsoft is offering a “Claude Accelerator” program that includes credits, architecture reviews, and joint support from Microsoft and Anthropic engineers.
The partnership is likely to expand beyond inference. Anthropic and Microsoft are already collaborating on fine‑tuning recipes optimized for the GB300, and there are whispers of a joint “AI safety copilot” that runs locally in Foundry and uses Claude to audit other models’ outputs for accuracy and harm. Such a tool would underscore the strategic alignment: Anthropic brings safety-by-design; Microsoft provides the scalable, governed platform; and NVIDIA delivers the iron to make it all hum at a cost that works at global scale.
In a market where AI announcements seem to drop daily, this one carries weight because it redefines how enterprises will think about AI infrastructure—not as a binary between proprietary and open, but as a spectrum where the best-in-class solutions can coexist under one umbrella. For Windows and Azure developers, that means fewer barriers to experimentation, faster time to value, and a level of security that makes agentic AI something you can actually trust in production.