Nvidia's DGX Cloud Goes Internal: Lepton Marketplace Takes Over External AI Compute Strategy

Nvidia has quietly repositioned its DGX Cloud from a premium enterprise AI supercomputer service to primarily an internal research resource, while pivoting its external compute strategy to the Lepton marketplace, according to multiple reports and financial disclosures. The shift, which ends direct head-to-head competition with cloud giants like AWS and Microsoft Azure, reflects a maturing AI compute economy where orchestration trumps ownership.

The original DGX Cloud bet: premium pricing in a scarcity market

When Nvidia launched DGX Cloud in March 2023, it pitched the product as an all-in-one path to enterprise-grade AI supercomputing. Each instance bundled eight H100 or A100 GPUs, integrated with Nvidia’s AI software and Base Command orchestration, and came with a monthly price tag starting at $36,999 per instance. At the time, GPU shortages made that premium palatable for customers desperate for dedicated capacity. Nvidia itself described DGX Cloud as “your own AI supercomputer in the cloud,” positioning it as a direct alternative to building on-prem clusters or renting from hyperscalers.

But the economic landscape shifted rapidly. Supply constraints eased as chip production ramped and inventory management improved across the ecosystem. Meanwhile, hyperscalers, which had been building out massive GPU fleets, started slashing prices to attract customers. In mid-2025, AWS announced reductions of up to 45% across its Nvidia-powered EC2 GPU instances, making on-demand and Savings Plan capacity from major clouds materially cheaper than Nvidia’s standalone DGX Cloud list price. Those price corrections, combined with improving availability, undermined the scarcity-based value proposition that had made DGX Cloud viable.

What changed: from customer cloud to internal compute pool

According to insider reports cited by The Information and Tom’s Hardware, Nvidia has now moved most DGX Cloud capacity to internal research and no longer actively markets it as a standalone customer product. The change is not a formal product cancellation — DGX Cloud still appears in revenue categories — but its role has clearly shifted. In Nvidia’s financial results for the second quarter of fiscal 2026 (ending July 2025), the company no longer attributes multibillion-dollar cloud spend commitments to DGX Cloud, a disclosure it had included in prior quarters. This omission, paired with insider accounts, suggests a deliberate retreat.

Why the pivot? The math stopped working. At $36,999 per month, a DGX Cloud instance works out to roughly $50 per hour even when it runs around the clock, which made little sense once a customer could rent similar H100 capacity from AWS by the hour and pay only for what it used, with even lower effective rates through reserved instances or savings plans. Nvidia also faced an awkward channel conflict: it was renting GPU capacity from neocloud providers like CoreWeave and Lambda, then subleasing it to enterprises, effectively competing with the very hyperscalers that buy its chips by the thousands. That tension with AWS and Microsoft, both key customers, was unsustainable.
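The pricing argument above comes down to simple arithmetic, which the following minimal Python sketch makes explicit. It converts the $36,999 monthly list price into an effective hourly rate and computes the usage level at which hourly rental would cost the same. The 730-hour month and the $40 hourly figure are assumptions chosen for illustration; the latter is a hypothetical input, not a quoted AWS price.

```python
# Back-of-the-envelope comparison of a flat monthly fee vs. hourly rental.
HOURS_PER_MONTH = 730  # assumption: average month (8,760 hours / 12)
DGX_CLOUD_MONTHLY_USD = 36_999  # launch list price per instance cited in the article


def effective_hourly_rate(monthly_price: float, hours_used: float = HOURS_PER_MONTH) -> float:
    """Hourly cost of a flat monthly fee, given the hours actually used."""
    if hours_used <= 0:
        raise ValueError("hours_used must be positive")
    return monthly_price / hours_used


def break_even_hours(monthly_price: float, hourly_rate: float) -> float:
    """Hours per month at which hourly rental costs as much as the flat fee."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return monthly_price / hourly_rate


if __name__ == "__main__":
    full = effective_hourly_rate(DGX_CLOUD_MONTHLY_USD)
    half = effective_hourly_rate(DGX_CLOUD_MONTHLY_USD, HOURS_PER_MONTH / 2)
    print(f"Full utilization: ${full:.2f}/hour")
    print(f"Half utilization: ${half:.2f}/hour")

    hypothetical_rate = 40.0  # hypothetical hourly price, NOT a quoted AWS rate
    hours = break_even_hours(DGX_CLOUD_MONTHLY_USD, hypothetical_rate)
    print(f"Break-even at ${hypothetical_rate:.2f}/hour: {hours:.0f} hours/month")
```

Under these assumptions the flat fee is equivalent to about $50.68 per hour at full utilization and twice that at half utilization. If the break-even figure exceeds the number of hours in a month, as it does for the hypothetical $40 rate, hourly rental is cheaper at any utilization level.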

Instead, Nvidia appears to be using its DGX Cloud footprint — a strategically sized pool of the latest GPU clusters — for internal R&D. This allows the company to accelerate chip design, model research, and performance validation without bearing the operational burden of a full-fledged customer-facing cloud. It also keeps the “DGX” experience as a showcase for elite internal workloads while shifting outward-facing demand capture to a new model.

DGX Cloud Lepton: the new outward-facing play

In May 2025, Nvidia announced DGX Cloud Lepton, a compute marketplace that aggregates GPU capacity from a broad set of partners and routes workloads to the optimal provider. Unlike DGX Cloud — where Nvidia operated as a quasi-cloud provider itself — Lepton acts as a traffic controller. Developers access a unified interface and tap into GPU supply from traditional hyperscalers (AWS, Microsoft Azure), neocloud players (CoreWeave, Crusoe, Lambda), regional providers, and others. Nvidia integrates its full software stack — NIM microservices, NeMo frameworks, Blueprints, and Base Command — to ensure a consistent development and operations experience regardless of where the underlying compute resides.

The pitch is simple: developers get access to thousands of GPUs across providers, with predictable performance and the flexibility to meet sovereignty, latency, and cost requirements. Providers get a channel to global demand without having to build their own marketplace. And Nvidia gets to own the customer relationship and software layer, capturing value without the capital-intensive work of running data centers.
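To make the "traffic controller" idea concrete, here is a minimal Python sketch of constraint-based provider selection: filter offers by sovereignty (allowed regions), latency, and available capacity, then pick the cheapest one that remains. This is an illustration only. It is not Lepton's API or its actual routing logic, and every name (`Offer`, `pick_offer`, `EXAMPLE_OFFERS`), provider label, and number below is hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Offer:
    """One provider's GPU offer (hypothetical structure for illustration)."""
    provider: str
    region: str
    usd_per_gpu_hour: float
    latency_ms: float
    gpus_available: int


def pick_offer(
    offers: Iterable[Offer],
    gpus_needed: int,
    allowed_regions: Set[str],
    max_latency_ms: float,
) -> Optional[Offer]:
    """Return the cheapest offer that satisfies every hard constraint, or None."""
    eligible = [
        o for o in offers
        if o.region in allowed_regions          # sovereignty constraint
        and o.latency_ms <= max_latency_ms      # latency constraint
        and o.gpus_available >= gpus_needed     # capacity constraint
    ]
    # Cost is the tie-breaker among offers that pass all hard constraints.
    return min(eligible, key=lambda o: o.usd_per_gpu_hour, default=None)


# Made-up offers; the provider names and prices are placeholders, not real quotes.
EXAMPLE_OFFERS = [
    Offer("provider-a", "eu-west", 2.10, 40.0, 64),
    Offer("provider-b", "eu-central", 1.80, 55.0, 32),
    Offer("provider-c", "us-east", 1.20, 120.0, 128),
]

if __name__ == "__main__":
    choice = pick_offer(EXAMPLE_OFFERS, 16, {"eu-west", "eu-central"}, 60.0)
    print(choice.provider if choice else "no eligible provider")
```

Treating sovereignty, latency, and capacity as hard filters and cost as the final ranking criterion is one reasonable design choice; a real marketplace would likely weigh more factors, such as SLAs and interconnect topology.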

At launch, Nvidia named a roster of partners including CoreWeave, Crusoe, Lambda, Nebius, and SoftBank. It explicitly signaled that the largest cloud providers would participate as compute partners. The move was widely seen as a strategic pivot from operating a direct cloud service to orchestrating GPU demand at scale, a shift that preserves Nvidia’s ecosystem centrality while defusing hyperscaler tensions.

Financial and disclosure signals

Nvidia’s Q2 fiscal 2026 earnings commentary highlighted massive Data Center revenue, driven by Blackwell adoption, but the language around cloud commitments told a different story. In previous quarters, the company had pointed to multibillion-dollar cloud spend obligations tied to DGX Cloud. In the latest filing, that line vanished. While Nvidia still lists DGX Cloud in revenue categories, the omission strongly implies that the service is no longer a growth priority for external customers.

The change must be read in context: Nvidia’s overall Data Center business continues to surge, and the company remains the dominant supplier of AI accelerators. The pivot away from DGX Cloud as a customer-facing product reduces near-term operational complexity and capital expenditure, while potentially freeing up resources for higher-margin software and services. But it also signals that Nvidia is choosing not to fight a price war with hyperscalers on their home turf.

Strategic analysis: gains for Nvidia

By repositioning DGX Cloud and doubling down on Lepton, Nvidia captures several strategic advantages:

Demand funnel control: Developers and enterprises continue to start with Nvidia’s software and frameworks. Lepton keeps procurement and usage inside the Nvidia ecosystem, even when the compute runs on AWS or CoreWeave. Nvidia’s SDKs, NIM microservices, and Blueprints remain the standard development layer.
Partner alignment: Lepton helps smaller neocloud providers stay commercially viable, giving them global visibility and integrated tooling. At the same time, it lowers friction with hyperscalers by removing the direct competitor label. AWS and Azure can participate in Lepton without feeling threatened, because Nvidia is no longer trying to be a cloud provider.
Margin focus: Owning the platform and software is inherently higher-margin than building and operating global data centers. Nvidia monetizes its software stack — including AI microservices, model-optimized libraries, and orchestration tools — while leaving the infrastructure ownership to partners.
R&D acceleration: Retaining a pool of DGX systems for internal use gives Nvidia dedicated, ultra-high-performance infrastructure for chip testing, model fine-tuning, and software-hardware co-design. This is especially valuable as it develops next-generation platforms such as Rubin and Rubin CPX.

The shift is not without peril:

Hyperscaler détente is fragile: Large cloud providers are investing heavily in custom accelerators — AWS Trainium, Google TPUs, and proprietary ASICs — to lower cost and reduce dependency on Nvidia. If they succeed in making these alternatives mainstream, Nvidia’s ecosystem could erode, and Lepton’s role as a neutral marketplace would be challenged.
Marketplace complexity: Running a neutral, multi-provider marketplace requires seamless integration, fair allocation, clear SLAs, and consistent performance across heterogeneous backends. Any breakdown in user experience — unpredictable performance, billing surprises, or downtime — could quickly undermine trust.
Geopolitical and export constraints: Nvidia’s chips are subject to export controls; the company has already recorded inventory charges related to H20 product availability and faced constrained shipments to China. Such restrictions could fragment Lepton’s global reach, limiting which GPUs can be routed to certain jurisdictions.
Perception versus reality: If customers interpret the DGX Cloud retreat as a concession, it may weaken enterprise confidence in Nvidia’s long-term cloud roadmap. Conversely, if Lepton fails to attract hyperscalers on favorable terms, Nvidia risks becoming an intermediary with limited pricing power.

Implications for developers and enterprises

For developers, the practical upside of Lepton is clearer access to GPU capacity in the clouds they already use, with the promise of lower costs as hyperscalers compete on price. Hugging Face has already announced a Training Cluster as a Service that leverages Lepton, signaling practical traction. The marketplace should simplify multi-cloud training and inference jobs, reducing the operational overhead of stitching together different providers.

However, Lepton also adds complexity to procurement. Teams must now evaluate pricing, regional availability, SLAs, security posture, and data sovereignty constraints across a varied roster of providers. The days of paying a premium for a single Nvidia-managed DGX supercomputer are fading; instead, customers will need to be savvy shoppers in a more fragmented, competitive market.

Partner dynamics: hyperscalers and neoclouds

Hyperscaler participation in Lepton is pragmatic but conditional. AWS and Azure gain incremental demand by letting Nvidia route workloads to their infrastructure. Yet they have strong incentives to steer customers toward their own native AI services, custom silicon, and long-term commitments. The depth of their integration with Lepton will be telling — if they participate under neutral, interoperable terms, Lepton gains credibility. If involvement is superficial, the marketplace may falter.

For neocloud providers, Lepton is a strategic lifeline. It offers global visibility and a developer reach they could not easily achieve alone. Nvidia provides packaging, diagnostics, and software integration, making these smaller players more competitive against the hyperscalers. This diversity benefits the ecosystem and reduces risk for enterprises worried about dependency on a single cloud.

Technical and operational implications

On the engineering side, Lepton’s promise of unified GPU health monitoring, automated root-cause analysis, and integration with Base Command is significant. If executed well, it could dramatically reduce the operational overhead of multi-cloud training jobs and help maintain predictable performance across different vendor backends. That consistency is a key differentiator versus a purely price-driven brokerage.

For procurement and finance teams, the shift means new planning models. Organizations should anticipate blended pricing that combines hardware rental, Nvidia software licenses or credits, and partner margins. Capital planning will move from treating Nvidia as a cloud vendor to treating it as a software and marketplace partner, requiring new contract structures and multi-provider forecasting.
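A finance team might model the blended pricing described above along the following lines. This is a minimal sketch under stated assumptions: the three cost components come from the paragraph above, but the way they combine (a partner margin applied as a percentage on top of hardware plus software) is an assumption, not a published pricing formula. The `Allocation` type and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Allocation:
    """Planned usage with one provider (hypothetical structure for illustration)."""
    provider: str
    gpus: int
    hours: float
    hardware_usd_per_gpu_hour: float
    software_usd_per_gpu_hour: float
    partner_margin_rate: float  # e.g. 0.10 for a 10% margin


def blended_rate(a: Allocation) -> float:
    """Per-GPU-hour cost: (hardware + software) marked up by the partner margin.

    Assumption: the margin applies to both components; real contracts may differ.
    """
    base = a.hardware_usd_per_gpu_hour + a.software_usd_per_gpu_hour
    return base * (1.0 + a.partner_margin_rate)


def forecast(allocations: Iterable[Allocation]) -> Dict[str, float]:
    """Total forecast spend per provider across all planned allocations."""
    totals: Dict[str, float] = {}
    for a in allocations:
        cost = blended_rate(a) * a.gpus * a.hours
        totals[a.provider] = totals.get(a.provider, 0.0) + cost
    return totals


if __name__ == "__main__":
    # Placeholder figures, not real quotes from any provider.
    plan = [
        Allocation("provider-a", 8, 100.0, 2.00, 0.50, 0.10),
        Allocation("provider-b", 16, 50.0, 1.80, 0.50, 0.15),
    ]
    for provider, total in forecast(plan).items():
        print(f"{provider}: ${total:,.2f}")
```

Keeping the components separate rather than storing a single blended number makes it easier to re-run the forecast when any one input (hardware price, software licensing, or margin) changes.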

What to watch next

Several developments will indicate whether Nvidia’s pivot succeeds:

Disclosures: Future earnings and SEC filings will be examined for language about cloud commitments, any breakout of DGX/Lepton revenue, and the overall Data Center narrative.
Hyperscaler integration: The depth and terms of AWS, Azure, and Google’s participation in Lepton will signal the marketplace’s viability.
Price trends: Continued GPU price declines and the adoption curve for in-house silicon at hyperscalers will shape Nvidia’s competitive moat.
Export controls: Regulatory shocks could restrict GPU routing, fragmenting the marketplace’s global utility.
Developer adoption: Real-world metrics on transaction volumes, orchestration performance, and billing satisfaction will determine whether Lepton lives up to its promise.

Bottom line

Nvidia’s quiet repurposing of DGX Cloud and its aggressive push with Lepton represent a classic consolidation of strengths: own the platform and developer experience, outsource the capital-intensive infrastructure. By removing channel conflict with hyperscalers, Nvidia secures its