Nvidia and Microsoft today unveiled the RTX Spark, a new breed of Windows PC built to run personal AI agents entirely on-device. Jointly announced at Computex and detailed further at Microsoft Build 2026, the RTX Spark brings datacenter-grade AI processing to the desktop, with no internet connection required.

Under the hood sits a Grace Blackwell-derived superchip—a tightly integrated CPU–GPU combo linked by NVLink-C2C and sharing up to 128 GB of unified memory. It’s a deliberate scaling-down of the architecture that powers Nvidia’s GB200 server racks, repackaged for the workstation market. The result is a machine that can load massive language models, diffusion networks, and multi-modal agents directly into a single memory pool, bypassing the PCIe bottlenecks that have long throttled AI workloads on PC hardware.

The hardware: Grace meets Blackwell in a single socket

The superchip fuses Arm-based Grace CPU cores with a Blackwell-class GPU, similar to the Grace Hopper and Grace Blackwell superchips already deployed in data centers. The CPU and GPU share the full 128 GB of LPDDR5X-grade memory over a wide, low-latency bus. By comparison, even a flagship desktop GPU like the RTX 5090 has only 32 GB of dedicated VRAM, and system RAM remains separated by the PCIe interface. Unified memory eliminates the need to shuttle data back and forth: a developer can load a 100B-parameter model once (quantized to 8 bits per weight or fewer, so that it fits in 128 GB) and let both the CPU and GPU work on the same tensor slices concurrently.
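A quick back-of-the-envelope check shows why the memory figures matter. The sketch below estimates the size of the weights alone for a 100B-parameter model at a few common precisions and compares it with the 128 GB and 32 GB capacities quoted above. It is a rough estimate under stated assumptions: the function name is illustrative, decimal gigabytes are used, and KV cache, activations, and runtime overhead are ignored.

```python
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in decimal GB.

    One billion parameters at 8 bits per weight is one GB, so the
    result is simply params_billion * bits_per_weight / 8.
    """
    return params_billion * bits_per_weight / 8


UNIFIED_MEMORY_GB = 128  # RTX Spark capacity quoted in the article
DESKTOP_VRAM_GB = 32     # RTX 5090 capacity quoted in the article


def fits(params_billion: float, bits_per_weight: int, capacity_gb: float) -> bool:
    """True if the weights alone fit in the given capacity (no overhead counted)."""
    return weights_gb(params_billion, bits_per_weight) <= capacity_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weights_gb(100, bits)
        print(
            f"100B @ {bits:>2}-bit: {size:6.1f} GB | "
            f"fits 128 GB: {fits(100, bits, UNIFIED_MEMORY_GB)} | "
            f"fits 32 GB: {fits(100, bits, DESKTOP_VRAM_GB)}"
        )
```

By this estimate, a 100B-parameter model needs about 200 GB at 16-bit precision, 100 GB at 8-bit, and 50 GB at 4-bit, so only the quantized variants fit in a 128 GB pool, and none of them fits in 32 GB of VRAM.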

During his Computex keynote, Nvidia CEO Jensen Huang held up a prototype RTX Spark board, slightly larger than a Mini-ITX motherboard but packing the compute equivalent of two RTX 6000 Ada GPUs. The reference design sits in a compact tower with a custom vapour-chamber cooler and ships with a 650 W power supply. Partners including ASUS, MSI, and Dell have confirmed they will ship retail variants later this year.

AI agents that live on your PC

Microsoft used its Build conference sessions to detail the software side. Windows 11’s “Sequoia” update, scheduled for Q3 2026, introduces an AI Hosting Runtime that lets third-party agents run as sandboxed background services. These agents can see your screen, read local files, interact with applications via UI Automation, and even control peripherals—all gated by a new, fine-grained permissions model.
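Microsoft has not published the permissions API, so the following is only a conceptual sketch of what a fine-grained, capability-gated model might look like. Every name in it (Capability, AgentGrant, perform) is hypothetical and not part of any Windows SDK. The idea it illustrates is that each agent holds an explicit set of grants, and any action outside that set is refused.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import Callable, TypeVar

T = TypeVar("T")


class Capability(Flag):
    """Hypothetical capabilities, mirroring the four listed in the article."""
    NONE = 0
    SCREEN_READ = auto()
    FILE_READ = auto()
    UI_AUTOMATION = auto()
    PERIPHERAL_CONTROL = auto()


@dataclass
class AgentGrant:
    """The capabilities a user has granted to one agent (hypothetical type)."""
    name: str
    granted: Capability = Capability.NONE

    def allows(self, needed: Capability) -> bool:
        # Every requested capability bit must be present in the grant.
        return (self.granted & needed) == needed


def perform(grant: AgentGrant, needed: Capability, action: Callable[[], T]) -> T:
    """Run `action` only if the agent holds all of the `needed` capabilities."""
    if not grant.allows(needed):
        raise PermissionError(f"agent {grant.name!r} lacks {needed!r}")
    return action()


if __name__ == "__main__":
    travel = AgentGrant("travel", Capability.FILE_READ | Capability.UI_AUTOMATION)
    print(perform(travel, Capability.FILE_READ, lambda: "calendar read"))
    try:
        perform(travel, Capability.SCREEN_READ, lambda: "screenshot")
    except PermissionError as err:
        print("denied:", err)
```

In this sketch the travel agent can read files and drive the UI but cannot capture the screen, which is the kind of separation a fine-grained model implies.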

A personal travel agent, for instance, could monitor flight prices, cross-reference your calendar, and book a ticket in Edge without ever sending your payment details to the cloud. A developer agent could continuously review code in a local repo, suggest refactors, and open pull requests in Visual Studio Code, using an on-device Llama 4 or a proprietary Microsoft model. Nvidia’s RTX Spark reference image ships with an agent runtime based on the same NeMo framework used in enterprise deployments, complete with TensorRT-LLM acceleration for the most popular open-weight models.

Security anchored in silicon

Because all processing happens locally, user data never leaves the machine. That structural privacy advantage is reinforced by a new hardware root of trust Microsoft calls “Athena.” Athena pairs the existing Pluton security processor with a dedicated AI attestation engine. Every AI agent must boot from a signed, encrypted vault, and the Athena engine continuously measures the model weights and agent logic against golden hashes stored in immutable flash. If any tampering is detected—even by a kernel-mode driver—the agent is instantly suspended and the user alerted.
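The measure-and-compare step can be illustrated in a few lines. This is a software-only sketch of the general technique, not Athena's actual mechanism, which has not been documented. It assumes SHA-256 as the hash function, and the helper names (measure, verify_agent) are hypothetical.

```python
import hashlib
import hmac


def measure(blob: bytes) -> str:
    """Return the SHA-256 digest of a component (assumed hash function)."""
    return hashlib.sha256(blob).hexdigest()


def verify_agent(components: dict[str, bytes], golden: dict[str, str]) -> list[str]:
    """Return the names of components that are missing or do not match
    their golden hash. An empty list means the agent measured clean."""
    tampered = []
    for name, expected in golden.items():
        blob = components.get(name)
        # compare_digest avoids leaking where two digests first differ.
        if blob is None or not hmac.compare_digest(measure(blob), expected):
            tampered.append(name)
    return tampered


if __name__ == "__main__":
    components = {"weights": b"\x00\x01\x02\x03", "logic": b"def plan(): ..."}
    golden = {name: measure(blob) for name, blob in components.items()}

    print(verify_agent(components, golden))  # clean agent

    components["weights"] = b"\x00\x01\x02\xff"  # simulate tampering
    print(verify_agent(components, golden))      # weights no longer match
```

A real attestation engine would take these measurements in hardware, outside the reach of the operating system, and would suspend the agent as soon as the list is non-empty.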

Athena also underpins a Trusted AI Execution Environment (TAI-EE) that isolates agent memory from the operating system. Even a compromised Windows image cannot scrape the agent’s context, making RTX Spark the first consumer PC to achieve Confidential Computing alliance certification for AI workloads. This is a direct response to enterprises that have been reluctant to deploy AI assistants because of data-leakage fears, and it positions the RTX Spark as a viable option for regulated industries like healthcare and finance.

The Windows AI stack comes into focus

Build 2026 sessions made clear that Microsoft is betting the future of Windows on local AI. The AI Hosting Runtime is backed by a new semantic orchestration layer—think of it as an operating system for agents. Developers write agents once against a common API and can deploy them on any Windows PC; the runtime automatically profiles the underlying hardware and schedules compute across CPU, GPU, and NPU tiles. On an RTX Spark, of course, the GPU is the workhorse, but the same agent will run on a Snapdragon X Elite laptop, scaled to the available resources.
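To make the "write once, schedule anywhere" idea concrete, here is a minimal sketch of hardware-aware placement. It is not the AI Hosting Runtime's scheduler, whose design has not been published. The Device type, the place function, and the selection rule (fastest device that can hold the model) are all assumptions, and apart from the 60 TOPS, 128 GB, and 45 TOPS figures quoted in this article, the device numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A compute unit as a hypothetical runtime profiler might describe it."""
    kind: str         # "cpu", "gpu", or "npu"
    tops: float       # rough AI throughput
    memory_gb: float  # memory the device can address for model weights


def place(model_gb: float, devices: list[Device]) -> Device:
    """Pick the fastest device that can hold the model (assumed policy)."""
    candidates = [d for d in devices if d.memory_gb >= model_gb]
    if not candidates:
        raise RuntimeError("no device can hold the model; use a smaller one")
    return max(candidates, key=lambda d: d.tops)


# Figures from the article: 60 TOPS and 128 GB for the RTX Spark GPU,
# 45 TOPS for the Snapdragon X Elite NPU. All other values are illustrative.
RTX_SPARK = [Device("cpu", 5, 128), Device("gpu", 60, 128)]
SNAPDRAGON_LAPTOP = [Device("cpu", 3, 16), Device("npu", 45, 16)]

if __name__ == "__main__":
    print(place(40, RTX_SPARK).kind)         # large model lands on the GPU
    print(place(8, SNAPDRAGON_LAPTOP).kind)  # small model lands on the NPU
```

The same agent code calls place on both machines. Only the device list differs, which is the portability the common API is meant to provide.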

Microsoft also confirmed that the Copilot key, first introduced in 2024, will become the launch point for personal agents. A long press opens a fly-out that lists all installed agents, their current status, and any pending actions requiring user consent. The experience feels like a reboot of the 2017-era Cortana ambitions, except now the technology can actually deliver.

Real-world performance: moving the goalposts

During a closed-door demo at Build, an RTX Spark running a 70B-parameter model generated complex 3D scenes from text prompts at 12–15 frames per second—fast enough for interactive editing. A second demo showed a financial analyst agent processing a 200,000-row Excel spreadsheet: it identified anomalies, wrote VBA macros to fix them, and generated a PowerPoint summary, all in under 90 seconds. Traditional cloud-based Copilot solutions would have struggled with the dataset size and the multi-step workflow.

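Nvidia’s early benchmarks, as shown at Computex, put the RTX Spark at roughly 60 TOPS of sustained AI throughput under a mixed workload of inference and light fine-tuning. Because that is a sustained figure rather than a peak rating, it does not compare directly with vendor headline numbers: it sits about a third above the 45-TOPS peak rating of the Snapdragon X Elite’s NPU, and well short of the 624 TOPS of peak dense INT8 throughput that Nvidia quotes for a single A100.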

Competition and the new PC battleground

The RTX Spark is not entering an empty field. Apple’s M3 Ultra Mac Studio already offers up to 512 GB of unified memory and can run quantized large models on-device. Qualcomm’s Snapdragon X Elite platform, with its 45-TOPS NPU, powers a growing fleet of AI-capable Windows laptops. Yet both lack Nvidia’s software heritage. CUDA remains the de facto standard for AI development, and the RTX Spark ships with a full AI-workbench stack: PyTorch, TensorFlow, JAX, and Nvidia’s own optimizations for Meta’s Llama 4, pre-configured and ready to go.

Analysts see the Spark as Nvidia’s hedge against a future where AI moves from the data center to the edge. “If every professional has a desktop that can fine-tune a model overnight, the need for cloud GPU instances drops,” said Moor Insights & Strategy analyst Patrick Moorhead at Computex. “Nvidia would rather own that edge device than let a rival capture it.”

Pricing, availability, and the prosumer gamble

Nvidia has not disclosed final pricing, but the reference board’s bill of materials alone suggests a starting price of $4,500–$5,500 for a 64 GB variant, with the 128 GB SKU easily crossing $6,500. That puts the RTX Spark squarely in the territory of the Mac Studio and HP’s high-end Z-series workstations. First shipments are promised for October 2026, with pre-orders opening after the Build conference.

Such pricing will limit the initial audience to developers, data scientists, and video professionals—exactly the crowd that already pays a premium for performance. But Microsoft and Nvidia both hinted at a consumer-focused “RTX Spark Lite” for 2027, likely with 32–64 GB of memory and a trimmed-down GPU. That could bring the concept to a $1,500–$2,000 price point, close to a premium gaming laptop.

What RTX Spark means for the Windows ecosystem

The introduction of a new, high-end hardware segment creates a forcing function for software. Microsoft has already begun seeding an AI developer kit that includes agent templates, pre-trained models, and access to Windows’ new GraphRAG APIs—tools that let agents understand relationships across your personal data without uploading it. If enough developers adopt the platform, Windows could leapfrog macOS in AI integration within a single product cycle.

For IT departments, the combination of local AI and hardware-enforced security could finally make AI assistants compliant with internal data-handling policies. A number of large enterprises, including Accenture and Siemens, announced pilot programs at Build, with plans to deploy RTX Spark workstations for code generation, contract analysis, and R&D tasks.

The road ahead

The RTX Spark is the most ambitious attempt yet to bring the data center AI experience to the desk. It promises genuine data sovereignty, instant responsiveness, and the ability to keep working when the network goes dark. The open question is whether the software ecosystem, and the even more critical ecosystem of agent use cases, can mature quickly enough to make a $5,000 PC feel indispensable. If Nvidia and Microsoft can get that flywheel spinning, the Spark won’t just be a new product category. It could be the foundation of the next 30 years of personal computing.