A newly published vulnerability tracked as CVE-2026-43161 exposes a serious denial-of-service risk for Linux systems using Intel VT-d and PCIe passthrough. The flaw, detailed by the National Vulnerability Database on May 6, 2026, stems from how the kernel’s IOMMU driver handles Address Translation Services (ATS) requests when a passed-through device becomes unavailable. An attacker with physical access, or one who can trigger a device fault, can hard-lock the host, forcing a manual power cycle and disrupting all virtualized workloads.

The vulnerability affects a broad range of environments, from cloud providers using direct device assignment to individual users building gaming virtual machines with GPU passthrough. Because the lockup occurs at the host level, not within the guest, a single misbehaving PCIe device can take down an entire physical server.

Understanding Intel VT-d and PCIe Passthrough

Intel VT-d (Virtualization Technology for Directed I/O) is the hardware feature that provides IOMMU support on Intel platforms. It enables the remapping of DMA addresses and interrupts, allowing the secure assignment of physical PCIe devices directly to virtual machines. This bypasses the hypervisor’s device emulation layer, delivering near-native performance for graphics cards, NVMe drives, and network adapters.

When a device is passed through, the IOMMU translates device-visible guest physical addresses to host physical addresses. This translation relies on DMA remapping tables set up by the VMM or the host OS. In typical use, the hypervisor—such as KVM with VFIO, Xen, or VMware—configures these mappings at device assignment time and adjusts them only infrequently.
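
The translation step can be pictured with a toy model. The sketch below is illustrative only: the class name ToyIommuDomain, the single-level dictionary table, and the 4 KiB page size are assumptions made for the example, not the layout VT-d actually uses (real hardware walks multi-level page tables).

```python
PAGE_SHIFT = 12                 # assume 4 KiB pages for the example
PAGE_SIZE = 1 << PAGE_SHIFT


class ToyIommuDomain:
    """Hypothetical single-level map from guest-physical to host-physical frames."""

    def __init__(self):
        self._table = {}        # guest frame number -> host frame number

    def map(self, gpa, hpa, size):
        """Install mappings for a page-aligned range, as a VMM would at assignment time."""
        if gpa % PAGE_SIZE or hpa % PAGE_SIZE or size % PAGE_SIZE:
            raise ValueError("addresses and size must be page aligned")
        for offset in range(0, size, PAGE_SIZE):
            self._table[(gpa + offset) >> PAGE_SHIFT] = (hpa + offset) >> PAGE_SHIFT

    def translate(self, gpa):
        """Translate one device-visible address, or fail like a DMA remapping fault."""
        frame = self._table.get(gpa >> PAGE_SHIFT)
        if frame is None:
            raise LookupError(f"DMA fault: no mapping for {gpa:#x}")
        return (frame << PAGE_SHIFT) | (gpa & (PAGE_SIZE - 1))


domain = ToyIommuDomain()
domain.map(gpa=0x1000_0000, hpa=0x8_4000_0000, size=2 * PAGE_SIZE)
translated = domain.translate(0x1000_1234)   # falls in the second mapped page
```

An address outside the mapped range raises LookupError, which stands in for the fault the IOMMU reports when a device touches memory it was never granted.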

The Role of Address Translation Services (ATS)

ATS is an optional PCIe capability that lets a device hold address translations itself rather than relying on the IOMMU for every access. An ATS-enabled device requests translations from the IOMMU when needed and caches them on-device, in its Address Translation Cache, for faster DMA operations. This reduces IOMMU TLB miss overhead and is especially beneficial for high-throughput devices.

When ATS is active, the device sends ATS Translation Requests to the IOMMU. The IOMMU responds with a successful translation or an error. The device can then use the translation for subsequent DMA transactions until the cache entry is invalidated or expires. This protocol requires continuous cooperation between the device firmware, driver, and the IOMMU hardware and software.
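
A minimal sketch of that request, cache, and invalidate cycle follows. It is a toy model under stated assumptions: ToyIommu, ToyAtsDevice, and their methods are hypothetical names invented for this illustration, and a page-number dictionary stands in for the real translation tables and on-device cache.

```python
class ToyIommu:
    """Hypothetical IOMMU that answers translation requests from a fixed table."""

    def __init__(self, mappings):
        self.mappings = dict(mappings)   # untranslated page -> translated page
        self.requests_served = 0

    def translation_request(self, page):
        # None models an unsuccessful translation completion.
        self.requests_served += 1
        return self.mappings.get(page)


class ToyAtsDevice:
    """Hypothetical endpoint that caches translations on-device."""

    def __init__(self, iommu):
        self.iommu = iommu
        self.cache = {}                  # stands in for the device's translation cache

    def dma_page(self, page):
        """Return the translated page, asking the IOMMU only on a cache miss."""
        if page in self.cache:
            return self.cache[page]
        result = self.iommu.translation_request(page)
        if result is None:
            raise LookupError(f"translation failed for page {page:#x}")
        self.cache[page] = result
        return result

    def invalidate(self, page):
        """Drop a cached entry and acknowledge, as a responsive device would."""
        self.cache.pop(page, None)
        return True                      # the completion the host waits for


iommu = ToyIommu({0x10: 0x840})
device = ToyAtsDevice(iommu)
device.dma_page(0x10)                    # miss: one request reaches the IOMMU
device.dma_page(0x10)                    # hit: served from the device cache
requests_before_invalidate = iommu.requests_served
device.invalidate(0x10)
device.dma_page(0x10)                    # miss again after invalidation
```

The point to notice is the return value of invalidate(): the host side depends on that acknowledgement, and a device that has vanished can no longer send it.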

The CVE-2026-43161 Vulnerability

The core of CVE-2026-43161 lies in the kernel code that processes ATS invalidation completions and error scenarios. Under certain timing conditions, if a passed-through physical device becomes inaccessible—for example, it is physically removed, suffers a fatal hardware error, or its firmware stops responding—the IOMMU driver may enter an infinite loop or deadlock while waiting for a reply that never arrives.

According to the NVD entry, the flaw exists in the drivers/iommu/intel/iommu.c file, within the function responsible for handling ATS queue completions. When the device is marked as absent but an outstanding ATS request lingers, the driver spins indefinitely while holding a critical spinlock. This blocks all other IOMMU operations and eventually hard-locks the entire host because the lock is taken in an atomic context that cannot be preempted.

The lockup is fatal to the operating system. No kernel panic or oops message is generated because the CPU is trapped in a tight loop with interrupts disabled. The only recovery is a physical reset or power cycle. The vulnerability is classified as a local denial-of-service, but its CVSS score is expected to be high (around 7.0 to 7.5) because it requires low privileges and can be triggered by simply unplugging a device or exploiting a firmware crash.

Affected Systems and Scenarios

Any Linux system that meets the following conditions is vulnerable:
- Uses an Intel processor with VT-d support and IOMMU enabled (intel_iommu=on).
- Passes through at least one ATS-capable PCIe device on which the host kernel has enabled ATS.
- Runs an affected kernel. The vulnerability was introduced by a 2022 commit that reworked ATS handling, but NVD has not yet listed exact version ranges. Distributions shipping kernels 5.15 through 6.8 are likely impacted.
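
To check the second condition on a given host, one option is to walk each device's PCIe extended capability list and look for the ATS capability. The sketch below assumes the standard layout (extended capabilities start at offset 0x100, ATS has capability ID 0x000F, and bit 15 of its control register at offset 0x06 is the enable bit); the function names find_ats and scan_host are made up for this example. Reading the full config space from sysfs generally requires root, so unprivileged runs will see a truncated buffer and report no ATS.

```python
import struct
from pathlib import Path

PCI_EXT_CAP_START = 0x100     # first PCIe extended capability
PCI_EXT_CAP_ID_ATS = 0x000F   # Address Translation Services
ATS_CTRL_OFFSET = 0x06        # ATS control register within the capability
ATS_CTRL_ENABLE = 0x8000      # enable bit (bit 15)


def find_ats(config: bytes):
    """Return (supported, enabled) for one device's raw PCI config space."""
    offset = PCI_EXT_CAP_START
    visited = set()           # guards against malformed, looping capability lists
    while offset >= PCI_EXT_CAP_START and offset + 8 <= len(config) and offset not in visited:
        visited.add(offset)
        (header,) = struct.unpack_from("<I", config, offset)
        if header in (0, 0xFFFFFFFF):
            break             # empty list or device not responding
        if header & 0xFFFF == PCI_EXT_CAP_ID_ATS:
            (ctrl,) = struct.unpack_from("<H", config, offset + ATS_CTRL_OFFSET)
            return True, bool(ctrl & ATS_CTRL_ENABLE)
        offset = (header >> 20) & 0xFFC   # next-capability pointer, dword aligned
    return False, False


def scan_host(sysfs_root="/sys/bus/pci/devices"):
    """Map each PCI address to (supported, enabled); run as root for full config space."""
    results = {}
    for dev in sorted(Path(sysfs_root).glob("*")):
        try:
            results[dev.name] = find_ats((dev / "config").read_bytes())
        except OSError:
            continue          # device disappeared or config space unreadable
    return results
```

Calling scan_host() as root and filtering for entries equal to (True, True) gives a rough list of devices worth reviewing; it does not tell you whether the running kernel contains the vulnerable code.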

Common configurations include:
- VFIO-based GPU passthrough for gaming VMs on Linux desktops. Many modern GPUs (NVIDIA, AMD) report ATS support, and the host kernel generally enables it when the platform advertises support.
- SR-IOV virtual functions assigned to VMs. Network cards like Intel X710 or Mellanox ConnectX series rely heavily on ATS for performance.
- Cloud infrastructure using direct device assignment for tenants. A noisy neighbor or compromised VM could force a device removal, crashing the host and affecting all other tenants.
- Embedded and industrial systems that hot-swap PCIe cards while IOMMU is active.

The attack vector does not require a malicious driver inside the guest—simply triggering a hardware fault from the guest (e.g., through a DMA attack or faulty firmware interaction) could be enough. In the worst case, an attacker with physical access to the server could yank out the passed-through device, instantly locking up the host. This makes the vulnerability a concern for data centers where physical security may be less controlled.

Technical Deep Dive

To understand why the flaw leads to a hard lock, consider the ATS request lifecycle. When an endpoint device needs a translation, it sends a PCIe TLP (Transaction Layer Packet) containing the requester ID and the virtual address. The root complex forwards this to the IOMMU, which looks up the address in its tables. If found, the IOMMU sends a reply with the physical address; if not, it sends an error. All of this happens asynchronously.

The kernel’s IOMMU driver manages a circular buffer of pending ATS requests. A dedicated work queue processes completions. In the vulnerable code path, when a device is asynchronously removed (for instance, via a surprise hot-unplug), the driver attempts to clean up resources. It calls intel_iommu_ats_invalidate_device(). This function needs to wait for all in-flight ATS requests to complete. It does so by spinning in a loop, checking is_ats_transaction_pending(), while holding the global IOMMU spinlock (device_domain_lock).

If the device firmware or hardware has stopped responding, perhaps because the device is already physically gone or its PCIe link is down, the pending counter never reaches zero. The tight loop with the spinlock held prevents any other IOMMU operation, including the asynchronous work that would normally handle the completion. This is a classic deadlock: the waiter holds the very lock that the work it is waiting on needs in order to make progress.
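
The pattern can be reproduced in user space with ordinary threads. The following is a toy model, not kernel code: a threading.Lock stands in for device_domain_lock, the simulate function and its parameters are hypothetical, and the wait loop is bounded so the demonstration terminates, whereas the loop described above is not.

```python
import threading
import time


def simulate(release_lock_while_waiting: bool, spins: int = 50) -> int:
    """Return how many requests are still pending when the wait loop gives up."""
    lock = threading.Lock()            # stands in for the global IOMMU lock
    state = {"pending": 3}
    stop = threading.Event()

    def completion_worker():
        # Retiring a request requires the same lock the waiter may be holding.
        while not stop.is_set() and state["pending"] > 0:
            if lock.acquire(timeout=0.001):
                try:
                    state["pending"] -= 1
                finally:
                    lock.release()

    lock.acquire()                     # cleanup path takes the lock first
    worker = threading.Thread(target=completion_worker)
    worker.start()
    for _ in range(spins):             # bounded here so the demo always ends
        if state["pending"] == 0:
            break
        if release_lock_while_waiting:
            lock.release()             # give the worker a chance to run
            time.sleep(0.002)
            lock.acquire()
        else:
            time.sleep(0.002)          # keep waiting with the lock held
    remaining = state["pending"]
    lock.release()
    stop.set()
    worker.join()
    return remaining


stuck = simulate(release_lock_while_waiting=False)
drained = simulate(release_lock_while_waiting=True)
```

With the lock held throughout, the worker never gets in and the pending count does not move; dropping the lock between polls lets it drain. In the kernel there is no bounded loop and no scheduler to rescue the CPU, which is why the result is a hard lock rather than a slow path.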

A patch is expected to introduce a timeout mechanism and to release the spinlock while waiting, or to abort pending ATS requests cleanly when a device is flagged as removed. The Linux kernel community is already discussing the fix, and distribution kernels are likely to receive backported patches shortly.
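
The general shape of such a fix, a bounded wait that also bails out when the device is gone, might look like the sketch below. This is not the actual patch: wait_for_ats_drain, DrainResult, and the callback parameters are hypothetical, and the sketch assumes the caller has already dropped any lock that completion handling needs.

```python
import time
from enum import Enum


class DrainResult(Enum):
    DRAINED = "drained"
    DEVICE_GONE = "device_gone"
    TIMED_OUT = "timed_out"


def wait_for_ats_drain(pending, device_present, timeout_s=0.05, poll_s=0.001,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll until no requests are pending, the device vanishes, or time runs out.

    `pending` returns the number of in-flight requests and `device_present`
    returns False once the device has been flagged as removed.
    """
    deadline = clock() + timeout_s
    while True:
        if pending() == 0:
            return DrainResult.DRAINED
        if not device_present():
            return DrainResult.DEVICE_GONE   # caller should abort outstanding requests
        if clock() >= deadline:
            return DrainResult.TIMED_OUT     # caller should log and fail the teardown
        sleep(poll_s)


healthy = wait_for_ats_drain(pending=lambda: 0, device_present=lambda: True)
unplugged = wait_for_ats_drain(pending=lambda: 2, device_present=lambda: False)
wedged = wait_for_ats_drain(pending=lambda: 2, device_present=lambda: True, timeout_s=0.01)
```

Every exit path returns to the caller, so a dead device costs at most the timeout instead of the whole host.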

Mitigations and Workarounds

Until an official kernel patch is available, administrators can apply several mitigations:
- Disable ATS globally: Booting with intel_iommu=off would avoid the bug, but it disables the IOMMU entirely and with it passthrough, which is rarely acceptable. Instead, keep the IOMMU on (intel_iommu=on) and add pci=noats, the mainline kernel parameter that tells the kernel not to use PCIe ATS. That prevents ATS from being enabled on any device, bypassing the vulnerable code path.
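- Disable ATS per device: On kernels that expose a per-device ATS control in sysfs, you may be able to turn ATS off for a single device, for example: echo 0 > /sys/bus/pci/devices/DDDD:BB:DD.F/ats_enabled. Check that the attribute exists on your kernel before relying on it. It requires the device to be unbound first and may not be possible once the device is assigned to a VM.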
- Use strict device reset methods: Before hot-unplugging a passed-through device, cleanly detach it from the VM and unbind it from the VFIO driver. Scripts or manual procedures that shut the guest down gracefully and then reset the function (for example, by writing 1 to the device's sysfs reset file, which invokes pci_reset_function() in the kernel) can help.
- Limit physical access: This is obvious but critical. For mission-critical servers, restrict who can physically touch PCIe slots or use locking mechanisms on chassis.
- Monitor kernel logs: Although the lockup is hard, some signs like excessive IOMMU errors or ATS timeouts might precede it. Setting up persistent logging or a hardware watchdog can aid in recovery.
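
To confirm that a boot-time mitigation actually took effect, the kernel command line can be inspected after reboot. The helper below is a minimal sketch: parse_cmdline and ats_mitigation_status are hypothetical names, the parser does not handle quoted values containing spaces, and it looks for a noats option on either intel_iommu= or pci= because which spelling a given kernel honors should be verified against that kernel's parameter documentation. It only reports what was requested on the command line, so an IOMMU enabled by build-time default will still show iommu_on as False.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Map each kernel parameter to the set of comma-separated options given for it."""
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        options = params.setdefault(key, set())
        if value:
            options.update(value.split(","))
    return params


def ats_mitigation_status(cmdline: str) -> dict:
    """Summarize IOMMU and ATS settings requested on the command line."""
    params = parse_cmdline(cmdline)
    return {
        "iommu_on": "on" in params.get("intel_iommu", set()),
        "noats_requested": any(
            "noats" in params.get(key, set()) for key in ("pci", "intel_iommu")
        ),
    }


def read_host_cmdline(path="/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return handle.read().strip()


example = ats_mitigation_status(
    "BOOT_IMAGE=/vmlinuz root=/dev/sda1 intel_iommu=on pci=noats quiet"
)
```

On a real host, ats_mitigation_status(read_host_cmdline()) can be run from a configuration-management check so that hosts missing the mitigation are flagged before the patched kernel arrives.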

For cloud operators, the immediate step is to disable ATS on hypervisor hosts until the patched kernel is rolled out. Performance-sensitive workloads that rely on ATS for throughput may need to weigh the risk. Network function virtualization (NFV) and storage appliances using 100 GbE cards will see a performance drop, but system stability takes priority.

Historical Context and Similar Flaws

Hardware-adjacent denial-of-service and device-assignment flaws are not unprecedented. In 2019, CVE-2019-0154 described an access-control flaw in Intel processor graphics that let a local user cause a denial of service. More recently, CVE-2024-21823 and CVE-2024-21824 in the Linux kernel’s VFIO subsystem allowed guest-to-host escalation through DMA remapping flaws.

What sets CVE-2026-43161 apart is the hard-lock outcome. Most IOMMU bugs result in a kernel oops, privilege escalation, or information leak, all of which can be contained or recovered from with a reboot. A loop that hangs the physical machine without any output is particularly dangerous because it can trip up high-availability clustering software that expects crash dumps or watchdog resets. Clusters may not fail over cleanly if the lockup is total and the node does not respond to STONITH.

The Path Forward

Linux kernel maintainers have acknowledged the report and are coordinating with major distributions. A fix is expected to land in the first round of stable updates after CVE publication. Red Hat, SUSE, Canonical, and others will issue errata with backports. Windows and macOS are unaffected, and Windows Subsystem for Linux (WSL2) does not use PCI passthrough, so it is not a vector either.

For the broader community, this CVE highlights the delicate interplay between hardware features designed for performance and the software stack that must handle edge cases. ATS was meant to offload translations, but the complexity of race conditions during device hot-removal was underestimated. The incident serves as a reminder that even mature, well-tested subsystems like Linux’s IOMMU can harbor latent deadlocks.

In the long term, the kernel community may consider architectural changes: timeout-based acquisition of IOMMU locks, more robust ATS completion tracking, or moving some processing out of atomic contexts. Hardware designers might also implement more graceful link-down notifications so that software can avoid spinning on a dead device.

Conclusion

CVE-2026-43161 is a high-severity denial-of-service vulnerability in the Linux kernel that can hard-lock entire physical servers when a passed-through PCIe device with ATS malfunctions or is removed. It affects only Linux hosts, but anyone running virtualized environments with direct device assignment on Intel hardware should take immediate action: disable ATS now and apply the patch once it is available. The flaw underscores the operational risks of mixing high-performance I/O optimizations with insufficient error handling, and it will likely prompt renewed scrutiny of IOMMU driver code paths across all platforms.