Phison-Based NVMe SSDs Disappear After Windows 11 KB5063878 Update, Users Warn of Data Corruption

Microsoft’s August 12, 2025 cumulative update for Windows 11 24H2 (KB5063878) has been linked to a severe storage regression that can cause NVMe SSDs to suddenly disconnect during sustained write operations. Independent testers and community members report that after writing roughly 50GB of data sequentially, the affected drives vanish from Device Manager and Disk Management, and in some cases, fail to reappear after a reboot—taking all data written during the transfer window with them.

The Trigger: 50GB Writes and Vanishing Drives

Community testers have zeroed in on a specific, repeatable workload profile that triggers the failure. The reports describe a large transfer (such as a game update via Steam, a video export, or a backup operation) that exceeds approximately 50GB in total while the drive's utilization is above roughly 60%; at that point the write process abruptly halts. The target SSD disappears from File Explorer, Device Manager, and Disk Management simultaneously. Vendor tools stop reading SMART and controller attributes, which suggests a deep controller-level hang rather than a simple driver timeout.
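As a rough illustration, the reported trigger profile can be expressed as a pre-flight check before starting a large copy. This is a minimal sketch, not a diagnostic tool: the function names are hypothetical, the thresholds are the approximate community figures rather than vendor-confirmed limits, and it assumes that "utilization" means the fraction of the drive's capacity already in use.

```python
import shutil

# Approximate figures from community reports; not confirmed by Microsoft or any vendor.
WRITE_THRESHOLD_BYTES = 50 * 10**9  # roughly 50 GB in one sustained transfer
FILL_THRESHOLD = 0.60               # assumption: "utilization" = share of capacity in use


def matches_reported_trigger(used_bytes: int, total_bytes: int,
                             planned_write_bytes: int) -> bool:
    """Return True when a planned transfer resembles the reported failure profile."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    fill_ratio = used_bytes / total_bytes
    return (fill_ratio > FILL_THRESHOLD
            and planned_write_bytes >= WRITE_THRESHOLD_BYTES)


def check_path(path: str, planned_write_bytes: int) -> bool:
    """Look up the volume that holds `path` and apply the same check."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return matches_reported_trigger(usage.used, usage.total, planned_write_bytes)
```

A False result does not mean a drive is safe. It only means the planned transfer does not match the pattern that testers have described so far.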

Multiple users, including @Necoru_cat, have demonstrated that the bug is reproducible on affected systems. The failure manifests not as a graceful error message but as a complete device removal from the PCIe/NVMe topology. A reboot sometimes temporarily restores visibility, but in a minority of cases, the drive returns with corrupted metadata or partitions that are missing entirely. The data written during the failure window is often unrecoverable, turning a routine file copy into a data loss event.

This behavior echoes an earlier incident during the Windows 11 24H2 feature rollout in late 2024, when changes to Host Memory Buffer (HMB) allocation caused BSOD loops on certain Western Digital and SanDisk DRAM-less SSDs. That problem was resolved through vendor firmware updates and registry workarounds, but it established a worrying pattern: subtle host-side storage stack modifications can expose latent controller firmware flaws with catastrophic results.

Affected Hardware: Phison Controllers and Beyond

The community has compiled unofficial lists of SSDs that exhibit the disconnect behavior under KB5063878. The common thread is Phison controllers, particularly the PS5012-E12 family. However, the issue is not limited to a single model or firmware revision, and not every drive of a named model will fail. Factors such as exact SKU, firmware version, and motherboard/BIOS configuration appear to modulate the risk.

Drives reported as affected in early aggregations include:
- Corsair Force MP600 (Phison family)
- Phison PS5012-E12 and related Phison-based SKUs
- Kioxia Exceria Plus G4 (Phison-based variants)
- Fikwot FN955 and other third-party Phison-based boards
- SanDisk Extreme Pro M.2 NVMe (some reports)
- Various DRAM-less or HMB-reliant models noted in Japanese community testing

Conversely, several high-end models have not shown widespread problems in sampled lists:
- Samsung 990 PRO / 980 PRO series
- Certain Solidigm and Seagate enterprise NVMe drives
- Some WD and Crucial high-end models (though firmware and variant matter greatly)

These lists are investigative leads, not definitive recall notices. Community reports also mention that some HDDs exhibited similar symptoms, which could hint at a broader I/O stack regression, but the large majority of reports involve NVMe drives.

Technical Post-Mortem: Why Drives Disappear

Sustained sequential writes stress every layer of the storage subsystem: application buffers, the Windows page cache, the kernel I/O scheduler, the NVMe driver, and the SSD controller’s internal flash translation layer (FTL) and garbage collection routines. A subtle change introduced by KB5063878—possibly in timing, buffer sizing, DMA handling, or HMB negotiation—can push a controller into an edge condition where firmware mishandles a command sequence and locks up.

When the controller stops responding to admin commands, the OS may treat the device as removed from the PCIe bus. This explains the consistent symptom set: unreadable SMART data, disappearance from Device Manager, and corruption of in-flight writes. The controller essentially “goes dark” at the hardware level.

HMB remains a plausible cofactor, especially for DRAM-less SSDs that borrow host RAM for mapping tables and caches. The earlier 24H2 BSOD wave on WD/SanDisk drives was directly caused by an HMB size or policy change. Current reports do not uniformly pinpoint HMB as the sole root cause this time, but it may be part of the failure chain for many DRAM-less designs.

An alternative hypothesis gaining traction among enthusiasts points to host PCIe controller or chipset issues. Anecdotal evidence shows that moving an affected drive from an Intel platform (e.g., Z790) to an AMD AM5 system eliminated the disconnect events. Intel’s Raptor Lake processors have a known history of voltage-related instability that required microcode patches, and some users suspect a chipset-level PCIe regression. However, no vendor or Microsoft telemetry has confirmed this. The CPU/chipset theory remains unverified and should be treated with caution.

What Microsoft and Vendors Have Said (So Far)

As of late August 2025, Microsoft has not added the storage regression to the KB5063878 known issues list. The official KB article focuses on the update’s security and quality improvements without acknowledging the drive failure reports. This silence is typical early in a complex compatibility incident; vendor forensics often follow community telemetry.

Independent tech outlets—Igor’s Lab, Tom’s Hardware, Notebookcheck, Guru3D, and others—have reproduced the failure and aggregated affected model lists. SSD manufacturers have yet to issue formal statements directly linking KB5063878 to specific firmware flaws, but the earlier HMB episode demonstrated that vendor firmware updates can resolve such issues. Western Digital, for example, released firmware fixes in response to the 24H2 HMB BSODs, and a similar path is expected now.

Immediate Steps for Users and Administrators

The situation demands conservative action, especially for users whose systems have installed KB5063878 and who rely on NVMe SSDs for critical work. The following checklist synthesizes community recommendations, vendor best practices, and Microsoft servicing mechanics:

- Stop heavy writes immediately. If a drive disappears mid-transfer, further writes risk worsening data corruption.
- Back up critical data now. Use a separate physical device or cloud storage. Do not rely on the suspect drive for backups.
- Do not initialize or format an inaccessible drive that holds critical data. Power it down and, if possible, create a sector-level forensic image to a safe target. Imaging can preserve recoverable data for diagnostics.
- Check SSD vendor utilities. Launch tools like WD Dashboard, Samsung Magician, Crucial Storage Executive, or Corsair Toolbox and verify firmware versions. If a vendor provides an update addressing a known issue, apply it only after backing up.
- Hold the update on managed fleets. Consider using WSUS or SCCM to temporarily hold KB5063878 on machines with vulnerable SSD models. Test the update on a representative sample with large-write workloads before broad deployment.
- Weigh uninstalling the cumulative update. Removal is possible but has security trade-offs. The KB article documents removal mechanics for the combined SSU+LCU package. Balance the risk of data loss against the security exposure before rolling back.
- Registry mitigation (short-term, not ideal). During the earlier HMB episode, some users set the HmbAllocationPolicy value under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\StorPort to limit or disable HMB allocation. This reduces performance but may serve as a stopgap for DRAM-less drives. Only use this if you understand the risks and have current backups.
- Capture diagnostic logs. Before replacing a failed drive, record SMART attributes (if accessible), kernel and event logs, exact firmware version, BIOS/UEFI version, and the KBs installed. This information is vital for vendor diagnostics and RMA processing.
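The diagnostic details in the last checklist item are easier to hand to a vendor when they are captured in one structured record. The sketch below is one possible shape for that record. The class name, field names, and example values are illustrative assumptions, not a format requested by Microsoft or any SSD vendor, and the values must still be gathered by hand or with the vendor's own tools.

```python
import json
import platform
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class DriveIncidentRecord:
    """Hypothetical container for the details listed in the checklist."""
    drive_model: str
    firmware_version: str
    bios_version: str
    installed_kbs: list
    # Left empty when the drive no longer answers SMART queries.
    smart_attributes: dict = field(default_factory=dict)
    notes: str = ""
    # Filled in automatically from the machine running the script.
    os_description: str = field(default_factory=platform.platform)
    captured_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize the record for attaching to a support ticket or RMA."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Example with placeholder values only; none of these describe a real drive.
record = DriveIncidentRecord(
    drive_model="EXAMPLE-MODEL-1TB",
    firmware_version="EXAMPLE-FW",
    bios_version="EXAMPLE-BIOS",
    installed_kbs=["KB5063878"],
    notes="Drive vanished during a large file copy.",
)
print(record.to_json())
```

Recording the timestamp and OS description automatically keeps the record consistent across machines, which helps when several incidents from one fleet need to be compared.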

Longer-Term Remediation Outlook

History suggests that most of these incidents are resolved through vendor firmware updates. Once a manufacturer isolates the controller firmware edge case triggered by the Windows update, a firmware patch can be distributed via the vendor’s tool or through Windows Update itself. Microsoft may also deploy a Known Issue Rollback (KIR) or target specific hardware with a model-specific block until firmware is applied.

If the host platform theory gains validation—i.e., that Intel CPU or chipset PCIe behavior is a contributing factor—solutions could include BIOS updates, CPU microcode patches, or in worst-case scenarios, hardware RMA for degraded silicon. Intel’s prior Raptor Lake instability workstreams show that such remediation is feasible but often complex and drawn-out. For now, however, the primary focus remains on SSD firmware.

Administrators should monitor the Microsoft Release Health dashboard and vendor support pages for official advisories. When a fix arrives, it should be tested in a staging environment that mirrors production workloads before broad rollout.

Critical Appraisal: What We Know and What’s Unproven

The community response to this regression has been swift and technically sharp. By pinning down a reproducible trigger—sustained 50GB+ sequential writes—testers have given engineers a precise test case. This reproducibility is a powerful accelerant for root cause analysis and mitigation.
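For staging tests, a sustained sequential write of this kind can be approximated with a short script. The following is a minimal sketch under stated assumptions: the function name and the 64 MiB chunk size are arbitrary choices, it is not the tool the community testers used, and it should only be pointed at a sacrificial drive whose contents are fully backed up.

```python
import os
import time


def sustained_write(path: str, total_bytes: int,
                    chunk_bytes: int = 64 * 1024 * 1024) -> dict:
    """Write `total_bytes` sequentially to `path` and report how far it got."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    chunk = os.urandom(chunk_bytes)  # random data, so it cannot be compressed away
    written = 0
    error = None
    start = time.monotonic()
    try:
        with open(path, "wb") as f:
            while written < total_bytes:
                n = min(chunk_bytes, total_bytes - written)
                f.write(chunk[:n])
                f.flush()
                os.fsync(f.fileno())  # push the data past the OS cache to the device
                written += n          # counted only after the flush succeeded
    except OSError as exc:
        # A drive that drops off the bus is expected to surface here as an I/O error.
        error = repr(exc)
    return {
        "bytes_written": written,
        "seconds": time.monotonic() - start,
        "completed": written == total_bytes,
        "error": error,
    }
```

Calling `sustained_write(target, 60 * 10**9)` with a hypothetical `target` path on the drive under test would exceed the roughly 50GB figure from the reports. Syncing after every chunk makes the reported byte count trustworthy, but it also changes the I/O pattern compared with an ordinary file copy, so a clean run here does not rule out the regression.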

The primary risk, however, is severe: data loss. Users who do not maintain independent backups risk losing files permanently when a drive disconnects mid-write and returns corrupted metadata. For IT administrators, the update presents a classic dilemma: apply a security patch and risk storage failures, or defer the patch and accept known vulnerabilities.

Several claims in circulation should be treated with skepticism until officially confirmed:
- Intel CPU/chipset as the root cause. While moving a drive to an AMD system has stopped failures for some, this does not prove that the platform is at fault. It may simply change the I/O profile enough to avoid the edge condition. Vendor telemetry is required for a definitive statement.
- Model-level blacklists. The community lists are useful triage aids but are not comprehensive. The same SSD model can behave differently with a different firmware revision, motherboard, or BIOS setting. Do not assume immunity simply because your drive’s model does not appear on a list.

Final Word

KB5063878 has surfaced as a serious compatibility risk for a small but consequential set of storage configurations. The update can cause NVMe SSDs to disappear during heavy writes, with a real potential for data loss. Community testing has provided a clear repro profile and initial model lists, while vendor and Microsoft remediation paths—firmware updates, rollout controls, and temporary workarounds—are the established tools for resolution.

Users and admins must prioritize backups, avoid large sequential writes on suspect systems, keep firmware and BIOS up to date, and await coordinated advisories from SSD vendors and Microsoft before deploying the patch broadly. The fragility of the modern storage choreography—where OS, driver, and firmware must align perfectly—has rarely been more apparent, and the consequences of even a minor slip are immediate and often irreversible.