The Linux kernel community has disclosed CVE-2026-31531, a networking vulnerability in the IPv4 nexthop path that can trigger a kernel warning when users query very large nexthop groups through RTM_GETNEXTHOP. The issue, introduced in kernel version 6.3, affects systems with large Equal-Cost Multi-Path (ECMP) routing configurations and can lead to system instability.
Understanding the Vulnerability
At its core, the bug lies in how the kernel calculates the size of netlink messages when dumping nexthop groups. When a user-space tool sends an RTM_GETNEXTHOP request to retrieve nexthop information, the kernel uses the nexthop_nlmsg_size() function to estimate the required buffer size. That estimate did not scale correctly for large nexthop groups, so the kernel triggered a warning (WARN_ON) when the actual data exceeded the estimated size.
The warning message that appears in system logs looks like:
WARNING: CPU: 0 PID: 0 at net/ipv4/nexthop.c:970 __nh_notifier_single_devinfo
This warning is triggered because the kernel's internal consistency check fails when the netlink message buffer is too small.
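For context, the kind of request that reaches this code path can be sketched in user space. The example below is a minimal illustration, not code from the patch or from any tool: it fills in an RTM_GETNEXTHOP message that asks for one nexthop object by ID. The struct name nh_get_req and the function build_nh_get_req are hypothetical. The message types and struct nhmsg come from the Linux UAPI headers, which are assumed to be installed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/nexthop.h>

/* Hypothetical request layout: netlink header, nexthop header, one NHA_ID attribute. */
struct nh_get_req {
    struct nlmsghdr nlh;
    struct nhmsg    nhm;
    struct nlattr   id_attr;  /* attribute header for NHA_ID */
    uint32_t        id;       /* attribute payload: the nexthop ID */
};

/* Fill in a request for the nexthop (or nexthop group) with the given ID. */
static void build_nh_get_req(struct nh_get_req *req, uint32_t nh_id, uint32_t seq)
{
    assert(req != NULL);
    memset(req, 0, sizeof(*req));

    req->nlh.nlmsg_len   = sizeof(*req);
    req->nlh.nlmsg_type  = RTM_GETNEXTHOP;
    req->nlh.nlmsg_flags = NLM_F_REQUEST;
    req->nlh.nlmsg_seq   = seq;

    req->nhm.nh_family = AF_UNSPEC;

    req->id_attr.nla_type = NHA_ID;
    req->id_attr.nla_len  = NLA_HDRLEN + sizeof(req->id);
    req->id = nh_id;
}
```

A caller would send the filled-in struct over an AF_NETLINK socket opened with the NETLINK_ROUTE protocol. The kernel then has to size its reply for however many members the requested group contains, which is where the estimate described above comes into play.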
Technical Details
The problem specifically manifests in the rtm_nhmsg_len calculation. The kernel was using a fixed-size approach that didn't account for the additional space required when a nexthop group contains a large number of members (typically more than 32 entries). The nexthop_nlmsg_size() function in net/ipv4/nexthop.c was not including the necessary overhead for group metadata, leading to buffer underestimation.
The fix, submitted by a kernel developer and merged into the mainline, modifies the nexthop_nlmsg_size() function to properly account for the group's size. The patch adds nla_total_size(4) for the group type and adjusts the calculation to include nla_total_size(2) for each nexthop's configuration data. This ensures that the netlink message buffer is correctly sized for any number of nexthop entries.
Impact on Real-World Systems
For most Linux users, this bug will never surface. It only becomes problematic when:
- Running a router or load balancer with large ECMP groups (more than 32 paths)
- Using tools like ip nexthop or other netlink-based utilities to query nexthop state
- Monitoring systems that frequently dump routing tables
When triggered, the kernel warning can cause:
- Transient system instability as the warning handler executes
- Potential denial-of-service if the warning floods system logs
- In rare cases, a kernel panic if the system is configured to panic on warnings (the panic_on_warn setting)
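Whether a warning can escalate to a panic depends on the panic_on_warn setting, which Linux exposes at /proc/sys/kernel/panic_on_warn. The following is a minimal sketch of reading that value. The helper name read_panic_on_warn is hypothetical, and the path is passed in as a parameter so the function can be exercised against any file.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns the integer stored in the given file
 * (0 or 1 for panic_on_warn), or -1 if the file cannot be opened or parsed.
 */
static int read_panic_on_warn(const char *path)
{
    FILE *f;
    int value = -1;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* The sysctl file holds a single decimal integer. */
    if (fscanf(f, "%d", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}
```

Calling read_panic_on_warn("/proc/sys/kernel/panic_on_warn") and getting 1 back means any WARN, including the one described here, would halt the machine.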
Network administrators managing data centers or cloud infrastructure are most at risk. For example, a BGP router that has installed hundreds of equal-cost paths for a single destination as one nexthop group could trigger this bug when an operator or monitoring tool queries that group, for instance with ip nexthop show.
Affected Kernel Versions
The vulnerability was introduced in Linux kernel version 6.3 (released April 2023). All subsequent kernels up to and including 6.8 are affected. The fix has been backported to stable kernels starting from 6.6.30, 6.7.12, and 6.8.3. Kernels older than 6.3, including the older 5.x and 4.x LTS series, predate the change that introduced the bug and are not affected.
Mitigation and Patching
System administrators should take the following steps:
- Identify affected systems: Check the kernel version with uname -r. If it is 6.3 or later and older than the fixed versions listed below, the system is potentially vulnerable.
- Check for warnings: Look in dmesg or /var/log/kern.log for the __nh_notifier_single_devinfo warning.
- Apply the patch: Update to a fixed kernel version:
  - Mainline: 6.9-rc1 or later
  - Stable: 6.6.30+, 6.7.12+, 6.8.3+
  - Distribution kernels: Check with your vendor for updated packages
- Workaround: If patching immediately is not possible, reduce the size of ECMP groups to 32 paths or fewer. This can be done by adjusting routing policies or using route summarization.
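The version check in the first step can be automated. The sketch below encodes only the version ranges given in this article. The function name nh_bug_possibly_present is hypothetical. Distribution kernels often carry backported fixes under an unchanged upstream version number, so treat the result as a first-pass filter rather than a verdict.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper. Takes a release string such as the one printed by
 * uname -r and returns 1 if it falls in the affected range stated in the
 * article, 0 if it does not, and -1 if the string cannot be parsed.
 */
static int nh_bug_possibly_present(const char *release)
{
    int major = 0, minor = 0, patch = 0;

    assert(release != NULL);

    /* Accepts "6.6.30-generic" as well as "6.9-rc1"; a missing patch level stays 0. */
    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
        return -1;

    if (major != 6)
        return 0;           /* 4.x and 5.x predate the bug, later majors postdate the fix */
    if (minor < 3 || minor >= 9)
        return 0;           /* before the regression, or 6.9-rc1 and later */

    switch (minor) {
    case 6:
        return patch < 30;  /* fixed in 6.6.30 */
    case 7:
        return patch < 12;  /* fixed in 6.7.12 */
    case 8:
        return patch < 3;   /* fixed in 6.8.3 */
    default:
        return 1;           /* 6.3 to 6.5: no fixed release listed */
    }
}
```

On a live system the release string can be taken from the release field that uname(2) fills in, which is the same value uname -r prints.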
Technical Analysis of the Fix
The patch, identified by commit hash a3c5e1d, modifies the nexthop_nlmsg_size() function in net/ipv4/nexthop.c. The key change is:
  static size_t nexthop_nlmsg_size(struct nexthop *nh)
  {
-     size_t sz = sizeof(struct rtmsg) + nla_total_size(4); /* NHA_ID */
+     size_t sz = sizeof(struct rtmsg) + nla_total_size(4) /* NHA_ID */
+               + nla_total_size(4); /* NHA_GROUP_TYPE */
      if (nh->is_group) {
          struct nh_group *nhg = rcu_dereference_rtnl(nh->nh_grp);
-         sz += nla_total_size(2 * nhg->num_nh);
+         sz += nla_total_size(2) * nhg->num_nh;
      } else {
          /* ... */
      }
  }
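The critical fix is the multiplication: nla_total_size(2) * nhg->num_nh instead of nla_total_size(2 * nhg->num_nh). nla_total_size() adds one netlink attribute header to the payload length it is given and pads the result to a 4-byte boundary. The new expression therefore reserves a header and padding for every nexthop entry, while the old one reserved only a single header for the combined payload of all entries.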
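The gap between the two expressions is easy to see with a small user-space calculation. The sketch below re-creates the kernel's attribute sizing rule (a 4-byte attribute header, with the total rounded up to a multiple of 4) and applies it to both forms. The function names size_one_attr and size_per_entry are hypothetical, and the 2-byte per-entry payload is taken from the excerpt above rather than from the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NLA_HDRLEN 4u  /* size of a netlink attribute header, already 4-byte aligned */

/* Round a length up to the next multiple of 4, as NLA_ALIGN does. */
static size_t nla_align(size_t len)
{
    return (len + 3u) & ~(size_t)3u;
}

/* User-space stand-in for the kernel helper: header + payload, padded. */
static size_t nla_total_size(size_t payload)
{
    return nla_align(NLA_HDRLEN + payload);
}

/* Old expression: one header wrapped around the payload of all entries. */
static size_t size_one_attr(size_t num_nh)
{
    return nla_total_size(2 * num_nh);
}

/* New expression: a header and padding reserved for every entry. */
static size_t size_per_entry(size_t num_nh)
{
    assert(num_nh <= SIZE_MAX / nla_total_size(2));  /* guard against overflow */
    return nla_total_size(2) * num_nh;
}
```

For a group of 64 entries, size_one_attr(64) is 4 + 128 = 132 bytes, while size_per_entry(64) is 8 * 64 = 512 bytes. The shortfall grows linearly with the number of members, which fits the article's point that only large groups hit the warning.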
Conclusion
CVE-2026-31531 is a moderate-severity bug that primarily affects network-heavy Linux deployments. While it doesn't allow remote code execution, it can cause denial-of-service conditions in environments relying on large ECMP groups. Because the fix has been backported to the stable series, most users can pick it up through a routine kernel update. Network administrators should prioritize updating their systems to avoid potential instability during critical operations.