Last week, Canonical disclosed CVE-2026-31431, nicknamed “Copy Fail” — a HIGH severity (CVSS 7.8) local privilege escalation in the Linux kernel. Any unprivileged local user on an affected system can gain full root access. If you run Ubuntu servers, this one matters.
I manage 41 Ubuntu servers across versions 20.04, 22.04, and 24.04. I spent an evening auditing every single one, found one actively exposed, applied the mitigation, and ran a forensic compromise check. This post documents the entire process.
What Is CVE-2026-31431 “Copy Fail”?
The vulnerability lives in algif_aead — a Linux kernel module that exposes AEAD (Authenticated Encryption with Associated Data) cryptographic operations to userspace via AF_ALG sockets. Think AES-GCM, AES-CCM, ChaCha20-Poly1305 — the ciphers that secure TLS, WireGuard, and disk encryption.
How the Exploit Works
The attack chains three kernel components together:
- AF_ALG Socket: An unprivileged user opens a kernel crypto socket bound to authencesn(hmac(sha256),cbc(aes)) and performs AEAD decryption.
- Splice Mechanism: The exploit uses splice() to feed page cache pages from readable system files (like /usr/bin/su) directly into the crypto subsystem’s scatterlist, without copying the data.
- Scratch Space Abuse: During decryption, authencesn performs a 4-byte write past the AEAD tag into adjacent page cache pages, crossing into the target binary’s cached memory.
The attacker constructs multiple sendmsg/splice pairs, each writing 4 bytes of shellcode into the in-memory image of a system binary, then executes it. Root gained.
Why It’s Stealthy
This is what makes “Copy Fail” particularly nasty: the on-disk file is never modified. The exploit corrupts only the kernel’s page cache — the in-memory representation of the file. The dirty page is never written back to disk. Traditional file integrity tools checking on-disk checksums will find nothing wrong.
Which Ubuntu Versions Are Affected?
| Ubuntu Version | Status |
|---|---|
| 20.04 LTS (Focal) | ✅ Affected |
| 22.04 LTS (Jammy) | ✅ Affected |
| 24.04 LTS (Noble) | ✅ Affected |
| 26.04 (Resolute) | ❌ Not affected |
All three Ubuntu LTS versions I run in production are affected. The patched kernel from Canonical is rolling out — until it arrives, the recommended interim fix is to disable the vulnerable module.
Auditing 41 Servers in Under 3 Minutes
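The first question: is the algif_aead module even loaded? It only loads on demand. If nothing has opened an AF_ALG socket requesting AEAD operations since boot, the vulnerable code is not resident, even though it ships with the kernel. One caveat: the kernel autoloads the module the first time a process binds an AF_ALG socket of type aead, so “not loaded” tells you the interface has not been used so far, not that it cannot be.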
# Check if the vulnerable module is loaded
grep -qE '^algif_aead ' /proc/modules && echo "STATUS: LOADED (VULNERABLE)" || echo "STATUS: NOT LOADED (safe)"
# Check for active AF_ALG sockets
ss -a | grep alg
# Check if anything is using it
lsof 2>/dev/null | grep algif
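# Check if OpenSSL has an AF_ALG engine available (stock OpenSSL names it "afalg"; "af_alg" is an older third-party engine)
{ openssl engine -v afalg || openssl engine -v af_alg; } >/dev/null 2>&1 && echo "AF_ALG engine present" || echo "No AF_ALG engine"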
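Another signal worth reading is the module's reference count, the third field of its line in /proc/modules. A non-zero count suggests something currently holds the module. The following is a minimal sketch, not part of the original audit: `module_status` is a hypothetical helper name, and its optional second argument exists only so the function can be pointed at a saved copy of the file.

```shell
# module_status NAME [MODULES_FILE]
# Prints whether NAME appears in /proc/modules and, if so, its reference count.
# /proc/modules fields: name size refcount dependents state address
module_status() {
  awk -v m="$1" '
    $1 == m { printf "%s: loaded, refcount=%s\n", m, $3; found = 1 }
    END     { if (!found) printf "%s: not loaded\n", m }
  ' "${2:-/proc/modules}"
}

# Example:
#   module_status algif_aead
```

A count of zero with the module loaded is consistent with "loaded but idle", which is the situation described for the one exposed server.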
I ran this across all 41 servers simultaneously using parallel SSH subshells:
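# SERVERS is a bash array of hostnames or IPs, e.g. SERVERS=(10.0.0.1 10.0.0.2) (placeholder addresses)
for IP in "${SERVERS[@]}"; do
  (
    # BatchMode and ConnectTimeout stop one unreachable host from hanging the whole run
    RESULT=$(ssh -o BatchMode=yes -o ConnectTimeout=10 "root@$IP" 'grep -qE "^algif_aead " /proc/modules && echo "LOADED" || echo "safe"' 2>&1)
    echo "$IP: $RESULT"
  ) &
done
wait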
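With dozens of hosts, the per-host lines are easier to act on once they are tallied. The sketch below assumes the loop's output format of `host: RESULT`, with RESULT being exactly `LOADED` or `safe`. `tally_results` is a hypothetical helper name. Anything else, such as an SSH error message, is counted separately so that an unreachable host is never mistaken for a safe one.

```shell
# tally_results: reads "host: RESULT" lines on stdin and prints a summary.
tally_results() {
  awk -F': ' '
    $2 == "LOADED" { loaded++; hosts = hosts " " $1; next }
    $2 == "safe"   { safe++; next }
                   { failed++ }   # ssh errors, timeouts, unexpected output
    END {
      printf "safe=%d loaded=%d unreachable_or_error=%d\n", safe, loaded, failed
      if (loaded) printf "exposed:%s\n", hosts
    }
  '
}

# Example: save the loop output to a file, then
#   tally_results < audit.log
```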
Results: 40 Safe, 1 Exposed
| Result | Count |
|---|---|
| NOT LOADED (safe) | 40 servers |
| LOADED (vulnerable) | 1 server |
One server came back exposed — a busy Ubuntu 24.04 machine running dozens of Docker containers and CI/CD runners. The module was loaded but — critically — zero active AF_ALG sockets were open and no processes were using it. The attack surface existed, but it hadn’t been touched.
Workloads That Actually Use algif_aead
Before disabling anything, it’s worth understanding what legitimately depends on this module. The answer for most web stacks: very little.
| Workload | Risk | Why |
|---|---|---|
| IPSec / StrongSwan | High | Heavy AEAD user via kernel crypto |
| WireGuard | Low | In-kernel implementation calls its crypto routines directly, not through AF_ALG sockets |
| OpenSSL with AF_ALG engine | Medium | Only if engine explicitly configured |
| OpenVPN | Low | Uses userspace OpenSSL, not AF_ALG sockets |
| Nginx / Apache with TLS | Low | Only if AF_ALG engine enabled in OpenSSL |
| Standard web apps | None | No direct AF_ALG usage |
I also run several OpenVPN servers. All were completely safe — OpenVPN handles its crypto entirely in userspace via OpenSSL and never touches AF_ALG sockets. Verified by checking /proc/modules, ss -a | grep alg, and the OpenSSL engine on each one. All clean.
Applying the Mitigation
Until the patched kernel ships, disable algif_aead entirely. No reboot required.
Step 1 — Block it from loading on boot
echo "install algif_aead /bin/false" | sudo tee /etc/modprobe.d/disable-algif_aead.conf
This tells modprobe: whenever anything requests algif_aead, run /bin/false instead. It exits with failure every time, so the module never loads — even if another module or the kernel itself tries to pull it in.
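To confirm the block is actually in place on a host, the modprobe configuration directories can be searched for the install override. This is a rough sketch under a few assumptions: `module_blocked` is a hypothetical helper name, the default directory list covers the usual modprobe.d locations, and only overrides pointing at /bin/false or /bin/true are recognised. Underscores and dashes in the module name are treated as interchangeable, as modprobe does.

```shell
# module_blocked NAME [DIR...]
# Succeeds if any *.conf file in the given directories contains
# "install NAME /bin/false" (or /bin/true).
module_blocked() {
  name=$1; shift
  [ "$#" -gt 0 ] || set -- /etc/modprobe.d /run/modprobe.d /usr/lib/modprobe.d /lib/modprobe.d
  # Turn algif_aead into algif[_-]aead so either spelling matches
  pattern=$(printf '%s' "$name" | sed 's/[_-]/[_-]/g')
  grep -rqsE --include='*.conf' -- \
    "^[[:space:]]*install[[:space:]]+${pattern}[[:space:]]+/bin/(false|true)([[:space:]]|\$)" "$@"
}

# Example:
#   module_blocked algif_aead && echo "blocked" || echo "NOT blocked"
```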
Step 2 — Unload from the running kernel
sudo rmmod algif_aead 2>/dev/null
Step 3 — Verify
grep -qE '^algif_aead ' /proc/modules \
&& echo "Module is STILL loaded" \
|| echo "Module is NOT loaded — safe"
Checking for Compromise
Because “Copy Fail” only corrupts the page cache — never disk — standard file integrity tools won’t catch it. A reboot clears the page cache. But if an attacker already gained root and established persistence before the reboot, that survives. So I checked for persistence artifacts.
Page Cache Integrity Check
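Compare the page-cache copy of critical binaries against what is actually on disk. A plain md5sum reads through the page cache, so the on-disk hash has to come from a read that bypasses it (O_DIRECT, via dd iflag=direct). A mismatch is the smoking gun:
for bin in /usr/bin/su /usr/bin/sudo /bin/bash /usr/bin/passwd; do
  # O_DIRECT read: bypasses the page cache and hashes the bytes on disk.
  # On filesystems that reject O_DIRECT, dd fails and a spurious mismatch is reported.
  disk_md5=$(dd if="$bin" iflag=direct bs=1M 2>/dev/null | md5sum | awk '{print $1}')
  # Normal read: served from the page cache
  mem_md5=$(md5sum "$bin" | awk '{print $1}')
  if [ "$disk_md5" = "$mem_md5" ]; then
    echo "$bin: OK"
  else
    echo "$bin: MISMATCH (POSSIBLE COMPROMISE)"
  fi
done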
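A third reference point is the hash the package manager recorded at install time. Debian and Ubuntu packages ship lists in /var/lib/dpkg/info/<package>.md5sums, one `md5  relative/path` entry per file, with paths relative to the filesystem root. The sketch below is an illustration rather than part of the original audit. `verify_pkg_hashes` is a hypothetical helper, it assumes the package database itself has not been tampered with, and the list path must be absolute because the function changes directory.

```shell
# verify_pkg_hashes MD5SUMS_FILE [ROOT]
# Checks every file listed in MD5SUMS_FILE (paths relative to ROOT, default /).
# Prints only failures; exit status is non-zero if any file does not match.
verify_pkg_hashes() {
  ( cd "${2:-/}" && md5sum --check --quiet "$1" )
}

# Example (substitute the package that owns the binary you care about):
#   verify_pkg_hashes "/var/lib/dpkg/info/<package>.md5sums"
```

Because md5sum reads through the page cache here, a failure flags content that differs from what the package shipped, whether the difference is on disk or only in memory.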
Persistence Indicators
# Only root should have UID 0
awk -F: '$3==0 {print}' /etc/passwd
# Check for unauthorized SSH keys
cat /root/.ssh/authorized_keys
# New systemd services created recently
find /etc/systemd /lib/systemd -name "*.service" -newer /etc/hostname -ls
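# Suspicious files dropped in temp directories since the last boot
# (timestamps under /proc are not a reliable boot-time reference, so use uptime -s)
find /tmp /dev/shm /var/tmp -type f -newermt "$(uptime -s)" -ls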
# Who is currently logged in and recent login history
who && last | head -20
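When running these checks across many hosts, it helps if each one yields a clear pass or fail rather than output to eyeball. As one example, the UID 0 check can be wrapped as follows. This is a minimal sketch: `extra_uid0_accounts` is a hypothetical helper name, and the optional argument lets it read a copy of the passwd file instead of the live one.

```shell
# extra_uid0_accounts [PASSWD_FILE]
# Prints every account with UID 0 other than root.
# Exit status: 0 if none were found, 1 if at least one was.
extra_uid0_accounts() {
  awk -F: '
    $3 == 0 && $1 != "root" { print $1; n++ }
    END { exit (n > 0) }
  ' "${1:-/etc/passwd}"
}

# Example:
#   extra_uid0_accounts || echo "unexpected UID 0 account(s) listed above"
```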
My Results
| Check | Result |
|---|---|
| Page cache integrity (su, sudo, bash, passwd) | ✅ All match disk |
| UID 0 accounts | ✅ Only root |
| Unauthorized SSH keys | ✅ All known team keys |
| New systemd services | ✅ All pre-existing |
| Active AF_ALG sockets (at time of check) | ✅ None |
| Suspicious temp files | ✅ Explained (eBPF artifacts from a CI/CD pipeline) |
Verdict: Not compromised. The module was loaded opportunistically but never actively exploited.
Clearing the Page Cache Fleet-Wide
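As a final step, I flushed the page cache on all 41 servers. This drops clean cached pages so that files are re-read from disk on next access, which discards in-memory corruption in any page the kernel is able to evict. Pages currently mapped by running processes are skipped, so only a reboot resets the cache completely. The flush is safe and causes no data loss, though reads are slower for a while as the cache refills: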
sync; echo 3 > /proc/sys/vm/drop_caches
41/41 confirmed OK.
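One way to confirm that a flush did something is to read the Cached figure from /proc/meminfo before and after. The helper below is a small sketch: `cached_kb` is a hypothetical name, and its optional argument lets it parse a saved copy of the file.

```shell
# cached_kb [MEMINFO_FILE]
# Prints the size of the page cache in kB as reported by /proc/meminfo.
cached_kb() {
  awk '$1 == "Cached:" { print $2 }' "${1:-/proc/meminfo}"
}

# Example (as root):
#   before=$(cached_kb)
#   sync; echo 3 > /proc/sys/vm/drop_caches
#   after=$(cached_kb)
#   echo "page cache: ${before} kB -> ${after} kB"
```

The figure will not drop to zero, since pages that are dirty or in use stay resident.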
Summary
| Action | Status |
|---|---|
| Audited all 41 servers for module exposure | ✅ Done |
| Identified 1 exposed server | ✅ Done |
| Verified VPN servers are safe | ✅ Done |
| Applied interim mitigation on exposed server | ✅ Done |
| Forensic compromise check | ✅ Clean |
| Flushed page cache fleet-wide | ✅ Done |
| Patched kernel | ⏳ Pending — auto-update scheduled |
Quick Reference — All Commands
# 1. Check if module is loaded
grep -qE '^algif_aead ' /proc/modules && echo "VULNERABLE" || echo "safe"
# 2. Block from loading on boot
echo "install algif_aead /bin/false" | sudo tee /etc/modprobe.d/disable-algif_aead.conf
# 3. Unload from running kernel
sudo rmmod algif_aead 2>/dev/null
# 4. Verify it's gone
grep -qE '^algif_aead ' /proc/modules && echo "Still loaded" || echo "Safe"
# 5. Flush page cache (safe, no data loss)
sync; echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null
# 6. Permanent fix — apply patched kernel then reboot
sudo apt update && sudo apt install --only-upgrade linux-image-generic
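Installing the new kernel package does not change the running kernel until the machine reboots. Ubuntu signals a pending reboot by creating /var/run/reboot-required, so a quick check can look like the sketch below. `reboot_needed` is a hypothetical helper name, and the optional argument only exists to make the function testable against another path.

```shell
# reboot_needed [FLAG_FILE]
# Succeeds if the reboot-required flag file exists.
reboot_needed() {
  [ -e "${1:-/var/run/reboot-required}" ]
}

# Example:
#   echo "running kernel: $(uname -r)"
#   reboot_needed && echo "reboot pending" || echo "no reboot flag set"
```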
References: Ubuntu Security Blog · Ubuntu CVE Tracker · xint.io Technical Analysis
If you are running these servers on AWS EC2, you may also need to expand disk capacity as part of routine maintenance — see the guide on how to resize an AWS EBS volume without downtime.

