The server was running; then a process vanished. No segfault, no obvious error in the application logs. If the system journal contains the line "Out of memory: Kill process", that was the OOM killer: the Linux kernel decided memory was exhausted and picked a victim by its own algorithm. Here is how to find out what was killed, why that particular process was chosen, and how to prevent it.
Step 1: Confirm OOM Killer Is Responsible
Check the kernel journal:
sudo journalctl -k | grep -i "oom\|kill" | tail -30
Or through dmesg with timestamps:
dmesg -T | grep -i "oom\|killed process" | tail -30
Typical output:
[Mar 24 03:17:42] Out of memory: Kill process 14821 (php-fpm) score 847 or sacrifice child
[Mar 24 03:17:42] Killed process 14821 (php-fpm) total-vm:2048000kB, anon-rss:1843200kB, file-rss:4096kB, shmem-rss:0kB
The line gives the process name (php-fpm), PID (14821), oom score (847 — the higher the score, the likelier the kill), and the process's memory usage at the moment of death.
All OOM events in the last 24 hours:
sudo journalctl -k --since "24 hours ago" | grep "Out of memory"
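To see which services are recurring victims, the kill lines can be tallied by process name — a small sketch over the same journal output (the message format is the one shown above):

```shell
# Tally OOM kills per process name over the last week
sudo journalctl -k --since "7 days ago" \
  | sed -n 's/.*Killed process [0-9]* (\([^)]*\)).*/\1/p' \
  | sort | uniq -c | sort -rn
```

A service that shows up repeatedly here is leaking or underprovisioned, not just unlucky.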
Step 2: Understand Why That Process Was Chosen
The OOM killer does not kill randomly. Every process gets an oom_score from 0 to 1000 — the higher, the more attractive as a victim. On modern kernels the score is driven almost entirely by the process's memory footprint (RSS, swap and page tables) plus oom_score_adj, a manual adjustment; older kernels also weighed in process age and whether it ran as root.
Check the current score of any process by PID (pgrep -o takes only the oldest matching PID — with several matches a plain pgrep would produce a broken path):
cat /proc/$(pgrep -o nginx)/oom_score
Check the adjustment — oom_score_adj:
cat /proc/$(pgrep -o nginx)/oom_score_adj
oom_score_adj range: from -1000 (never kill) to +1000 (kill first).
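The effect is easy to see on a scratch process — raising the adjustment on your own process does not even need root (sleep here is just a placeholder workload):

```shell
# Demo: mark a scratch process as a preferred victim and read the score back
sleep 300 &
PID=$!
cat /proc/$PID/oom_score              # baseline, near 0 for a tiny process
echo 500 > /proc/$PID/oom_score_adj   # raising the adj needs no root on your own process
cat /proc/$PID/oom_score              # roughly baseline + 500
kill $PID
```

Lowering the adjustment below its current value is the privileged direction — that requires root (CAP_SYS_RESOURCE).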
Top processes by oom_score right now:
ps -eo pid --no-headers --sort=-%mem | head -10 | xargs -I{} sh -c 'printf "%s %s: " {} "$(cat /proc/{}/comm 2>/dev/null)"; cat /proc/{}/oom_score 2>/dev/null || echo 0' | sort -t: -k2 -rn
Step 3: See How Much Memory Was Used at the Time of the Kill
Before killing, OOM Killer prints a full memory snapshot to dmesg. Find it:
dmesg -T | grep -A 30 "Out of memory" | head -50
The output contains a table of all processes with their RSS at the moment of the event. This shows who actually consumed all the memory — it is not always the process that was killed.
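The table's rows look like `[ pid ] uid tgid total_vm rss ...`, with rss in 4 kB pages. A sketch that pulls out the biggest consumer — the column layout is assumed from typical kernel output, so treat it as a starting point:

```shell
# Find the largest RSS in the OOM process table (rss is the 5th field, in pages)
dmesg | grep -o '\[ *[0-9]*\] .*' | tr -d '[]' \
  | awk '{ if ($5 + 0 > max) { max = $5; pid = $1; name = $NF } }
         END { if (pid) printf "%s (pid %s): %d pages (~%d MB)\n", name, pid, max, max*4/1024 }'
```

If the name printed here differs from the one in the "Killed process" line, you have found the real culprit.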
Protect a Specific Process From Being Killed
Set oom_score_adj = -1000 — the kernel will never pick this process:
echo -1000 | sudo tee /proc/$(pgrep -o sshd)/oom_score_adj
A value written to /proc does not survive a restart of the process — for daemons, set it in the unit file.
For systemd services — add to the unit file:
sudo systemctl edit nginx
[Service]
OOMScoreAdjust=-900
Apply:
sudo systemctl daemon-reload
sudo systemctl restart nginx
-900 — very unlikely to be killed. -1000 — guaranteed not to be killed (use only for critical services like sshd).
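After the restart you can confirm the adjustment actually landed on the main process (MainPID comes back empty for an inactive service):

```shell
# Read back oom_score_adj of the service's main process
systemctl show nginx -p MainPID --value \
  | xargs -I{} cat /proc/{}/oom_score_adj
```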
Make a Process a Preferred Victim
For example a background worker that is not critical — let OOM Killer take it first:
echo 500 | sudo tee /proc/$(pgrep -o worker)/oom_score_adj
Or in a systemd unit:
[Service]
OOMScoreAdjust=500
System Setting: vm.overcommit_memory
By default Linux allows overcommit — processes can reserve more memory than physically exists, assuming they will not all use it at once. When actual usage exceeds real capacity — OOM Killer arrives.
Check the current mode:
cat /proc/sys/vm/overcommit_memory
Three modes:
- 0 — heuristic overcommit (default)
- 1 — allow any overcommit without limits
- 2 — disallow overcommit; processes get an error when allocating if memory is unavailable
Mode 2 is the most predictable for production servers: a process gets ENOMEM when trying to allocate memory rather than being killed unexpectedly. The trade-off: software that relies on large sparse allocations (JVMs, fork-based snapshots as in Redis BGSAVE) may start failing allocations under it.
Set mode 2:
sudo sysctl -w vm.overcommit_memory=2
Make it permanent:
echo "vm.overcommit_memory=2" | sudo tee -a /etc/sysctl.d/99-memory.conf
sudo sysctl -p /etc/sysctl.d/99-memory.conf
With mode 2 also configure vm.overcommit_ratio — the percentage of physical RAM counted toward the commit limit (all swap is counted in full on top of it):
echo "vm.overcommit_ratio=80" | sudo tee -a /etc/sysctl.d/99-memory.conf
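Under these settings the commit limit works out to all of swap plus overcommit_ratio percent of RAM; the kernel's own figure lives in /proc/meminfo (reserved hugepages, if any, are subtracted from it, so the two numbers can differ):

```shell
# Compare the kernel's CommitLimit with the formula: SwapTotal + MemTotal * ratio / 100
ratio=$(cat /proc/sys/vm/overcommit_ratio)
awk -v r="$ratio" '
    /^MemTotal:/    { m = $2 }
    /^SwapTotal:/   { s = $2 }
    /^CommitLimit:/ { k = $2 }
    END { printf "kernel: %d kB, formula: %d kB\n", k, s + m * r / 100 }
' /proc/meminfo
```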
vm.swappiness: Delay OOM via Swap
vm.swappiness controls how willingly the kernel moves idle anonymous pages out to swap: higher values swap earlier, buying headroom before OOM strikes; lower values keep everything in RAM until the last moment. Check the current value:
cat /proc/sys/vm/swappiness
Default is 60. On a VPS where swap sits on slow disk, 10–20 keeps it as a last resort; raise the value instead if you want swap actively used as an OOM buffer:
sudo sysctl -w vm.swappiness=10
echo "vm.swappiness=10" | sudo tee -a /etc/sysctl.d/99-memory.conf
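To confirm the kernel's actual swap behavior after the change, check current usage and watch the si/so columns of vmstat (kB swapped in/out per second):

```shell
# Current swap usage, then five one-second samples of swap-in/out activity
free -h | awk '/^Swap:/ {print "swap used:", $3, "of", $2}'
vmstat 1 5
```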
Limit Memory via cgroup (Don't Let One Service Kill the Server)
Instead of waiting for OOM — limit how much memory a specific service can use. Then OOM Killer only kills workers of that service, leaving everything else alone.
For a systemd service:
sudo systemctl edit php8.1-fpm
[Service]
MemoryMax=512M
MemorySwapMax=0
MemoryMax — hard RAM limit. MemorySwapMax=0 — forbid swap usage. When the limit is exceeded the cgroup triggers a local OOM inside the group and kills only processes belonging to that service.
View current memory consumption by service:
systemd-cgtop -m
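For a quick look without systemd-cgtop, the same numbers can be read straight from the cgroup tree — this sketch assumes the cgroup v2 unified hierarchy at /sys/fs/cgroup (memory.current is in bytes):

```shell
# Top services by current memory charge, straight from cgroup v2 files
for f in /sys/fs/cgroup/system.slice/*/memory.current; do
    printf "%12s  %s\n" "$(cat "$f" 2>/dev/null)" "$(basename "$(dirname "$f")")"
done | sort -rn | head -10
```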
Monitoring: Get an Alert Before It Blows Up
Script that sends a warning when available memory drops below a threshold:
#!/bin/bash
# Alert when the "available" column of free drops below THRESHOLD percent of total RAM
THRESHOLD=10
AVAILABLE=$(free | awk '/^Mem:/ {printf "%.0f", $7/$2*100}')
if [ "$AVAILABLE" -lt "$THRESHOLD" ]; then
echo "WARNING: only ${AVAILABLE}% RAM available on $(hostname)" | \
mail -s "Low memory alert" admin@example.com
fi
Add to cron every 5 minutes:
*/5 * * * * /usr/local/bin/memory-check.sh
Quick Reference
| Task | Command |
|---|---|
| Find OOM events | `sudo journalctl -k \| grep "Out of memory"` |
| OOM via dmesg | `dmesg -T \| grep -iE "oom\|killed process"` |
| Process oom_score | `cat /proc/PID/oom_score` |
| Protect process from kill | `echo -1000 \| sudo tee /proc/PID/oom_score_adj` |
| Protection via systemd | `OOMScoreAdjust=-900` in unit file |
| Overcommit mode | `cat /proc/sys/vm/overcommit_memory` |
| Disable overcommit | `sysctl -w vm.overcommit_memory=2` |
| Limit service RAM | `MemoryMax=512M` in systemd unit |
| Monitor by service | `systemd-cgtop -m` |
| Current swappiness | `cat /proc/sys/vm/swappiness` |