Kubernetes OOMKilled:
Root Cause, Fix, and Prevention
OOMKilled (exit code 137) means your container hit its memory limit and the kernel killed it. Here is how to confirm the diagnosis, find the root cause, and stop it from recurring.
What OOMKilled actually means
Kubernetes enforces memory limits via Linux cgroups. When a container's memory usage exceeds resources.limits.memory, the kernel's OOM (Out Of Memory) killer fires. It sends SIGKILL (signal 9) to the container process — there is no graceful shutdown, no SIGTERM, no cleanup.
The resulting exit code is 137 (128 + signal 9). If the pod's restart policy allows restarts, a container that keeps getting killed ends up in CrashLoopBackOff.
kubectl describe pod <pod-name> -n <namespace>
# Look for:
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
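You can also pull the termination reason directly, or look at the previous container instance's logs to see what it was doing right before the kill. A quick sketch, using the same placeholders as above:
# Print the last termination reason for each container in the pod
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
# Logs from the previous (killed) container instance
kubectl logs <pod-name> -n <namespace> --previous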
Why containers get OOMKilled
Memory limit set too low
The most common cause. The limit was set at deployment time based on a guess, and actual peak usage exceeds it — especially under load.
Measure real usage first. Deploy to staging with no limits, run realistic load, check peak usage with kubectl top pod, then set limits 20-30% above peak.
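A one-off kubectl top reading can miss the peak, so sample it on an interval while the load test runs. A rough sketch; the app=web label and 15-second interval are placeholders:
# Sample memory usage every 15s during the load test and keep the output;
# the highest value you see is the peak to size limits against.
while true; do
  kubectl top pod -n <namespace> -l app=web --containers
  sleep 15
done | tee memory-samples.txt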
Traffic spike beyond designed capacity
The app works fine at normal load but OOMKills during traffic spikes. Memory usage scales with concurrent requests and your limits don't account for peaks.
Set limits based on maximum expected load, not average. Consider horizontal pod autoscaling so new pods absorb traffic before existing ones hit limits.
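The quickest way to add autoscaling is a CPU-based HorizontalPodAutoscaler from the command line; scaling on memory instead requires an autoscaling/v2 HPA manifest. The deployment name web and the thresholds below are placeholders:
# Scale between 3 and 10 replicas, targeting 70% average CPU utilization
kubectl autoscale deployment web -n <namespace> --min=3 --max=10 --cpu-percent=70
# Watch how the autoscaler tracks load
kubectl get hpa -n <namespace> --watch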
Memory leak in application code
Memory usage grows steadily over time and never stabilizes. The garbage collector cannot reclaim objects because references are held. Raising the limit only delays the next kill.
Profile the application to find the leak. In JVM apps, take a heap dump. In Go apps, use pprof. In Node.js, use --inspect and Chrome DevTools. Fix the leak in code.
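For a JVM service, for example, you can take the heap dump straight from the running pod and analyze it locally. This assumes the image ships JDK tooling (jmap), the JVM runs as PID 1, and the container is named app:
# Dump the live heap from inside the running container (PID 1 = the JVM)
kubectl exec <pod-name> -n <namespace> -c app -- \
  jmap -dump:live,format=b,file=/tmp/heap.hprof 1
# Copy the dump out and open it in Eclipse MAT or VisualVM
kubectl cp <namespace>/<pod-name>:/tmp/heap.hprof ./heap.hprof -c app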
JVM heap + native memory exceeding limit
Java and Scala apps have both heap (-Xmx) and native memory. If -Xmx is close to the container limit, native memory pushes total usage over the limit even if the heap looks fine.
Set -Xmx to 75% of the memory limit to leave headroom for native memory, JVM metaspace, and thread stacks.
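One low-friction way to apply this is via JAVA_TOOL_OPTIONS, so no image rebuild is needed. -XX:MaxRAMPercentage (JDK 10+, backported to 8u191) sizes the heap relative to the container's memory limit; the deployment name web is a placeholder:
# Cap the heap at 75% of the container's memory limit
kubectl set env deployment/web -n <namespace> \
  JAVA_TOOL_OPTIONS="-XX:MaxRAMPercentage=75.0"
# Equivalent explicit setting for a 512Mi limit: JAVA_TOOL_OPTIONS="-Xmx384m"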
Sidecar or init container consuming memory
In multi-container pods, every container counts toward the pod's total memory footprint, and each container is enforced against its own limit. A logging sidecar or monitoring agent that uses memory can be the container that gets killed.
Check kubectl top pod --containers to see per-container usage. Set separate limits per container that account for all sidecars.
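kubectl set resources can target a single container, so the sidecar gets its own budget without touching the main container. The names web and log-shipper are placeholders:
# Give just the logging sidecar its own request and limit
kubectl set resources deployment web -n <namespace> -c log-shipper \
  --requests=memory=64Mi --limits=memory=128Mi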
How to measure real memory usage
# Per pod
kubectl top pod <name> -n <namespace>
# Per container within pod
kubectl top pod <name> -n <namespace> --containers
# 1. Run without limits under max load
# 2. Record peak usage
# 3. Set limits:
memory request = avg_steady_state * 1.0
memory limit = peak_usage * 1.25
# Example: peak = 400Mi
# request: 256Mi, limit: 512Mi
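To apply the values from that worked example without editing YAML, kubectl set resources updates the workload in place; the deployment name web is a placeholder:
# Apply the example sizing: request 256Mi, limit 512Mi
kubectl set resources deployment web -n <namespace> \
  --requests=memory=256Mi --limits=memory=512Mi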
# Alert when container uses >80% of its memory limit
container_memory_working_set_bytes{container!=""}
/
container_spec_memory_limit_bytes{container!=""}
> 0.80
Diagnosing OOMKilled with ActivLayer
ActivLayer correlates the OOMKill event with your metrics history, identifies whether it was a one-time spike or a steady leak, and proposes a specific fix — including the exact memory limit value to set.
Correlating with traffic and deploy events...
Pattern: Memory grows linearly — 8MB/hour increase.
This is a memory leak, not a limit misconfiguration.
Correlation: Leak started after deploy at 14:23 on Apr 24.
Commit: 3e4f891 — added in-memory session cache without TTL.
Recommended actions (in order):
1. Increase limit to 1Gi as temporary relief
2. Revert to pre-Apr-24 version or add TTL to session cache
[apply temporary limit increase] [see revert options] [dismiss]
Get root cause and a specific fix in seconds
Free Community tier — connect your cluster in 5 minutes.