Kubernetes CrashLoopBackOff:
What it means and how to fix it fast
Most engineers spend 20–30 minutes on a CrashLoopBackOff running the same 4 kubectl commands and Googling error messages. Here is a systematic approach to cut that to under 2 minutes.
What CrashLoopBackOff actually means
A container in CrashLoopBackOff is stuck in a loop: it starts, crashes, Kubernetes tries to restart it, it crashes again. Kubernetes applies exponential backoff between restarts to avoid hammering a broken container: 10s, 20s, 40s, 80s, 160s, then 300s (5 minutes) indefinitely.
The status CrashLoopBackOff means Kubernetes has detected the loop and is applying backoff. It does not mean the fix is to delete and recreate the pod — the underlying issue will persist.
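The delay sequence above can be reproduced with a few lines of shell; the 10s starting delay, doubling factor, and 300s cap match Kubernetes' documented defaults:

```shell
# Reproduce the restart-delay sequence: start at 10s, double, cap at 300s.
delay=10
for attempt in 1 2 3 4 5 6 7; do
  echo "restart $attempt: waiting ${delay}s"
  delay=$((delay * 2))
  if [ "$delay" -gt 300 ]; then
    delay=300
  fi
done
# prints 10, 20, 40, 80, 160, then 300 for every restart after that
```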
[Figure: backoff timing]
The 4-command loop every SRE knows
The standard debugging sequence for CrashLoopBackOff — in the order you should run them:
kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
kubectl logs <pod-name> -n <namespace> -c <container>

After these four commands, you cross-reference the output with Grafana, check Slack history for recent deploys, and Google the specific error. Average time: 20–40 minutes.
Decode the exit code first
The exit code in kubectl describe pod narrows the diagnosis before you read a single log line. Find it under Last State > Exit Code.
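As a quick reference, the common exit codes map to causes like this. The sketch below is a hypothetical helper, not part of kubectl; the mappings themselves are standard (codes above 128 are 128 plus the signal number):

```shell
# Hypothetical helper: map a container exit code to its most likely cause.
decode_exit() {
  case "$1" in
    0)   echo "clean exit -- check command/args" ;;
    1)   echo "application error -- read the logs" ;;
    127) echo "command not found -- bad entrypoint" ;;
    137) echo "SIGKILL, usually OOMKilled" ;;
    139) echo "SIGSEGV -- segfault in the binary" ;;
    143) echo "SIGTERM -- killed during shutdown or probe failure" ;;
    *)   echo "exit code $1 -- if >128, signal number is $(( $1 - 128 ))" ;;
  esac
}

decode_exit 137   # prints: SIGKILL, usually OOMKilled
```

You can also pull the code directly with `kubectl get pod <name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'`.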
The 7 most common CrashLoopBackOff causes
Missing environment variable
The most common cause. App starts, tries to read DB_URL or API_KEY, finds nil, and crashes. Fix: verify all required env vars are set in the deployment spec or referenced secret.
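A minimal sketch of wiring a required variable from a Secret (the names DB_URL, db-credentials, and url are illustrative placeholders):

```yaml
env:
  - name: DB_URL              # required by the app at startup
    valueFrom:
      secretKeyRef:
        name: db-credentials  # hypothetical Secret name
        key: url
```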
kubectl describe pod <name> | grep -A5 Environment

OOMKilled — memory limit too low
Exit code 137. The app's peak memory usage exceeds its limit. Fix: run without limits briefly, measure peak usage with kubectl top pod, set limits 20% above peak.
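Once you have measured the peak, the fix is a resources block like this (values are illustrative; set the limit roughly 20% above the peak you observed):

```yaml
resources:
  requests:
    memory: "384Mi"   # what the scheduler reserves
  limits:
    memory: "512Mi"   # ~20% above measured peak usage
```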
kubectl describe pod <name> | grep -A3 "Last State"

Liveness probe too aggressive
The liveness probe fires before the app is ready, fails, kills the container, and the cycle repeats. Fix: add initialDelaySeconds to give the app time to start. Typical Java apps need 45–90s.
livenessProbe:
  httpGet:            # a probe handler is required; /healthz is an example path
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10

Dependency not ready at startup
App starts before its database or upstream service is available. Fix: add an init container that waits for the dependency, or implement retry logic in the app itself.
initContainers:
  - name: wait-for-db
    image: busybox
    command: ['sh', '-c', 'until nc -z db 5432; do sleep 2; done']

Bad image tag
Image pulled successfully but is the wrong version — incompatible binary or missing files. Fix: check the image tag in the deployment spec, verify the registry has the correct image, use digest pinning for critical deployments.
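Digest pinning looks like this in the pod spec (container name and registry are placeholders; substitute your image's actual digest):

```yaml
containers:
  - name: app                                        # hypothetical container name
    image: registry.example.com/app@sha256:<digest>  # pinned by digest, not by mutable tag
```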
kubectl describe pod <name> | grep Image

Volume mount or permission issue
App tries to write to a mounted volume but doesn't have permission. Fix: check the securityContext, ensure the volume is mounted at the right path, and that the owning UID matches.
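A securityContext sketch for the common case where the app runs as a non-root UID (1000 is an example; match it to the UID that owns the files):

```yaml
securityContext:
  runAsUser: 1000   # UID the container process runs as
  fsGroup: 1000     # group ownership Kubernetes applies to mounted volumes
```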
kubectl exec -it <pod> -- ls -la /data

(exec only works while the container is briefly up between restarts, so you may need a few attempts)

Application bug (exit code 1)
The app has a runtime exception — null pointer, type error, unhandled panic. Fix: read the crash logs carefully. The last few lines before exit almost always contain the stack trace.
kubectl logs <name> --previous | tail -50

Diagnosing CrashLoopBackOff with ActivLayer
Instead of running four kubectl commands and cross-referencing multiple sources, you describe the problem in plain English:
Reading previous container logs...
Checking resource limits and usage history...
Root cause: OOMKilled. The container hit its 256Mi memory limit
during peak order processing (Black Friday traffic spike).
Peak usage: 312Mi. Limit: 256Mi.
Proposed fix: Increase memory limit to 512Mi
kubectl patch deployment payment-service -p ...
[dry-run preview] [approve and apply] [dismiss]
ActivLayer reads the pod logs, event history, and resource metrics automatically, identifies the root cause, and generates a specific remediation command. You approve the dry-run preview and the fix applies — all within your terminal.
Diagnose your next CrashLoopBackOff in under 15 seconds
Connect your cluster in 5 minutes. Free tier, no credit card, no demo call.