Docker Container Keeps Restarting: A Restart-Loop Decision Tree (2026)
Fix It With a Tool
Validate runtime and compose inputs first
Use Docker validators before restart-policy changes so you isolate startup defects instead of masking them.
Run this before each redeploy: validate your compose file with Docker Compose Validator, then follow this decision tree for runtime failures.

- If Compose fails with "mapping values are not allowed here", run the mapping-values YAML recovery path before retrying.
- Seeing restart-loop logs plus Nginx 502 in the same incident? Use the Docker+Nginx cause taxonomy first.
- If the container stabilizes but Nginx still logs "upstream timed out", move to the upstream-timeout fix paths.

For adjacent diagnostics, see Docker Container Exits Immediately and Docker Multi-Stage Builds.
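As a local complement to the validator, Compose can parse and validate its own file before any container starts (a quick sketch; run it from the directory containing your compose file):

```shell
# Validate the compose file; --quiet suppresses output and only sets
# the exit code, which makes this easy to use in CI or a pre-deploy hook.
docker compose config --quiet && echo "compose file is valid"

# Render the fully-interpolated config to inspect variable substitution.
docker compose config
```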
Restart loops are expensive because they hide the first failure. This decision tree keeps the flow deterministic: identify the symptom, run one focused check, apply one fix, then verify the restart behavior changed.
Table of contents
1. Docker container keeps restarting: symptom -> check -> fix tree
Symptom A: container exits in less than 10 seconds, then restarts
Check: docker ps -a --no-trunc and docker logs <container> for the first non-zero exit.
Fix: Correct CMD/ENTRYPOINT path, working directory, or missing startup arguments. Re-run once without restart policy to confirm clean startup.
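The Symptom A check-and-fix loop might look like this (container and image names are placeholders to adapt):

```shell
# Find the failing container and its exit code without truncated output.
docker ps -a --no-trunc --filter name=myapp

# Read the logs from the crashed container, including the first failure.
docker logs myapp 2>&1 | head -50

# Re-run once in the foreground with the default (no) restart policy,
# so the first failure is visible instead of hidden by restarts.
docker run --rm myimage:tag
```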
Symptom B: exit code 137 or OOMKilled in inspect output
Check: docker inspect <container> --format '{{.State.OOMKilled}} {{.State.ExitCode}}'
Fix: Raise memory limit, lower runtime memory footprint, or adjust JVM/Node/Python memory flags to fit container limits.
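To confirm an OOM kill and retest under a larger limit, the checks might look like this (the limit and flag values are illustrative, not recommendations):

```shell
# "true 137" together confirm the kernel OOM-killed the process.
docker inspect myapp --format '{{.State.OOMKilled}} {{.State.ExitCode}}'

# Retest with an explicit, larger memory limit.
docker run --rm --memory 512m myimage:tag

# For JVM apps, cap the heap below the container limit so the runtime
# fails gracefully instead of being killed (illustrative values).
docker run --rm --memory 512m -e JAVA_OPTS="-Xmx384m" myimage:tag
```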
Symptom C: healthcheck fails and orchestrator keeps recycling container
Check: docker inspect <container> --format '{{json .State.Health}}' and review health endpoint startup timing.
Fix: Increase startup grace period, make healthcheck command lightweight, and avoid depending on downstream services in readiness probes.
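A healthcheck with a generous startup grace period can be declared at run time; the endpoint, port, and timings below are assumptions to adapt:

```shell
# --health-start-period gives the app time to boot before failed
# probes count against the retry budget.
docker run -d --name myapp \
  --health-cmd 'curl -fsS http://localhost:8080/healthz || exit 1' \
  --health-interval 10s \
  --health-timeout 3s \
  --health-retries 3 \
  --health-start-period 30s \
  myimage:tag

# Watch the health state transition: starting -> healthy (or unhealthy).
docker inspect myapp --format '{{json .State.Health}}'
```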
Symptom D: app starts only when run manually in shell
Check: docker run --rm -it --entrypoint sh <image:tag> then execute startup command by hand.
Fix: Replace shell-fragile startup scripts with explicit exec-form commands and verify executable permissions in the image.
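To reproduce Symptom D, drop into a shell, run the startup command by hand, and check the executable bits (image and script paths are placeholders):

```shell
# Bypass the entrypoint and get an interactive shell in the image.
docker run --rm -it --entrypoint sh myimage:tag

# Inside the container, run the startup command manually:
#   ls -l /app/start.sh    # confirm the executable bit is set
#   /app/start.sh          # does it start cleanly by hand?
#
# If it works here but not under the image's CMD, prefer exec form
# in the Dockerfile, which avoids shell word-splitting surprises:
#   CMD ["/app/start.sh"]
```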
Avoid reaching for restart: always first. Capture first-failure logs before enabling aggressive restart behavior.
2. Restart policy comparison table (use after root-cause fix)
| Policy | Behavior | When to use | Risk if misused |
|---|---|---|---|
| no | No automatic restart | Debug sessions and one-off jobs | Service stays down after crash |
| on-failure[:N] | Restart only on non-zero exits | Batch jobs and worker retries | Can mask recurring app defects |
| always | Always restart after stop/daemon restart | Critical services needing max uptime | Crash loops become noisy and expensive |
| unless-stopped | Restart except when manually stopped | Long-running services with explicit operator control | Unexpected restarts after host reboot if not intentionally stopped |
3. Verification checklist after applying the fix
- Run container once with restart policy disabled.
- Confirm startup logs show successful init and no immediate exit.
- Apply the chosen restart policy.
- Restart Docker daemon or container and confirm expected behavior.
- Document the failure signature and fix in your runbook.
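The first three checklist steps can be scripted as a smoke test; the container name and the 10-second stability window are assumptions, not fixed values:

```shell
#!/bin/sh
# Run once with the default (no) restart policy and check that the
# container survives a short stability window before trusting it.
docker run -d --name verify-run myimage:tag
sleep 10

STATUS=$(docker inspect verify-run --format '{{.State.Status}}')
docker logs verify-run | tail -20
docker rm -f verify-run

if [ "$STATUS" = "running" ]; then
  echo "startup OK: safe to apply the chosen restart policy"
else
  echo "container is $STATUS: fix startup before setting a policy" >&2
  exit 1
fi
```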
FAQ
Why does a restart loop continue after I changed env vars?
The running container may still carry old values from compose override files, a cached image, or a container that was restarted rather than recreated. Rebuild the image or recreate the container, then verify the environment the process actually received.
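To confirm which env values a container actually received versus what compose would inject (container and variable names are placeholders):

```shell
# Show the environment baked into the running container.
docker inspect myapp --format '{{range .Config.Env}}{{println .}}{{end}}'

# Compare with what compose resolves after interpolation and overrides.
docker compose config | grep -A5 environment
```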
Should I debug with compose or raw docker run first?
Start with raw docker run to isolate image behavior, then move back to compose-level networking and dependencies.
Can restart policies fix dependency race conditions?
Only partially. Dependency readiness should be handled with healthchecks, wait strategies, or retry logic in app startup.
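A minimal wait-and-retry wrapper for app startup might look like the sketch below; the dependency command and limits are placeholders, and because it is plain POSIX shell it works inside any container entrypoint:

```shell
#!/bin/sh
# wait_for CMD...: retry CMD up to MAX_TRIES times with a fixed delay.
# Returns 0 on the first success, 1 after exhausting all attempts.
wait_for() {
  max_tries=${MAX_TRIES:-5}
  delay=${RETRY_DELAY:-2}
  i=1
  while [ "$i" -le "$max_tries" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$max_tries failed; retrying in ${delay}s" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example: block until a TCP dependency answers, then exec the app.
# wait_for nc -z db 5432 && exec /app/start.sh
```

This keeps the restart policy as a safety net rather than the dependency-ordering mechanism: the app itself waits for its dependencies instead of crashing and relying on Docker to retry.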