Docker Container Keeps Restarting: A Restart-Loop Decision Tree (2026)
Fix It With a Tool
Validate runtime and compose inputs first
Use Docker validators before restart-policy changes so you isolate startup defects instead of masking them.
Run this before each redeploy: validate your compose file with Docker Compose Validator, then follow this decision tree for runtime failures.

- If Compose fails with "mapping values are not allowed here", run the mapping-values YAML recovery path before retrying.
- Seeing restart-loop logs plus Nginx 502 in the same incident? Use the Docker+Nginx cause taxonomy first.
- If the container stabilizes but Nginx still logs "upstream timed out", move to the upstream-timeout fix paths.

For adjacent diagnostics, see Docker Container Exits Immediately and Docker Multi-Stage Builds.
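As a local complement to the validator, Compose can parse and validate its own file before any container starts (a quick sketch; run it from the directory containing your compose file):

```shell
# Validate the compose file; --quiet suppresses output and only sets
# the exit code, which makes this easy to use in CI or a pre-deploy hook.
docker compose config --quiet && echo "compose file is valid"

# Render the fully-interpolated config to inspect variable substitution.
docker compose config
```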
Restart loops are expensive because they hide the first failure. This decision tree keeps the flow deterministic: identify the symptom, run one focused check, apply one fix, then verify the restart behavior changed.
Table of contents
1. Docker container keeps restarting: symptom -> check -> fix tree
Symptom A: container exits in less than 10 seconds, then restarts
Check: docker ps -a --no-trunc and docker logs <container> for the first non-zero exit.
Fix: Correct CMD/ENTRYPOINT path, working directory, or missing startup arguments. Re-run once without restart policy to confirm clean startup.
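The Symptom A check-and-fix loop might look like this (container and image names are placeholders to adapt):

```shell
# Find the failing container and its exit code without truncated output.
docker ps -a --no-trunc --filter name=myapp

# Read the logs from the crashed container, including the first failure.
docker logs myapp 2>&1 | head -50

# Re-run once in the foreground with the default (no) restart policy,
# so the first failure is visible instead of hidden by restarts.
docker run --rm myimage:tag
```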
Symptom B: exit code 137 or OOMKilled in inspect output
Check: docker inspect <container> --format '{{.State.OOMKilled}} {{.State.ExitCode}}'
Fix: Raise memory limit, lower runtime memory footprint, or adjust JVM/Node/Python memory flags to fit container limits.
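To confirm an OOM kill and retest under a larger limit, the checks might look like this (the limit and flag values are illustrative, not recommendations):

```shell
# "true 137" together confirm the kernel OOM-killed the process.
docker inspect myapp --format '{{.State.OOMKilled}} {{.State.ExitCode}}'

# Retest with an explicit, larger memory limit.
docker run --rm --memory 512m myimage:tag

# For JVM apps, cap the heap below the container limit so the runtime
# fails gracefully instead of being killed (illustrative values).
docker run --rm --memory 512m -e JAVA_OPTS="-Xmx384m" myimage:tag
```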
Symptom C: healthcheck fails and orchestrator keeps recycling container
Check: docker inspect <container> --format '{{json .State.Health}}' and review health endpoint startup timing.
Fix: Increase startup grace period, make healthcheck command lightweight, and avoid depending on downstream services in readiness probes.
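A healthcheck with a generous startup grace period can be declared at run time; the endpoint, port, and timings below are assumptions to adapt:

```shell
# --health-start-period gives the app time to boot before failed
# probes count against the retry budget.
docker run -d --name myapp \
  --health-cmd 'curl -fsS http://localhost:8080/healthz || exit 1' \
  --health-interval 10s \
  --health-timeout 3s \
  --health-retries 3 \
  --health-start-period 30s \
  myimage:tag

# Watch the health state transition: starting -> healthy (or unhealthy).
docker inspect myapp --format '{{json .State.Health}}'
```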
Symptom D: app starts only when run manually in shell
Check: docker run --rm -it --entrypoint sh <image:tag> then execute startup command by hand.
Fix: Replace shell-fragile startup scripts with explicit exec-form commands and verify executable permissions in the image.
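To reproduce Symptom D, drop into a shell, run the startup command by hand, and check the executable bits (image and script paths are placeholders):

```shell
# Bypass the entrypoint and get an interactive shell in the image.
docker run --rm -it --entrypoint sh myimage:tag

# Inside the container, run the startup command manually:
#   ls -l /app/start.sh    # confirm the executable bit is set
#   /app/start.sh          # does it start cleanly by hand?
#
# If it works here but not under the image's CMD, prefer exec form
# in the Dockerfile, which avoids shell word-splitting surprises:
#   CMD ["/app/start.sh"]
```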
Avoid reaching for restart: always first. Capture first-failure logs before enabling aggressive restart behavior.
2. Restart policy comparison table (use after root-cause fix)
| Policy | Behavior | When to use | Risk if misused |
|---|---|---|---|
| no | No automatic restart | Debug sessions and one-off jobs | Service stays down after crash |
| on-failure[:N] | Restart only on non-zero exits | Batch jobs and worker retries | Can mask recurring app defects |
| always | Always restart after stop/daemon restart | Critical services needing max uptime | Crash loops become noisy and expensive |
| unless-stopped | Restart except when manually stopped | Long-running services with explicit operator control | Unexpected restarts after host reboot if not intentionally stopped |
3. Verification checklist after applying the fix
- Run container once with restart policy disabled.
- Confirm startup logs show successful init and no immediate exit.
- Apply the chosen restart policy.
- Restart Docker daemon or container and confirm expected behavior.
- Document the failure signature and fix in your runbook.
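The first three checklist steps can be scripted as a smoke test; the container name and the 10-second stability window are assumptions, not fixed values:

```shell
#!/bin/sh
# Run once with the default (no) restart policy and check that the
# container survives a short stability window before trusting it.
docker run -d --name verify-run myimage:tag
sleep 10

STATUS=$(docker inspect verify-run --format '{{.State.Status}}')
docker logs verify-run | tail -20
docker rm -f verify-run

if [ "$STATUS" = "running" ]; then
  echo "startup OK: safe to apply the chosen restart policy"
else
  echo "container is $STATUS: fix startup before setting a policy" >&2
  exit 1
fi
```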
FAQ
Why does a restart loop continue after I changed env vars?
The running container may still carry old values from compose override files, a cached image, or a container that was restarted rather than recreated. Rebuild the image or recreate the container, then verify the environment the process actually received.
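To confirm which env values a container actually received versus what compose would inject (container and variable names are placeholders):

```shell
# Show the environment baked into the running container.
docker inspect myapp --format '{{range .Config.Env}}{{println .}}{{end}}'

# Compare with what compose resolves after interpolation and overrides.
docker compose config | grep -A5 environment
```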
Should I debug with compose or raw docker run first?
Start with raw docker run to isolate image behavior, then move back to compose-level networking and dependencies.
Can restart policies fix dependency race conditions?
Only partially. Dependency readiness should be handled with healthchecks, wait strategies, or retry logic in app startup.
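A minimal wait-and-retry wrapper for app startup might look like the sketch below; the dependency command and limits are placeholders, and because it is plain POSIX shell it works inside any container entrypoint:

```shell
#!/bin/sh
# wait_for CMD...: retry CMD up to MAX_TRIES times with a fixed delay.
# Returns 0 on the first success, 1 after exhausting all attempts.
wait_for() {
  max_tries=${MAX_TRIES:-5}
  delay=${RETRY_DELAY:-2}
  i=1
  while [ "$i" -le "$max_tries" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$max_tries failed; retrying in ${delay}s" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example: block until a TCP dependency answers, then exec the app.
# wait_for nc -z db 5432 && exec /app/start.sh
```

This keeps the restart policy as a safety net rather than the dependency-ordering mechanism: the app itself waits for its dependencies instead of crashing and relying on Docker to retry.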