Zero-Downtime Deploys & Rollbacks
Every deployment on Stackpad is a zero-downtime deployment. If a new version fails its health check, Stackpad automatically rolls back to the previous working version. This page explains exactly how that works.
Zero-downtime deployments
When a new version is deployed, Stackpad follows a blue-green deployment strategy:
- The new container starts alongside the old one
- Stackpad runs a health check against the new container
- If the health check passes, Caddy switches traffic to the new container
- The old container is stopped after traffic drains
At no point are both containers stopped — your users always hit a running version.
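As a rough illustration of the sequence above, the following Python sketch models the switch-over in memory. `Router`, `Deployer`, and the `health_check` callback are hypothetical names chosen for this example, not part of Stackpad's API, and traffic draining is not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Router:
    """Stand-in for the proxy: records which container receives traffic."""
    live: Optional[str] = None


@dataclass
class Deployer:
    router: Router
    running: List[str] = field(default_factory=list)

    def deploy(self, new: str, health_check: Callable[[str], bool]) -> bool:
        old = self.router.live
        self.running.append(new)        # new container starts alongside the old one
        if not health_check(new):       # health check runs against the new container
            self.running.remove(new)    # failed: stop the new one, old keeps serving
            return False
        self.router.live = new          # passed: traffic switches to the new container
        if old is not None:
            self.running.remove(old)    # old container is stopped last
        return True
```

In this model the `running` list is never empty while a deployment is in progress, which is the property the paragraph above describes.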
Health checks in detail
Health checks are how Stackpad decides whether a new deployment is working. The behavior depends on the service type:
Web services (HTTP check)
Stackpad sends a GET request to the service’s configured port (e.g. port 3000). The check passes if:
- The port is accepting connections AND
- The response status code is 2xx or 3xx
The check hits your application root; it doesn’t use a special /health endpoint. Any 2xx or 3xx response from the root path means the service is healthy, while a 4xx or 5xx response fails the check.
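To try an equivalent check against your own service before deploying, a minimal sketch along these lines may help. It is an approximation of the rule described above, not Stackpad's actual implementation; the function names and the 5-second per-request timeout are assumptions. `http.client` is used because it does not follow redirects, so a 3xx response is seen as-is.

```python
import http.client


def status_is_healthy(status: int) -> bool:
    """2xx and 3xx count as healthy; everything else does not."""
    return 200 <= status < 400


def http_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Send one GET / request; False on connection failure or a non-2xx/3xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        return status_is_healthy(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

For example, `http_check("localhost", 3000)` would return `False` if the root route of a locally running app answers with a 404.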
Non-web services (TCP check)
For databases, caches, and background services, Stackpad verifies that the configured port is accepting TCP connections. No HTTP request is sent.
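A TCP check of this kind can be approximated in a few lines of Python. This is a sketch under the assumption that "accepting TCP connections" means a successful connect; the function name and the 5-second timeout are illustrative.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```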
Timeout
Health checks have a 90-second timeout. Stackpad retries the check during this window. If the check has not passed within 90 seconds, the deployment is marked as failed.
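The retry window can be pictured as a deadline loop. The sketch below is illustrative only: the 2-second interval between attempts is an assumption (the retry cadence is not documented here), and `wait_until_healthy` is a hypothetical helper. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout: float = 90.0,
    interval: float = 2.0,  # assumed retry interval, not a documented value
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry check() until it passes or the timeout window has elapsed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # window exhausted: the deployment would be marked failed
        sleep(min(interval, remaining))
```

One consequence of a fixed window is that an app needing more than 90 seconds to start fails the check even though it would eventually come up.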
Automatic rollbacks
If a health check fails, Stackpad automatically:
- Keeps the old container running — your users keep seeing the previous version
- Stops the new container — the failed version is removed
- Marks the deployment as “Failed” — visible in the dashboard
- Preserves build logs — you can view what went wrong
No manual intervention is needed. Your users never see an error page.
When deployments fail
A deployment can fail at several stages. Here’s how to diagnose each:
Build failure
The code failed to compile or the Docker image couldn’t be built.
How to debug:
- Go to the Deployments tab on your service or project
- Click the failed deployment
- Read the build logs — they show the full output of the build process
Common causes:
- TypeScript compilation errors
- Missing dependencies
- Incorrect build command
- Build exceeds the 10-minute timeout (large monorepos, slow installs)
Health check failure
The build succeeded but the application didn’t start correctly.
How to debug:
- Check the build logs — the build itself succeeded, so look for runtime errors
- Check the service Logs tab — if the container started briefly, it may have logged errors before crashing
Common causes:
- Missing environment variable (e.g. DATABASE_URL not set)
- Port mismatch (app listens on 8080 but service is configured for 3000)
- Application crash on startup (unhandled exception)
- Startup takes longer than 90 seconds
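One way to make a missing environment variable easier to spot in the logs is to validate configuration at startup and exit with a clear message. The sketch below is a suggestion for application code, not something Stackpad requires; `missing_env` and `check_startup` are hypothetical helpers, and `DATABASE_URL` is simply the example variable mentioned above.

```python
import os
import sys
from typing import List, Mapping, Sequence


def missing_env(required: Sequence[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not env.get(name)]


def check_startup(required: Sequence[str], env: Mapping[str, str] = os.environ) -> None:
    """Exit with a readable log line instead of crashing later with a stack trace."""
    missing = missing_env(required, env)
    if missing:
        print(f"startup aborted: missing environment variables: {', '.join(missing)}",
              file=sys.stderr)
        sys.exit(1)
```

Calling `check_startup(["DATABASE_URL"])` as the first step of the application's entry point would leave a one-line explanation in the service logs before the container exits.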
Deploy failure
Rare — the image was built but couldn’t be pulled or started on the compute node.
How to debug: Check the deployment status in the dashboard. If this happens repeatedly, it’s likely an infrastructure issue — contact support.
Manual rollbacks
To roll back to a previous version:
- Go to the Deployments tab on your service
- Find a previous successful deployment (status: Ready)
- Click Redeploy to restore that version
The redeploy goes through the same zero-downtime process: the restored version is health-checked, and traffic is switched only after it’s confirmed working.
Deployment status reference
| Status | Meaning | What to do |
|---|---|---|
| Queued | Waiting for a build slot | Wait — max 4 concurrent builds |
| Building | Cloning repo, installing deps, building | Wait — 10 min timeout |
| Deploying | Starting container, running health check | Wait — 90 sec timeout |
| Ready | Live and serving traffic | Nothing — it’s working |
| Failed | Build, deploy step, or health check failed | Check build logs and service logs |
| Stopped | Manually stopped | Redeploy to restart |
What’s next?
- Git push deploy — understand the full deployment pipeline
- Logging — view build and runtime logs
- Troubleshooting — common issues and how to fix them