# Zero-Downtime Kubernetes Deployments: A Practical Guide
How to configure rolling updates, PodDisruptionBudgets, and readiness probes to achieve true zero-downtime deploys in Kubernetes.
## The problem with "zero-downtime" claims
Almost every Kubernetes tutorial claims zero-downtime deployments are easy — just use a RollingUpdate strategy and you're done. In practice, there are at least five ways you can still drop traffic even with rolling updates configured correctly.
This post covers the full picture: rolling update configuration, pod disruption budgets, readiness probes, preStop hooks, and graceful shutdown handling in your application.
## Rolling update configuration
The starting point is your Deployment's update strategy:
```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never take pods offline before new ones are ready
      maxSurge: 1         # allow one extra pod during rollout
```
Setting `maxUnavailable: 0` is the key: Kubernetes won't terminate an old pod until its replacement passes the readiness check.
## Readiness probes that actually work
A readiness probe that returns 200 too early is worse than no probe at all. Your app needs to be genuinely ready to handle traffic before the probe passes:
```yaml
readinessProbe:
  httpGet:
    path: /healthz/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1
```
The `/healthz/ready` endpoint should check that your app's dependencies (database connections, caches, config) are warm, not just that the HTTP server has started.
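As a sketch, the endpoint can aggregate individual dependency checks. The check callables below (a database ping, a cache read, whatever your app needs) are hypothetical stand-ins, not part of any real framework:

```python
# Minimal sketch of a dependency-aware readiness check. Each check is a
# callable that raises on failure (e.g. a db ping); names are placeholders.

def check_ready(checks):
    """Return (ready, failures): ready only if every dependency check passes."""
    failures = []
    for name, check in checks.items():
        try:
            check()
        except Exception as exc:
            failures.append(f"{name}: {exc}")
    return len(failures) == 0, failures

def ready_status(checks):
    """Map the readiness result to an HTTP status for the ready endpoint."""
    ok, failures = check_ready(checks)
    return (200, "ok") if ok else (503, "; ".join(failures))
```

Wire `ready_status` into whatever HTTP framework the app uses. Returning 503 while a dependency is down keeps the pod out of Service endpoints without restarting it, which is exactly the behavior you want from readiness (as opposed to liveness) probes.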
## PodDisruptionBudgets
Rolling update settings only govern disruptions caused by the rollout itself. Other voluntary disruptions, such as node drains during maintenance or cluster upgrades, need a PodDisruptionBudget:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2   # or use maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
```
Without a PDB, `kubectl drain` will evict every matching pod on the node at once; your rolling update settings don't apply to evictions.
## Graceful shutdown with preStop hooks
When a pod is deleted, Kubernetes sends SIGTERM to its containers and, in parallel, removes the pod from Service endpoints. These two steps are asynchronous, which creates a race: requests routed to the pod before the endpoint removal propagates to kube-proxy and load balancers will fail if the app has already exited.
The fix is a preStop hook that adds a short sleep:
```yaml
lifecycle:
  preStop:
    exec:
      command: ["sleep", "5"]
```
This keeps the pod alive for five seconds after Kubernetes begins removing it from the load balancer, giving the endpoint change time to propagate and existing connections time to drain. Two caveats: the container image must actually contain a `sleep` binary, and `terminationGracePeriodSeconds` (30 by default) must comfortably exceed the sleep plus your app's shutdown time.
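The preStop sleep only delays SIGTERM; the application still has to treat the signal as "finish what you have, then exit" rather than dying immediately. A minimal Python sketch, assuming a single-threaded serve loop where `server.handle_request` and `server.server_close` stand in for your framework's equivalents:

```python
import signal
import threading

# Flag flipped by the SIGTERM handler; the serve loop checks it between requests.
shutting_down = threading.Event()

def handle_sigterm(signum, frame):
    # Stop accepting new work; the current in-flight request still completes.
    shutting_down.set()

signal.signal(signal.SIGTERM, handle_sigterm)

def serve(server):
    # Handle one request at a time until asked to stop, then clean up.
    while not shutting_down.is_set():
        server.handle_request()
    server.server_close()
```

Most real frameworks provide this hook directly (for example, a server `shutdown` or `stop` method); the important part is that SIGTERM triggers a drain rather than an immediate exit.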
## Putting it all together
With all four pieces in place (`maxUnavailable: 0`, honest readiness probes, a PDB, and a preStop drain delay), plus an app that handles SIGTERM gracefully, a rolling deploy can complete without dropping a single connection.
The next step is validating this with chaos engineering: run a rolling deploy while continuously sending traffic, and measure error rates. Tools like k6 or hey make this straightforward.
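The measurement itself can be sketched in a few lines: hammer the Service while the rollout runs and report the fraction of failed requests. `send` below is any callable that raises on failure, for instance a `urllib.request.urlopen` against your Service URL; this is a toy stand-in for what k6 or hey do, not a replacement for them:

```python
import time

def measure_error_rate(send, duration_s=30.0, interval_s=0.1):
    """Call `send` repeatedly for duration_s seconds; return the failure rate."""
    ok = errors = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        try:
            send()
            ok += 1
        except Exception:
            errors += 1
        time.sleep(interval_s)
    total = ok + errors
    return errors / total if total else 0.0
```

Run it while `kubectl rollout restart deployment/my-app` is in progress; with the configuration above, the measured rate should stay at zero.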