Kubernetes on a single k3s node - press the button and watch the load drive autoscaling, live.
Active app pods
- / 6
CPU vs 70% target
-%
Redis counter
-
Load & autoscaling (last ~90s) - chart series: CPU %, pods, 70% target
Pressing the button sends many concurrent /burn requests, which makes the app pods work hard.
Watch CPU spike past the 70% line on the chart, and the pod count rise behind it as the autoscaler reacts.
After load stops, the pods scale back to 2 in about 30-60s (this HPA's scale-down is tuned for the demo; by default Kubernetes waits through a cautious 5-minute stabilization window before scaling down, to avoid flapping). Everything runs on one node, so the pod count is also capped by what that node can hold.
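The scale-down delay can be pictured with a minimal sketch. It assumes the documented HPA behavior of acting on the highest replica recommendation seen inside the stabilization window; the function name, the sample history, and the 30-second window are hypothetical values for illustration, not this demo's actual settings.

```python
def stabilized_replicas(recommendations, now, window_seconds):
    """Return the replica count to act on when scaling down.

    recommendations: list of (timestamp_seconds, desired_replicas) pairs.
    Only recommendations made within the last `window_seconds` count,
    and the highest of them wins, which is what delays scale-down.
    """
    recent = [replicas for ts, replicas in recommendations
              if now - ts <= window_seconds]
    if not recent:
        raise ValueError("no recommendations inside the window")
    return max(recent)


# Hypothetical history: load stops shortly after t=0, recommendations drop.
history = [(0, 6), (15, 4), (30, 2), (45, 2), (60, 2)]

short_window = stabilized_replicas(history, now=45, window_seconds=30)
default_window = stabilized_replicas(history, now=45, window_seconds=300)
settled = stabilized_replicas(history, now=60, window_seconds=30)
```

With the assumed 30-second window the count is still held at 4 at t=45 and reaches 2 at t=60, while a 300-second window (the Kubernetes default) would still hold it at 6.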
What am I looking at?
Pod - the smallest unit Kubernetes runs. A pod can hold several containers, but here each pod is one container, i.e. one running copy of this app. Normally 2 copies share the traffic; under load Kubernetes starts more so the work is spread out.
HorizontalPodAutoscaler (HPA) - a Kubernetes controller that automatically adds or removes pods based on how busy they are. This one watches CPU, targets 70%, and ranges from 2 to 6 pods. It's what makes the count change on its own.
CPU % - how hard the app pods are working, measured against the CPU each pod reserves (its request, 50m = 0.05 of a core here), not against a whole CPU - so it can go over 100%. A pod is allowed to burst up to its limit (250m), which reads as ~500% of its 50m request. When the average crosses 70% (the green dashed line) the HPA adds pods; when it stays low, it removes them.
Redis - a fast in-memory datastore running next to the app (as a second tier). It holds the shared counter and the list of currently-active pods, so every pod sees the same state instead of each keeping its own.
Container runtime - the image is built with Docker, but this node runs it with containerd (k3s's built-in runtime, via Kubernetes' CRI), not the Docker daemon. Kubernetes removed dockershim, its built-in Docker Engine integration, in v1.24. Docker-built (OCI) images run unchanged.
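The CPU % and HPA entries above can be tied together with a small worked sketch. It uses the numbers stated on this page (50m request, 250m limit, 70% target, 2 to 6 pods) and the replica formula from the Kubernetes HPA documentation, desired = ceil(current * currentUtilization / target). The function names are hypothetical, and the 10% tolerance is assumed to be left at the Kubernetes default.

```python
import math


def cpu_utilization_percent(usage_millicores, request_millicores=50):
    """CPU usage as a percentage of the pod's request (50m in this demo)."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas, avg_utilization, target=70.0,
                     min_replicas=2, max_replicas=6, tolerance=0.1):
    """Sketch of the HPA rule: scale in proportion to utilization / target."""
    ratio = avg_utilization / target
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current count (assumed default).
        wanted = current_replicas
    else:
        wanted = math.ceil(current_replicas * ratio)
    # Clamp to the HPA's configured range.
    return max(min_replicas, min(max_replicas, wanted))


at_limit = cpu_utilization_percent(250)   # a pod bursting to its 250m limit
moderate = desired_replicas(2, 140.0)     # average at twice the target
heavy = desired_replicas(2, at_limit)     # every pod at its limit
idle = desired_replicas(4, 10.0)          # load has stopped
```

Here `at_limit` is 500.0, `moderate` is 4, `heavy` is capped at 6, and `idle` falls back to the minimum of 2.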
Seeing more than 2 pods before pressing the button? A recent load test is still scaling back down - here that takes about 30-60s (Kubernetes' default is a cautious 5 minutes; this HPA is tuned faster for the demo).
Briefly seeing more than 6? During a code deploy Kubernetes runs the old and new pods at once (a zero-downtime rolling update with maxSurge), so the count can momentarily exceed the 6-pod max before settling.
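How far the count can overshoot during a deploy follows from the Deployment's maxSurge setting. The sketch below assumes the Kubernetes default of 25% (percentages are rounded up); this demo's actual maxSurge value is not stated on the page, and the function name is hypothetical.

```python
import math


def max_pods_during_rollout(replicas, max_surge="25%"):
    """Upper bound on pods while a rolling update is in progress.

    max_surge may be a percentage string such as "25%" (rounded up,
    as Kubernetes does) or an absolute number of extra pods.
    """
    if isinstance(max_surge, str) and max_surge.endswith("%"):
        surge = math.ceil(replicas * int(max_surge[:-1]) / 100)
    else:
        surge = int(max_surge)
    return replicas + surge


at_max = max_pods_during_rollout(6)          # deploy while scaled to 6
at_min = max_pods_during_rollout(2)          # deploy while idle at 2
absolute = max_pods_during_rollout(6, 1)     # maxSurge given as a count
```

Under the assumed 25% default, a deploy at 6 pods could briefly show up to 8, and a deploy at 2 pods up to 3.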