Kubernetes Autoscaling

Mark Taguiad Apr 14, 2026 · 3 min read

In Kubernetes, scaling is how you adjust your application to handle more or less traffic. There are two main types: horizontal scaling and vertical scaling.

Horizontal Scaling vs Vertical Scaling

Feature	Horizontal Scaling	Vertical Scaling
Method	Add/remove pods	Increase/decrease resources
Tool	HPA	VPA
Best for	Stateless apps	Stateful or legacy apps
Downtime	None	Possible restart
Limit	Cluster size	Node capacity

Horizontal Scaling

This uses the Horizontal Pod Autoscaler (HPA) to scale the number of pods automatically.

You increase the number of pod replicas.
Traffic gets distributed across more pods.

The HPA measures utilization as a percentage of each container's resource request, so the target Deployment must define requests. Set limits as well.

deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: nginx
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"

hpa.yaml

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 8
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60

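Create the resources and verify. The commands assume the web namespace already exists.

kubectl apply -f deploy.yaml -n web
kubectl apply -f hpa.yaml -n web
kubectl get hpa -n web
NAME          REFERENCE            TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
web-app-hpa   Deployment/web-app   cpu: 0%/60%   2         8         2          2m7s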

CPU utilization is at 0%, so the Deployment stays at the minimum of 2 replicas. Once average utilization across the pods rises above the 60% target, the HPA scales up (adds replicas), up to the maximum of 8.
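How far the HPA scales follows from a simple ratio. The formula below is the one given in the Kubernetes HPA documentation; the utilization figures in the worked lines are made-up inputs for illustration, not measurements from this cluster.

```latex
\[
\text{desiredReplicas} =
  \left\lceil \text{currentReplicas} \times
  \frac{\text{currentUtilization}}{\text{targetUtilization}} \right\rceil
\]

% 2 pods averaging 90% of their 100m CPU request, target 60%:
\[
\left\lceil 2 \times \frac{90}{60} \right\rceil = \lceil 3 \rceil = 3
\]

% 4 pods averaging 150% of their request, target 60%:
\[
\left\lceil 4 \times \frac{150}{60} \right\rceil = \lceil 10 \rceil = 10
\]
```

In the second case the result exceeds maxReplicas, so the HPA would stop at 8. The controller also skips changes when the ratio is within a tolerance of 1.0 (0.1 by default), so a reading of 63% against a 60% target would not add a pod.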

Vertical Scaling

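This means increasing or decreasing the resources of a single pod: you give a pod more CPU or memory instead of adding more pods. The Vertical Pod Autoscaler (VPA) automates this. It is not part of core Kubernetes and has to be installed in the cluster separately.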

In-place pod resize graduated to stable in Kubernetes 1.35. We will get to that, but let's first demonstrate the immutable version, where pods are evicted and recreated to apply new resource values.

When a pod's current requests no longer match what the VPA recommends, the VPA will:

Calculate a new CPU/memory recommendation
Evict the running pod
Recreate the pod with the updated resources
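The first step produces a recommendation that is stored in the VPA object's status. As a rough sketch, this is the shape you would expect to see from kubectl get vpa web-app-vpa -o yaml once the web-app-vpa object defined further down has collected some usage data. The field names are the standard VPA status fields; every number is illustrative, not measured output.

```yaml
# Excerpt of a VPA object's status (illustrative values, not real output)
status:
  recommendation:
    containerRecommendations:
    - containerName: web
      lowerBound:        # smallest requests the recommender considers adequate
        cpu: 25m
        memory: 64Mi
      target:            # requests the pod receives when it is recreated
        cpu: 50m
        memory: 96Mi
      upperBound:        # largest requests the recommender considers reasonable
        cpu: 200m
        memory: 256Mi
```

A pod whose requests sit between lowerBound and upperBound is generally left alone; one outside that range is a candidate for eviction.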

deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: nginx
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"

vpa.yaml

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Auto"
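Because Auto mode rewrites requests on its own, it can help to bound what the VPA is allowed to set. The variant below is a minimal sketch using the VPA's resourcePolicy section; the minAllowed and maxAllowed values are assumptions chosen for illustration and are not part of the original setup.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: web                      # must match the container name in deploy.yaml
      controlledResources: ["cpu", "memory"]
      minAllowed:                             # illustrative floor: never go below the original requests
        cpu: 100m
        memory: 128Mi
      maxAllowed:                             # illustrative ceiling: keep a single pod from growing unbounded
        cpu: "1"
        memory: 512Mi
```

Recommendations that fall outside these bounds are capped to them before being applied.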

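With in-place pod resize, stable in Kubernetes v1.35, CPU and memory can be changed on a running pod without evicting it. The resizePolicy field declares, per resource, whether the container has to restart when that resource is resized (NotRequired is the default). Note that resizePolicy does not choose new values by itself: something still has to request the resize, either a manual patch against the pod's resize subresource or a VPA running in an in-place update mode. With a VPA, the flow becomes:

VPA monitors usage
Recommends better CPU/memory values
Applies the update to the running pod
No pod eviction or recreation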
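The flow above corresponds to a VPA that resizes pods in place rather than evicting them. The sketch below assumes a recent VPA release that supports the InPlaceOrRecreate update mode; whether that mode is available, and whether it sits behind a feature gate, depends on the VPA version installed in your cluster, so treat it as an assumption to check.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    # Assumption: requires a VPA version with in-place support.
    # Tries an in-place resize first and falls back to evict-and-recreate
    # if the resize cannot be applied.
    updateMode: "InPlaceOrRecreate"
```

The fallback matters: if a node has no room for the larger request, an in-place resize cannot succeed there, and recreating the pod lets the scheduler place it elsewhere.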

deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: nginx
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
        resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired
        - resourceName: memory
          restartPolicy: NotRequired
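To see an in-place resize without any autoscaler involved, you can patch a running pod's resize subresource by hand. The patch file below is a sketch: the file name resize-patch.yaml and the new CPU request are hypothetical, and it assumes a kubectl recent enough to accept --subresource resize. It would be applied with something like kubectl patch pod <pod-name> -n web --subresource resize --patch-file resize-patch.yaml, where <pod-name> is one of the web-app pods.

```yaml
# resize-patch.yaml (hypothetical file name)
# Raises the CPU request of the "web" container on a running pod.
# The container is matched by name; memory and limits are left unchanged.
spec:
  containers:
  - name: web
    resources:
      requests:
        cpu: "200m"   # illustrative value, up from 100m
```

With restartPolicy: NotRequired for cpu, the container should keep running and its restart count should not change. Editing the Deployment's pod template instead would trigger a normal rollout, which replaces the pods.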

