Horizontal Pod Autoscalers (HPAs) automatically scale the number of pod replicas in a Deployment, ReplicaSet, or StatefulSet based on observed metrics like CPU utilization, memory usage, or custom metrics.

Key Concepts

HPA

A controller that automatically adjusts the number of pod replicas based on metrics.

Target

The workload resource (Deployment, ReplicaSet, StatefulSet) that the HPA scales.

Metrics

The measurements (CPU, memory, custom) used to determine scaling decisions.

Replicas

The number of pod instances, bounded by minReplicas and maxReplicas.

Required Permissions

| Action | Permission |
| --- | --- |
| View HPAs | iam:project:infrastructure:kubernetes:read |
| Create HPA | iam:project:infrastructure:kubernetes:write |
| Edit HPA | iam:project:infrastructure:kubernetes:write |
| Delete HPA | iam:project:infrastructure:kubernetes:delete |

HPA Status Values

| Status | Description |
| --- | --- |
| Active | HPA is active and current replicas match desired replicas |
| ScalingUp | HPA is scaling up (current < desired replicas) |
| ScalingDown | HPA is scaling down (current > desired replicas) |
| Inactive | HPA is inactive (desired replicas is 0) |
| ScalingLimited | Scaling is limited by min/max replica bounds |
| Unknown | Status cannot be determined |

How to View HPAs

1. Select Cluster: Choose a cluster from the cluster dropdown.
2. Select Namespace: Choose a namespace or select “all” to view HPAs across all namespaces.
3. Filter and Search: Use the search box to find HPAs by name, namespace, or target. Filter by status (Active, Scaling, Inactive).

How to View HPA Details

1. Find the HPA: Locate the HPA in the list.
2. Click HPA Name: Click on the HPA name to open the detail drawer.
3. Review Details: View HPA information, including:
     • Overview: Name, namespace, target, status, age
     • Replicas: Current, desired, min, and max replica counts
     • Metrics: Configured metrics and current values
     • Conditions: HPA controller conditions
     • Events: Recent scaling events

How to Create an HPA

1. Click Create HPA: Click the Create HPA button in the page header.
2. Write YAML: Enter the HPA manifest in YAML format. Key fields:
     • spec.scaleTargetRef - Target workload to scale
     • spec.minReplicas - Minimum replica count
     • spec.maxReplicas - Maximum replica count
     • spec.metrics - Metrics to trigger scaling
3. Select Namespace: Choose the target namespace for the HPA.
4. Create: Click Create to apply the manifest.

Note: Ensure the target workload exists and has resource requests defined. HPAs need resource requests to calculate utilization percentages.
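
As a minimal sketch of a manifest that could be entered in the Write YAML step, the example below combines the key fields listed above. The names web-hpa and web are hypothetical placeholders, and the sketch assumes a Deployment named web already exists in the namespace you select. metadata.namespace is left out on the assumption that the namespace comes from the Select Namespace step.

```yaml
# Hypothetical example: "web-hpa" and "web" are placeholder names.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  # Target workload to scale; it must exist in the same namespace as the HPA.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  # Replica bounds the HPA will never go below or above.
  minReplicas: 2
  maxReplicas: 10
  # Scale to keep average CPU usage near 80% of the pods' CPU requests.
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```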

How to Edit an HPA

1. Open Actions Menu: Click the actions menu (three dots) on the HPA row.
2. Click Edit YAML: Select Edit YAML to open the YAML editor.
3. Modify Spec: Edit the HPA specification. Common changes:
     • Adjust min/max replicas
     • Change metric thresholds
     • Add or remove metrics
4. Save: Click Update to apply changes.

How to Delete an HPA

1. Open Actions Menu: Click the actions menu on the HPA row.
2. Click Delete: Select Delete from the menu.
3. Confirm: Confirm the deletion.

Warning: Deleting an HPA stops automatic scaling. The target workload remains at its current replica count until you scale it manually or create a new HPA.

Metric Types

HPAs support several metric types:

| Type | Description | Example |
| --- | --- | --- |
| Resource | CPU or memory utilization | CPU at 80% |
| Pods | Custom metrics from pods | Requests per second |
| Object | Metrics from other Kubernetes objects | Queue length |
| External | Metrics from external systems | Cloud queue depth |

Resource Metrics

metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Custom Metrics

metrics:
  - type: Pods
    pods:
      metric:
        name: requests_per_second
      target:
        type: AverageValue
        averageValue: 1000
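
The Object and External types from the table above follow the same pattern. The fragment below is a sketch only: the metric names, the Ingress name main-route, and the queue label are hypothetical, and both types assume a metrics adapter that actually serves those metrics is installed in the cluster.

```yaml
metrics:
  # Object: a metric describing a single Kubernetes object (here, an Ingress).
  - type: Object
    object:
      metric:
        name: requests-per-second        # hypothetical metric name
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: main-route                 # hypothetical object name
      target:
        type: Value
        value: 2k
  # External: a metric from a system outside the cluster (e.g. a cloud queue).
  - type: External
    external:
      metric:
        name: queue_messages_ready       # hypothetical metric name
        selector:
          matchLabels:
            queue: worker_tasks          # hypothetical label
      target:
        # AverageValue divides the metric by the current number of pods.
        type: AverageValue
        averageValue: 30
```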

Scaling Behavior

HPA v2 supports configuring scaling behavior:
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
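          periodSeconds: 15

| Setting | Description |
| --- | --- |
| stabilizationWindowSeconds | How far back (in seconds) the controller considers past scaling recommendations before acting (prevents flapping) |
| policies | Rules that limit how many replicas can be added or removed within each periodSeconds interval |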
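
When more than one policy is listed, the selectPolicy field decides which one applies. The sketch below uses illustrative values, not recommendations, and shows a conservative scale-down configuration.

```yaml
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        # Remove at most 10% of the current replicas per minute...
        - type: Percent
          value: 10
          periodSeconds: 60
        # ...or at most 2 pods per minute.
        - type: Pods
          value: 2
          periodSeconds: 60
      # Min applies whichever policy permits the smaller change.
      # The default, Max, applies whichever permits the larger one.
      selectPolicy: Min
```

Setting selectPolicy to Disabled turns off scaling in that direction entirely.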

Troubleshooting

Metrics show as unknown or are unavailable

  • Verify metrics-server is installed and running
  • Check that target pods have resource requests defined
  • Wait for metrics collection (can take a few minutes)
  • Verify the metrics API is accessible: kubectl top pods

HPA is not scaling up

  • Check whether current replicas already equals maxReplicas (the HPA is at its upper limit)
  • Verify metric thresholds are actually being exceeded
  • Check HPA conditions for errors
  • Ensure the target workload exists and is not paused

HPA is not scaling down

  • Check whether current replicas already equals minReplicas (the HPA is at its minimum)
  • Verify the stabilization window has passed
  • Check scale-down policies if configured
  • Review HPA events for scaling decisions

HPA is scaling too frequently (flapping)

  • Increase stabilizationWindowSeconds
  • Adjust scale-down policies to be more gradual
  • Consider using multiple metrics for better decision making
  • Review and tune metric thresholds

Scale target not found

  • Verify the target exists in the same namespace as the HPA
  • Check that the scaleTargetRef name and kind are correct
  • Ensure the scaleTargetRef apiVersion matches the target resource (for example, apps/v1 for a Deployment)

Custom metrics are not working

  • Verify Prometheus Adapter or another custom metrics API provider is configured
  • Check that the metric name matches exactly
  • Ensure metrics are being exported by the pods
  • Test the custom metrics API with kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1

FAQ

Which workloads can an HPA scale?
HPAs can scale Deployments, ReplicaSets, and StatefulSets. The target must support the /scale subresource. DaemonSets cannot be scaled by HPAs.

How often does the HPA check metrics?
The HPA checks metrics every 15 seconds by default (configurable via the kube-controller-manager flag --horizontal-pod-autoscaler-sync-period). How quickly replicas actually change depends on stabilization windows and scaling policies.

What happens to the workload when an HPA is deleted?
The target workload stays at its current replica count. Automatic scaling stops until a new HPA is created or you manually scale the workload.

Can multiple HPAs target the same workload?
No. Only one HPA should target each workload. Multiple HPAs would conflict with each other’s scaling decisions.

What is the difference between HPA v1 and v2?
HPA v2 supports multiple metrics, custom metrics, external metrics, and configurable scaling behavior. v1 supports only CPU-based scaling. Always use v2 (autoscaling/v2).

Does the HPA require metrics-server?
Yes, for resource metrics (CPU/memory). Custom metrics require additional components like Prometheus Adapter. External metrics require an external metrics provider.

How can scale-down be delayed to avoid flapping?
Use the kube-controller-manager flag --horizontal-pod-autoscaler-downscale-stabilization to set a cluster-wide default, or configure behavior.scaleDown.stabilizationWindowSeconds on an individual HPA to delay scale-down decisions.

Why do target pods need resource requests?
The HPA cannot calculate utilization percentages without resource requests. Define CPU/memory requests on your containers for the HPA to work correctly.
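
As an illustration of the requests a utilization target is measured against, the fragment below sketches the relevant part of a Deployment pod template. The container name, image, and values are hypothetical placeholders.

```yaml
# Fragment of a Deployment manifest; names and values are placeholders.
spec:
  template:
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0   # hypothetical image
          resources:
            # Utilization is computed as actual usage divided by these requests.
            requests:
              cpu: 250m
              memory: 256Mi
```

With these values, a CPU target of averageUtilization: 80 corresponds to an average of about 200m CPU per pod (80% of the 250m request).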