Skip to content

Horizontal Pod Autoscaling (HPA)

Scenario:

  • Install Metrics Server component
  • Create Deployment podinfo and Service podinfo for stress testing
  • Create HPA my-hpa
  • Stress Testing

Demo:

Install Metrics Server component

Download yaml file for Metrics Server component

wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Replace Google image by Aliyun image image: registry.aliyuncs.com/google_containers/metrics-server:v0.6.1.

sed -i 's/k8s\.gcr\.io\/metrics-server\/metrics-server\:v0\.6\.1/registry\.aliyuncs\.com\/google_containers\/metrics-server\:v0\.6\.1/g' components.yaml

Change arg of deployment metrics-server by adding --kubelet-insecure-tls to disable tls certificate validation.

vi components.yaml

Updated arg of metrics-server is below.

......
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        image: registry.aliyuncs.com/google_containers/metrics-server:v0.6.1
......

Appy the yaml file components.yaml to deploy metrics-server.

kubectl apply -f components.yaml

Below resources were crested.

serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Verify if metrics-server Pod is running as expected (1/1 running)

kubectl get pod -n kube-system -owide | grep metrics-server

Result

NAME                                       READY   STATUS    RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
metrics-server-7fd564dc66-sdhdc            1/1     Running   0          61s     10.244.102.15    cka003   <none>           <none>

Get current usage of CPU, memory of each node.

kubectl top node

Result:

NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
cka001   595m         29%    1937Mi          50%       
cka002   75m          3%     1081Mi          28%       
cka003   79m          3%     1026Mi          26% 

Deploy a Service podinfo

Create Deployment podinfo and Service podinfo for further stress testing.

kubectl apply -f - << EOF
apiVersion: v1
kind: Service
metadata:
  name: podinfo
  labels:
    app: podinfo
spec:
  type: NodePort
  ports:
    - port: 9898
      targetPort: 9898
      nodePort: 31198
      protocol: TCP
  selector:
    app: podinfo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
  labels:
    app: podinfo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: podinfo
  template:
    metadata:
      labels:
        app: podinfo
    spec:
      containers:
      - name: podinfod
        image: stefanprodan/podinfo:0.0.1
        imagePullPolicy: Always
        command:
          - ./podinfo
          - -port=9898
          - -logtostderr=true
          - -v=2
        ports:
        - containerPort: 9898
          protocol: TCP
        resources:
          requests:
            memory: "32Mi"
            cpu: "10m"
          limits:
            memory: "256Mi"
            cpu: "100m"
EOF

Config HPA

Create HPA my-hpa by setting CPU threshold 50% to trigger auto-scalling with minimal 2 and maximal 10 Replicas.

Use kubectl autoscal to create HPA my-hpa.

kubectl autoscale deployment podinfo --cpu-percent=50 --min=1 --max=10

Use autoscaling/v1 version template to crreate HPA my-hpa.

kubectl apply -f - <<EOF
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF

Use autoscaling/v2 version template to crreate HPA my-hpa, adding memory resource in the matrics.

kubectl apply -f - <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 100Mi
EOF

Get status of HPA.

kubectl get hpa

Result:

NAME     REFERENCE            TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
my-hpa   Deployment/podinfo   2%/50%    2         10        2          60s

Memo:

  • metrics.resource The values will be averaged together before being compared to the target. 在与目标值比较之前,这些指标值将被平均。
  • metrics.resource.target.type represents whether the metric type is Utilization, Value, or AverageValue
  • metrics.resource.target.averageUtilization is the target value of the average of the resource metric across all relevant pods, represented as a percentage of the requested value of the resource for the pods. Currently only valid for Resource metric source type. 是跨所有相关 Pod 得出的资源指标均值的目标值。
  • metrics.resource.target.averageValue (Quantity) is the target value of the average of the metric across all relevant pods (as a quantity). 跨所有 Pod 得出的指标均值的目标值(以数量形式给出)。
  • metrics.resource.target.value (Quantity) is the target value of the metric (as a quantity). 是指标的目标值(以数量形式给出)。

Stress Testing

Install ab

Here we will use ab tool to simulate 1000 concurrency.

The ab command is a command line load testing and benchmarking tool for web servers that allows you to simulate high traffic to a website.

The short definition form apache.org is: The acronym ab stands for Apache Bench where bench is short for benchmarking.

Execute below command to install ab tool.

sudo apt install apache2-utils -y

Most common options of ab are -n and -c

-n requests     Number of requests to perform
-c concurrency  Number of multiple requests to make at a time
-t timelimit    Seconds to max. to spend on benchmarking. This implies -n 50000
-p postfile     File containing data to POST. Remember also to set -T      
-T content-type Content-type header to use for POST/PUT data, eg. 'application/x-www-form-urlencoded'. Default is 'text/plain'
-k              Use HTTP KeepAlive feature

Example:

ab -n 1000 -c 100 http://www.baidu.com/

Concurrency Stres Test

Simulate 1000 concurrency request to current node running command ab. Node port 31198 is the for the service podinfo.

ab -c 1000 -t 60 http://127.0.0.1:31198/

By command kubectl get hpa -w we can see that CPU workload has been increasing.

NAME    REFERENCE            TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
......
nginx   Deployment/podinfo   199%/50%    2         10        10         14m
nginx   Deployment/podinfo   934%/50%    2         10        10         14m
nginx   Deployment/podinfo   964%/50%    2         10        10         14m
nginx   Deployment/podinfo   992%/50%    2         10        10         15m
nginx   Deployment/podinfo   728%/50%    2         10        10         15m
nginx   Deployment/podinfo   119%/50%    2         10        10         15m
......

And see auto-scalling automically triggered for Deployment podinfo.

kubectl get pod
kubectl get deployment

Please be noted the scale up is a phased process rather than a sudden event to scale to max. And it'll be scaled down to a balanced status when CPU workload is down.

kubectl get hpa -w

After several hours, we can see below result with above command.

NAME     REFERENCE            TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
my-hpa   Deployment/podinfo   2%/50%    2         10        2          60s

Clean up.

kubectl delete service podinfo
kubectl delete deployment podinfo
kubectl delete hpa my-hpa