Horizontal Pod Autoscaling (HPA) ¶
Scenario:
- Install the Metrics Server component
- Create Deployment podinfo and Service podinfo for stress testing
- Create HPA my-hpa
- Run a stress test
Demo:
Install Metrics Server component ¶
Download the yaml file for the Metrics Server component.
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Replace the Google image with the Aliyun image registry.aliyuncs.com/google_containers/metrics-server:v0.6.1.
sed -i 's/k8s\.gcr\.io\/metrics-server\/metrics-server\:v0\.6\.1/registry\.aliyuncs\.com\/google_containers\/metrics-server\:v0\.6\.1/g' components.yaml
Edit the args of the Deployment metrics-server, adding --kubelet-insecure-tls to disable TLS certificate validation.
vi components.yaml
The updated args of metrics-server are below.
......
template:
metadata:
labels:
k8s-app: metrics-server
spec:
containers:
- args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --metric-resolution=15s
- --kubelet-insecure-tls
image: registry.aliyuncs.com/google_containers/metrics-server:v0.6.1
......
Apply the yaml file components.yaml to deploy metrics-server.
kubectl apply -f components.yaml
The following resources were created.
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
Verify that the metrics-server Pod is running as expected (1/1 Running).
kubectl get pod -n kube-system -owide | grep metrics-server
Result
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
metrics-server-7fd564dc66-sdhdc 1/1 Running 0 61s 10.244.102.15 cka003 <none> <none>
Get the current CPU and memory usage of each node.
kubectl top node
Result:
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
cka001 595m 29% 1937Mi 50%
cka002 75m 3% 1081Mi 28%
cka003 79m 3% 1026Mi 26%
Deploy a Service podinfo ¶
Create Deployment podinfo and Service podinfo for further stress testing.
kubectl apply -f - << EOF
apiVersion: v1
kind: Service
metadata:
name: podinfo
labels:
app: podinfo
spec:
type: NodePort
ports:
- port: 9898
targetPort: 9898
nodePort: 31198
protocol: TCP
selector:
app: podinfo
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: podinfo
labels:
app: podinfo
spec:
replicas: 2
selector:
matchLabels:
app: podinfo
template:
metadata:
labels:
app: podinfo
spec:
containers:
- name: podinfod
image: stefanprodan/podinfo:0.0.1
imagePullPolicy: Always
command:
- ./podinfo
- -port=9898
- -logtostderr=true
- -v=2
ports:
- containerPort: 9898
protocol: TCP
resources:
requests:
memory: "32Mi"
cpu: "10m"
limits:
memory: "256Mi"
cpu: "100m"
EOF
Config HPA ¶
Create HPA my-hpa with a CPU threshold of 50% to trigger auto-scaling, with a minimum of 2 and a maximum of 10 replicas.
Use kubectl autoscale to create HPA my-hpa.
kubectl autoscale deployment podinfo --cpu-percent=50 --min=2 --max=10
Alternatively, use an autoscaling/v1 template to create HPA my-hpa.
kubectl apply -f - <<EOF
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: my-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: podinfo
minReplicas: 2
maxReplicas: 10
targetCPUUtilizationPercentage: 50
EOF
Or use an autoscaling/v2 template to create HPA my-hpa, adding the memory resource to the metrics.
kubectl apply -f - <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: my-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: podinfo
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 50
- type: Resource
resource:
name: memory
target:
type: AverageValue
averageValue: 100Mi
EOF
Get status of HPA.
kubectl get hpa
Result:
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
my-hpa Deployment/podinfo 2%/50% 2 10 2 60s
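The TARGETS column (2%/50%) shows the current average CPU utilization across the podinfo Pods versus the 50% target: total usage divided by total requests, as a percentage. A minimal Python sketch of that calculation (the usage figures below are hypothetical illustration values):

```python
# Average CPU utilization as reported by the HPA: total pod usage divided by
# total pod requests, expressed as a percentage.

def average_utilization(usages_millicores, request_millicores):
    """Return average CPU utilization (%) across pods, as the HPA computes it."""
    total_usage = sum(usages_millicores)
    total_request = request_millicores * len(usages_millicores)
    return round(100 * total_usage / total_request)

# Two idle podinfo replicas, each requesting 10m CPU, using ~0.2m each
# (hypothetical figures) yield the 2% shown in the TARGETS column.
print(average_utilization([0.2, 0.2], 10))  # → 2
```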
Memo:
- metrics.resource: the metric values will be averaged together before being compared with the target.
- metrics.resource.target.type: represents whether the metric type is Utilization, Value, or AverageValue.
- metrics.resource.target.averageUtilization: the target value of the average of the resource metric across all relevant pods, represented as a percentage of the requested value of the resource for the pods. Currently only valid for the Resource metric source type.
- metrics.resource.target.averageValue (Quantity): the target value of the average of the metric across all relevant pods.
- metrics.resource.target.value (Quantity): the target value of the metric.
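Under the hood, the HPA controller computes the desired replica count with the formula desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), then clamps it to the minReplicas/maxReplicas bounds. A minimal sketch of this calculation:

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=2, max_replicas=10):
    """Desired replica count per the HPA scaling formula, clamped to min/max."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# At 199% average CPU utilization against a 50% target, 2 replicas become 8.
print(desired_replicas(2, 199, 50))   # → 8
print(desired_replicas(10, 992, 50))  # → 10 (clamped to maxReplicas)
print(desired_replicas(10, 2, 50))    # → 2 (clamped to minReplicas)
```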
Stress Testing ¶
Install ab ¶
Here we will use the ab tool to simulate 1000 concurrent requests.
The ab command is a command-line load testing and benchmarking tool for web servers that allows you to simulate high traffic to a website.
The short definition from apache.org: the acronym ab stands for Apache Bench, where bench is short for benchmarking.
Execute the command below to install the ab tool.
sudo apt install apache2-utils -y
The most common options of ab are -n and -c:
-n requests Number of requests to perform
-c concurrency Number of multiple requests to make at a time
-t timelimit Seconds to max. to spend on benchmarking. This implies -n 50000
-p postfile File containing data to POST. Remember also to set -T
-T content-type Content-type header to use for POST/PUT data, eg. 'application/x-www-form-urlencoded'. Default is 'text/plain'
-k Use HTTP KeepAlive feature
Example:
ab -n 1000 -c 100 http://www.baidu.com/
Concurrency Stress Test ¶
Simulate 1000 concurrent requests to the current node with the ab command. Node port 31198 is the NodePort of the service podinfo.
ab -c 1000 -t 60 http://127.0.0.1:31198/
With the command kubectl get hpa -w we can see that the CPU workload keeps increasing.
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
......
my-hpa Deployment/podinfo 199%/50% 2 10 10 14m
my-hpa Deployment/podinfo 934%/50% 2 10 10 14m
my-hpa Deployment/podinfo 964%/50% 2 10 10 14m
my-hpa Deployment/podinfo 992%/50% 2 10 10 15m
my-hpa Deployment/podinfo 728%/50% 2 10 10 15m
my-hpa Deployment/podinfo 119%/50% 2 10 10 15m
......
And we can see auto-scaling automatically triggered for Deployment podinfo.
kubectl get pod
kubectl get deployment
Note that scale-up is a phased process rather than a single jump to the maximum. The Deployment is also scaled back down to a balanced state once the CPU workload drops.
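The phased behaviour follows from the scaling formula being reapplied at each reconciliation: as replicas are added, the load spreads over more Pods, the measured utilization falls, and each step is smaller, with the default scale-up policy also rate-limiting growth. A rough simulation (the constant 100m total load is a hypothetical figure, and the per-step doubling cap is a crude stand-in for the HPA's default scale-up policy):

```python
import math

# Hypothetical: a constant total load of 100 CPU millicores spread evenly
# across the podinfo replicas, each Pod requesting 10m CPU.
TOTAL_LOAD_M = 100
REQUEST_M = 10
TARGET_PCT = 50
MAX_REPLICAS = 10

replicas = 2
steps = [replicas]
for _ in range(5):  # a few reconciliation periods
    utilization = 100 * TOTAL_LOAD_M / (replicas * REQUEST_M)
    desired = math.ceil(replicas * utilization / TARGET_PCT)
    desired = min(desired, 2 * replicas)   # crude stand-in for the scale-up rate limit
    replicas = min(MAX_REPLICAS, desired)  # clamp to maxReplicas
    steps.append(replicas)

print(steps)  # → [2, 4, 8, 10, 10, 10]: growth happens in phases, not one jump
```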
kubectl get hpa -w
After the stress test ends and the workload drops back, the above command shows the result below.
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
my-hpa Deployment/podinfo 2%/50% 2 10 2 60s
Clean up.
kubectl delete service podinfo
kubectl delete deployment podinfo
kubectl delete hpa my-hpa