Case Study: Health Check ¶
Scenario:
- Create a Deployment and a Service
- Simulate an error (delete index.html)
- The Pod becomes unhealthy and is removed from the endpoint list
- Fix the error (restore index.html)
- The Pod returns to normal and is added back to the endpoint list
Create Deployment and Service ¶
Create Deployment nginx-healthcheck and Service nginx-healthcheck.
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-healthcheck
spec:
  replicas: 2
  selector:
    matchLabels:
      name: nginx-healthcheck
  template:
    metadata:
      labels:
        name: nginx-healthcheck
    spec:
      containers:
      - name: nginx-healthcheck
        image: nginx:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        livenessProbe:
          initialDelaySeconds: 5
          periodSeconds: 5
          tcpSocket:
            port: 80
          timeoutSeconds: 5
        readinessProbe:
          httpGet:
            path: /
            port: 80
            scheme: HTTP
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-healthcheck
spec:
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
  type: NodePort
  selector:
    name: nginx-healthcheck
EOF
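Optionally, wait until the rollout completes before checking the Pods:
kubectl rollout status deployment/nginx-healthcheck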
Check the nginx-healthcheck Pods.
kubectl get pod -o wide
Result
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-healthcheck-79fc55d944-jw887 1/1 Running 0 9s 10.244.102.14 cka003 <none> <none>
nginx-healthcheck-79fc55d944-nwwjc 1/1 Running 0 9s 10.244.112.13 cka002 <none> <none>
Access the Pod IPs via the curl command, using the IPs from the output above.
curl 10.244.102.14
curl 10.244.112.13
Both requests return the default Nginx index.html page.
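As an optional check (a sketch using the name=nginx-healthcheck label from the manifest above), readiness can also be read directly from the Pod conditions:
kubectl get pod -l name=nginx-healthcheck \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'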
Check the details of the Service created above.
kubectl describe svc nginx-healthcheck
We will see the output below, with both Pods listed in Endpoints.
Name: nginx-healthcheck
Namespace: dev
Labels: <none>
Annotations: <none>
Selector: name=nginx-healthcheck
Type: NodePort
IP Family Policy: SingleStack
IP Families: IPv4
IP: 11.244.238.20
IPs: 11.244.238.20
Port: <unset> 80/TCP
TargetPort: 80/TCP
NodePort: <unset> 31795/TCP
Endpoints: 10.244.102.14:80,10.244.112.13:80
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
We can also get the same information from the Endpoints object.
kubectl get endpoints nginx-healthcheck
Result
NAME ENDPOINTS AGE
nginx-healthcheck 10.244.102.14:80,10.244.112.13:80 72s
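Because the Service is of type NodePort, it can also be reached via any node's IP on the allocated port (31795 in the output above; yours will differ):
curl <node_ip>:31795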
So far, the two nginx-healthcheck Pods are working and providing service as expected.
Simulate readinessProbe Failure ¶
Let's simulate an error by deleting the index.html file in one of the nginx-healthcheck Pods and see what the readinessProbe does.
First, execute kubectl exec -it <your_pod_name> -- bash to open a shell in the nginx-healthcheck Pod, and delete the index.html file.
kubectl exec -it nginx-healthcheck-79fc55d944-jw887 -- bash
cd /usr/share/nginx/html/
rm -rf index.html
exit
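Equivalently, the file can be removed without an interactive shell (a one-liner sketch using the example Pod name):
kubectl exec nginx-healthcheck-79fc55d944-jw887 -- rm /usr/share/nginx/html/index.html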
After that, let's check the status of the Pod whose index.html file was deleted.
kubectl describe pod nginx-healthcheck-79fc55d944-jw887
We can now see a Readiness probe failed error event message.
......
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m8s default-scheduler Successfully assigned dev/nginx-healthcheck-79fc55d944-jw887 to cka003
Normal Pulled 2m7s kubelet Container image "nginx:latest" already present on machine
Normal Created 2m7s kubelet Created container nginx-healthcheck
Normal Started 2m7s kubelet Started container nginx-healthcheck
Warning Unhealthy 2s (x2 over 7s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 403
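The 403 occurs because nginx no longer has an index.html to serve and directory listing is disabled, so the readinessProbe's httpGet receives a non-2xx response. The probe's view can be reproduced with a header-only request:
curl -I 10.244.102.14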
Let's check the other Pod.
kubectl describe pod nginx-healthcheck-79fc55d944-nwwjc
There are no error events for it.
......
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m46s default-scheduler Successfully assigned dev/nginx-healthcheck-79fc55d944-nwwjc to cka002
Normal Pulled 3m45s kubelet Container image "nginx:latest" already present on machine
Normal Created 3m45s kubelet Created container nginx-healthcheck
Normal Started 3m45s kubelet Started container nginx-healthcheck
Now, access each Pod IP via the curl command and compare the results.
curl 10.244.102.14
curl 10.244.112.13
Result:
- curl 10.244.102.14 fails with a 403 Forbidden error.
- curl 10.244.112.13 works well.
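Note that a failing readinessProbe only marks the Pod as not ready; it does not restart it. The failing Pod shows READY 0/1 while its STATUS remains Running:
kubectl get pod -l name=nginx-healthcheck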
Let's check the current status of the Nginx Service after one of the Pods runs into failure.
kubectl describe svc nginx-healthcheck
In the output below, only one Pod is listed in Endpoints.
Name: nginx-healthcheck
Namespace: dev
Labels: <none>
Annotations: <none>
Selector: name=nginx-healthcheck
Type: NodePort
IP Family Policy: SingleStack
IP Families: IPv4
IP: 11.244.238.20
IPs: 11.244.238.20
Port: <unset> 80/TCP
TargetPort: 80/TCP
NodePort: <unset> 31795/TCP
Endpoints: 10.244.112.13:80
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
We can get the same result from the Endpoints object: only one Pod is serving.
kubectl get endpoints nginx-healthcheck
Output:
NAME ENDPOINTS AGE
nginx-healthcheck 10.244.112.13:80 6m5s
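With only one endpoint left, requests sent through the Service's ClusterIP (11.244.238.20 in the output above) are always routed to the healthy Pod:
curl 11.244.238.20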
Fix readinessProbe Failure ¶
Let's re-create the index.html file in the Pod.
kubectl exec -it nginx-healthcheck-79fc55d944-jw887 -- bash
cd /usr/share/nginx/html/
cat > index.html << EOF
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
EOF
exit
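Alternatively (a sketch assuming a local copy of index.html, and that the image includes tar, which kubectl cp requires), the file can be copied into the container without an interactive shell:
kubectl cp index.html nginx-healthcheck-79fc55d944-jw887:/usr/share/nginx/html/index.html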
We can now see that both Pods are back in Endpoints and providing service.
kubectl describe svc nginx-healthcheck
kubectl get endpoints nginx-healthcheck
Access the Pod IPs via the curl command again; both are back to normal status.
curl 10.244.102.14
curl 10.244.112.13
Verify the Pod status again.
kubectl describe pod nginx-healthcheck-79fc55d944-jw887
Conclusion:
- By deleting the index.html file, the Pod becomes unhealthy and is removed from the endpoint list.
- The one remaining healthy Pod continues to provide normal service.
Clean up
kubectl delete service nginx-healthcheck
kubectl delete deployment nginx-healthcheck
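Equivalently, both objects can be deleted in a single command:
kubectl delete service,deployment nginx-healthcheck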
Simulate livenessProbe Failure ¶
Re-create Deployment nginx-healthcheck and Service nginx-healthcheck.
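Re-apply the same manifest from the first step (the kubectl apply heredoc above), then verify the Deployment and Pods:
kubectl get deployment nginx-healthcheck
kubectl get pod -l name=nginx-healthcheck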
Deployment:
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-healthcheck 0/2 2 0 7s
Pods:
NAME READY STATUS RESTARTS AGE
nginx-healthcheck-79fc55d944-lknp9 1/1 Running 0 96s
nginx-healthcheck-79fc55d944-wntmg 1/1 Running 0 96s
Change the nginx default listening port from 80 to 90 to simulate a livenessProbe failure; the livenessProbe checks liveness via a TCP socket on port 80.
kubectl exec -it nginx-healthcheck-79fc55d944-lknp9 -- bash
root@nginx-healthcheck-79fc55d944-lknp9:/# cd /etc/nginx/conf.d
root@nginx-healthcheck-79fc55d944-lknp9:/etc/nginx/conf.d# sed -i 's/80/90/g' default.conf
root@nginx-healthcheck-79fc55d944-lknp9:/etc/nginx/conf.d# nginx -s reload
2022/07/24 12:59:45 [notice] 79#79: signal process started
The Pod runs into failure.
kubectl describe pod nginx-healthcheck-79fc55d944-lknp9
We can see the Liveness probe failed error event messages.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 17m default-scheduler Successfully assigned dev/nginx-healthcheck-79fc55d944-lknp9 to cka003
Normal Pulled 2m47s (x2 over 17m) kubelet Container image "nginx:latest" already present on machine
Normal Created 2m47s (x2 over 17m) kubelet Created container nginx-healthcheck
Normal Started 2m47s (x2 over 17m) kubelet Started container nginx-healthcheck
Warning Unhealthy 2m47s (x4 over 2m57s) kubelet Readiness probe failed: Get "http://10.244.102.46:80/": dial tcp 10.244.102.46:80: connect: connection refused
Warning Unhealthy 2m47s (x3 over 2m57s) kubelet Liveness probe failed: dial tcp 10.244.102.46:80: connect: connection refused
Normal Killing 2m47s kubelet Container nginx-healthcheck failed liveness probe, will be restarted
Once the failure is detected by the livenessProbe, the container is restarted automatically. The restart recreates the container from the image, so the default.conf we modified is replaced by the default file and the container comes back up in a normal state.
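To confirm the recovery (a verification sketch using the example Pod name), check that the restart counter has increased and that the listening port is back to 80:
kubectl get pod nginx-healthcheck-79fc55d944-lknp9
kubectl exec nginx-healthcheck-79fc55d944-lknp9 -- grep listen /etc/nginx/conf.d/default.conf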