Health Checks

1. Introduction

Health checks tell Kubernetes when a container has to be restarted (liveness probe) and whether, at a given point in time, a pod should receive traffic (readiness probe).

2. Types of probes

  • liveness probe

  • readiness probe

3. Liveness probe

  • Let's create a pod that exposes the endpoint /health, which responds with HTTP status code 200 (success)
hadoop@k8s-00:~$ kubectl create -f https://raw.githubusercontent.com/mhausenblas/kbe/master/specs/healthz/pod.yaml
pod "hc" created
  • the following is a part of the above .yaml file
livenessProbe:
  initialDelaySeconds: 2
  periodSeconds: 5
  httpGet:
    path: /health
    port: 9876
  • with the above, the kubelet checks the /health endpoint every 5 seconds, waiting 2 seconds after the container starts before the first check
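
The timing described above can be sketched in a few lines of Python. This is an idealized model only: the function name probe_times is made up for illustration, and a real kubelet does not hit these offsets exactly.

```python
def probe_times(initial_delay_seconds, period_seconds, count):
    """Return idealized offsets (seconds after container start) of the first `count` probes."""
    # First probe after the initial delay, then one probe per period.
    return [initial_delay_seconds + i * period_seconds for i in range(count)]


# Values taken from the livenessProbe excerpt: initialDelaySeconds=2, periodSeconds=5
print(probe_times(2, 5, 4))  # [2, 7, 12, 17]
```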

  • let's describe the pod (kubectl describe pod hc) and look at its events

Events:
  Type    Reason                 Age   From               Message
  ----    ------                 ----  ----               -------
  Normal  Scheduled              2m    default-scheduler  Successfully assigned hc to k8s-01
  Normal  SuccessfulMountVolume  2m    kubelet, k8s-01    MountVolume.SetUp succeeded for volume "default-token-csfkh"
  Normal  Pulling                2m    kubelet, k8s-01    pulling image "mhausenblas/simpleservice:0.5.0"
  Normal  Pulled                 2m    kubelet, k8s-01    Successfully pulled image "mhausenblas/simpleservice:0.5.0"
  Normal  Created                2m    kubelet, k8s-01    Created container
  Normal  Started                2m    kubelet, k8s-01    Started container
  • we can see that the container has started and that the events contain no Unhealthy warning, so the liveness probe is passing
  Normal  Started                2m    kubelet, k8s-01    Started container
  • now let's deploy a bad pod whose /health endpoint randomly fails to return a 200 status code
hadoop@k8s-00:~$ kubectl create -f https://raw.githubusercontent.com/mhausenblas/kbe/master/specs/healthz/badpod.yaml
pod "badpod" created
Events:
  Type     Reason                 Age   From               Message
  ----     ------                 ----  ----               -------
  Normal   Scheduled              31s   default-scheduler  Successfully assigned badpod to k8s-02
  Normal   SuccessfulMountVolume  31s   kubelet, k8s-02    MountVolume.SetUp succeeded for volume "default-token-csfkh"
  Normal   Pulling                30s   kubelet, k8s-02    pulling image "mhausenblas/simpleservice:0.5.0"
  Normal   Pulled                 10s   kubelet, k8s-02    Successfully pulled image "mhausenblas/simpleservice:0.5.0"
  Normal   Created                10s   kubelet, k8s-02    Created container
  Normal   Started                10s   kubelet, k8s-02    Started container
  Warning  Unhealthy              4s    kubelet, k8s-02    Liveness probe failed: Get http://10.244.179.2:9876/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  • we can see that the health check has failed (the probe request timed out) from the following line
Warning  Unhealthy              4s    kubelet, k8s-02    Liveness probe failed: Get http://10.244.179.2:9876/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  • we can confirm the effect of these failures by listing the pods
hadoop@k8s-00:~$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
badpod    1/1       Running   4          3m
hc        1/1       Running   0          9m
  • the bad pod shows 4 restarts: its liveness probe kept failing (no timely 200 response from /health), so the kubelet restarted the container each time, while the healthy pod hc shows 0 restarts
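
How failed probes turn into restarts can be sketched as follows. This is a simplified model, not the kubelet's implementation: count_restarts is a hypothetical helper, and it assumes the Kubernetes default failureThreshold of 3, since the spec excerpt above does not set one.

```python
def count_restarts(probe_results, failure_threshold=3):
    """Count restarts, assuming one restart per `failure_threshold` consecutive failed probes."""
    restarts = 0
    consecutive_failures = 0
    for ok in probe_results:
        if ok:
            # A successful probe resets the failure streak.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                restarts += 1
                # The restarted container begins with a clean streak.
                consecutive_failures = 0
    return restarts


healthy = [True] * 6
flaky = [False, False, False, True, False, False, False]
print(count_restarts(healthy))  # 0
print(count_restarts(flaky))    # 2
```

A single failed probe is therefore not enough to trigger a restart in this model; only an unbroken run of failures is.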

4. Readiness probe

  • Let's create a pod with a readinessProbe that kicks in after 10 seconds; once the probe succeeds, the pod is marked as ready to serve traffic
hadoop@k8s-00:~$ kubectl create -f https://raw.githubusercontent.com/mhausenblas/kbe/master/specs/healthz/ready.yaml
pod "ready" created
  • let's look at the conditions and events of the pod to see if it is ready to serve traffic
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  default-token-csfkh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-csfkh
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason                 Age   From               Message
  ----    ------                 ----  ----               -------
  Normal  Scheduled              1m    default-scheduler  Successfully assigned ready to k8s-01
  Normal  SuccessfulMountVolume  1m    kubelet, k8s-01    MountVolume.SetUp succeeded for volume "default-token-csfkh"
  Normal  Pulled                 1m    kubelet, k8s-01    Container image "mhausenblas/simpleservice:0.5.0" already present on machine
  Normal  Created                1m    kubelet, k8s-01    Created container
  Normal  Started                1m    kubelet, k8s-01    Started container
  • the Ready condition in the status below shows us that the pod is ready
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
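
The effect of readiness on traffic can be sketched in Python. This is a toy model under stated assumptions: is_ready and routable are hypothetical names, the pod "warming-up" is invented for the example, and the 10-second delay is the value mentioned above.

```python
def is_ready(seconds_since_start, initial_delay_seconds, probe_ok):
    """A container counts as not ready until a readiness probe has run and succeeded."""
    if seconds_since_start < initial_delay_seconds:
        return False  # no probe has run yet, so the container is treated as not ready
    return probe_ok


def routable(pods):
    """Return the names of the pods that should receive traffic (Ready is True)."""
    return [name for name, ready in pods.items() if ready]


pods = {
    "ready": is_ready(12, 10, True),      # past the 10 s delay, probe succeeded
    "warming-up": is_ready(3, 10, True),  # hypothetical pod, still inside the delay
}
print(routable(pods))  # ['ready']
```

Unlike a failing liveness probe, a failing readiness probe does not restart the container; the pod is only left out of the set that receives traffic.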
