StatefulSet

StatefulSet is the workload API object used to manage stateful applications.

Creating a StatefulSet

cat << EOF > web.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF

kubectl create -f web.yaml
service "nginx" created
statefulset.apps "web" created

The command above creates two Pods, each running an NGINX webserver.

kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
web-0     1/1       Running   0          1m
web-1     1/1       Running   0          48s

kubectl get service nginx
NAME      TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
nginx     ClusterIP   None         <none>        80/TCP    2m

kubectl get statefulset
NAME      DESIRED   CURRENT   AGE
web       2         2         2m

Ordered Pod Creation

For a StatefulSet with N replicas, when Pods are being deployed, they are created sequentially, in order from {0..N-1}.

kubectl get pods -w -l app=nginx
NAME      READY     STATUS    RESTARTS   AGE
web-0     0/1       Pending   0          0s
web-0     0/1       Pending   0         0s
web-0     0/1       ContainerCreating   0         0s
web-0     1/1       Running   0         19s
web-1     0/1       Pending   0         0s
web-1     0/1       Pending   0         0s
web-1     0/1       ContainerCreating   0         0s
web-1     1/1       Running   0         18s

Notice that the web-1 Pod is not launched until the web-0 Pod is "Running and Ready".

Pods in a StatefulSet

Examining the Pod’s Ordinal Index

Pods in a StatefulSet have a sticky, unique identity. This identity is based on a unique ordinal index that is assigned to each Pod by the StatefulSet controller.

kubectl get pods -l app=nginx
NAME      READY     STATUS    RESTARTS   AGE
web-0     1/1       Running   0          6m
web-1     1/1       Running   0          5m

The Pods' names take the form <statefulset name>-<ordinal index>. Since the web StatefulSet has two replicas, it creates two Pods, web-0 and web-1.
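
The naming rule can be sketched in plain shell without touching a cluster. The helper name statefulset_pod_names is hypothetical and used only for illustration; it simply applies the <statefulset name>-<ordinal index> pattern described above.

```shell
# Hypothetical helper (not part of kubectl): print the Pod names a
# StatefulSet with the given name and replica count is expected to use.
# Usage: statefulset_pod_names <statefulset-name> <replicas>
statefulset_pod_names() {
  sts_name=$1
  sts_replicas=$2
  ordinal=0
  # Ordinals run from 0 to replicas-1.
  while [ "$ordinal" -lt "$sts_replicas" ]; do
    echo "${sts_name}-${ordinal}"
    ordinal=$((ordinal + 1))
  done
}

statefulset_pod_names web 2
```

For the web StatefulSet with two replicas, this prints web-0 and web-1, one per line.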

Using Stable Network Identities

Each Pod has a stable hostname based on its ordinal index.

for i in 0 1; do kubectl exec web-$i -- sh -c 'hostname'; done
web-0
web-1

Using nslookup on the Pods' hostnames, you can examine their in-cluster DNS addresses.

kubectl run -i --tty --image busybox:1.28 dns-test --restart=Never --rm /bin/sh
If you don't see a command prompt, try pressing enter.

nslookup web-0.nginx
Server:    100.64.0.10
Address 1: 100.64.0.10 kube-dns.kube-system.svc.cluster.local

Name:      web-0.nginx
Address 1: 100.96.1.3 web-0.nginx.default.svc.cluster.local

nslookup web-1.nginx
Server:    100.64.0.10
Address 1: 100.64.0.10 kube-dns.kube-system.svc.cluster.local

Name:      web-1.nginx
Address 1: 100.96.2.4 web-1.nginx.default.svc.cluster.local

The CNAME of the headless service points to SRV records (one for each Pod that is Running and Ready). The SRV records point to A record entries that contain the Pods’ IP addresses.

In one terminal, watch the StatefulSet’s Pods.

kubectl get pod -w -l app=nginx

In a second terminal, delete all the Pods in the StatefulSet.

kubectl delete pod -l app=nginx
pod "web-0" deleted
pod "web-1" deleted

Wait for the StatefulSet to restart them, and for both Pods to transition to "Running and Ready".

for i in 0 1; do kubectl exec web-$i -- sh -c 'hostname'; done
web-0
web-1

kubectl run -i --tty --image busybox:1.28 dns-test --restart=Never --rm /bin/sh
If you don't see a command prompt, try pressing enter.

nslookup web-0.nginx
Server:    100.64.0.10
Address 1: 100.64.0.10 kube-dns.kube-system.svc.cluster.local

Name:      web-0.nginx
Address 1: 100.96.1.5 web-0.nginx.default.svc.cluster.local

nslookup web-1.nginx
Server:    100.64.0.10
Address 1: 100.64.0.10 kube-dns.kube-system.svc.cluster.local

Name:      web-1.nginx
Address 1: 100.96.2.5 web-1.nginx.default.svc.cluster.local

The Pods’ ordinals, hostnames, SRV records, and A record names have not changed, but the IP addresses associated with the Pods may have changed. In the cluster used for this tutorial, they have. This is why it is important not to configure other applications to connect to Pods in a StatefulSet by IP address.
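
Since the DNS names are the stable part, other applications can build the name from values that do not change. The following is a minimal sketch; pod_fqdn is a hypothetical helper, and the default namespace ("default") and cluster domain ("cluster.local") are assumptions taken from the nslookup output in this tutorial. Your cluster may use a different domain.

```shell
# Hypothetical helper: build the in-cluster DNS name of a StatefulSet Pod.
# Usage: pod_fqdn <pod-name> <headless-service> [namespace] [cluster-domain]
# Assumed defaults: namespace "default", cluster domain "cluster.local".
pod_fqdn() {
  echo "$1.$2.${3:-default}.svc.${4:-cluster.local}"
}

pod_fqdn web-0 nginx
pod_fqdn web-1 nginx
```

The first call prints web-0.nginx.default.svc.cluster.local, the same name that nslookup reported for web-0 before and after the Pods were deleted.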

Writing to Stable Storage

kubectl get pvc
NAME        STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
www-web-0   Bound     pvc-f10d556e-74b3-11e8-a3fe-025d0d2462b2   1Gi        RWO            gp2            16m
www-web-1   Bound     pvc-0a9a0129-74b4-11e8-a3fe-025d0d2462b2   1Gi        RWO            gp2            16m

The StatefulSet controller created two PersistentVolumeClaims that are bound to two PersistentVolumes. As the cluster used in this tutorial is configured to dynamically provision PersistentVolumes, the PersistentVolumes were created and bound automatically.
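
The claim names in the output above follow the pattern <volumeClaimTemplate name>-<statefulset name>-<ordinal index>. As a small sketch, the hypothetical helper below (pvc_name is an illustrative name, not a kubectl feature) reproduces that pattern so you can predict which claim belongs to which Pod.

```shell
# Hypothetical helper: print the PersistentVolumeClaim name for one Pod.
# Usage: pvc_name <claim-template-name> <statefulset-name> <ordinal>
pvc_name() {
  echo "$1-$2-$3"
}

pvc_name www web 0
pvc_name www web 1
```

This prints www-web-0 and www-web-1, matching the NAME column of kubectl get pvc.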

The NGINX webservers, by default, will serve an index file at /usr/share/nginx/html/index.html. The volumeMounts field in the StatefulSet's spec ensures that the /usr/share/nginx/html directory is backed by a PersistentVolume.

Write the Pods' hostnames to their index.html files and verify that the NGINX webservers serve the hostnames.

for i in 0 1; do kubectl exec web-$i -- sh -c 'echo "$(hostname)" > /usr/share/nginx/html/index.html'; done

for i in 0 1; do kubectl exec -it web-$i -- curl localhost; done
web-0
web-1

In one terminal, watch the StatefulSet’s Pods.

kubectl get pod -w -l app=nginx

In a second terminal, delete all of the StatefulSet’s Pods.

kubectl delete pod -l app=nginx
pod "web-0" deleted
pod "web-1" deleted

In the first terminal, wait for all of the Pods to transition to "Running and Ready".

Verify the web servers continue to serve their hostnames.

for i in 0 1; do kubectl exec -it web-$i -- curl localhost; done
web-0
web-1

Even though web-0 and web-1 were rescheduled, they continue to serve their hostnames because the PersistentVolumes associated with their PersistentVolumeClaims are remounted to their volumeMounts. No matter what node web-0 and web-1 are scheduled on, their PersistentVolumes will be mounted to the appropriate mount points.

Scaling a StatefulSet

Scaling Up

kubectl scale sts web --replicas=5
statefulset.apps "web" scaled

The StatefulSet controller scaled the number of replicas. As with StatefulSet creation, the StatefulSet controller created each Pod sequentially with respect to its ordinal index, and it waited for each Pod’s predecessor to be "Running and Ready" before launching the subsequent Pod.

Scaling Down

kubectl patch sts web -p '{"spec":{"replicas":3}}'
statefulset.apps "web" patched

Ordered Pod Termination

The controller deleted one Pod at a time, in reverse order with respect to its ordinal index, and it waited for each to be completely shut down before deleting the next.
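
To make the termination order concrete, here is a minimal sketch that lists which Pods are removed, and in what order, when the replica count is lowered. The helper name scale_down_order is hypothetical; it only models the reverse-ordinal rule and does not talk to a cluster.

```shell
# Hypothetical helper: list the Pods terminated on scale down, in order.
# Usage: scale_down_order <statefulset-name> <current-replicas> <desired-replicas>
scale_down_order() {
  sts_name=$1
  desired=$3
  # Start at the highest ordinal and walk down to the first Pod that stays.
  ordinal=$(($2 - 1))
  while [ "$ordinal" -ge "$desired" ]; do
    echo "${sts_name}-${ordinal}"
    ordinal=$((ordinal - 1))
  done
}

scale_down_order web 5 3
```

For the scale down from five replicas to three, this prints web-4 followed by web-3. The Pods web-0, web-1, and web-2 are left running.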

kubectl get pvc -l app=nginx
NAME        STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
www-web-0   Bound     pvc-f10d556e-74b3-11e8-a3fe-025d0d2462b2   1Gi        RWO            gp2            27m
www-web-1   Bound     pvc-0a9a0129-74b4-11e8-a3fe-025d0d2462b2   1Gi        RWO            gp2            26m
www-web-2   Bound     pvc-2e0366a4-74b7-11e8-a3fe-025d0d2462b2   1Gi        RWO            gp2            3m
www-web-3   Bound     pvc-43950326-74b7-11e8-a3fe-025d0d2462b2   1Gi        RWO            gp2            3m
www-web-4   Bound     pvc-9296990c-74b7-11e8-a3fe-025d0d2462b2   1Gi        RWO            gp2            1m

There are still five PersistentVolumeClaims and five PersistentVolumes. When exploring a Pod’s stable storage, we saw that the PersistentVolumes mounted to the Pods of a StatefulSet are not deleted when the StatefulSet’s Pods are deleted. This is still true when Pod deletion is caused by scaling the StatefulSet down.

Updating StatefulSets (TODO)

There are two valid update strategies, RollingUpdate and OnDelete.

Rolling Update

The RollingUpdate update strategy will update all Pods in a StatefulSet, in reverse ordinal order, while respecting the StatefulSet guarantees.

kubectl patch statefulset web -p '{"spec":{"updateStrategy":{"type":"RollingUpdate"}}}'
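statefulset.apps "web" not patched

The output reports "not patched" because RollingUpdate is already the default update strategy for apps/v1 StatefulSets, so the patch changes nothing.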

On Delete

The OnDelete update strategy implements the legacy (1.6 and prior) behavior. When you select this update strategy, the StatefulSet controller will not automatically update Pods when a modification is made to the StatefulSet’s .spec.template field.

This strategy can be selected by setting .spec.updateStrategy.type to OnDelete.
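
As a sketch of how the strategy could be selected, the snippet below builds the same kind of JSON patch used in the Rolling Update example. The helper name update_strategy_patch is hypothetical. The kubectl line is left as a comment because it assumes the web StatefulSet from this tutorial still exists in your cluster.

```shell
# Hypothetical helper: print a JSON patch that sets the update strategy.
# Usage: update_strategy_patch <RollingUpdate|OnDelete>
update_strategy_patch() {
  printf '{"spec":{"updateStrategy":{"type":"%s"}}}\n' "$1"
}

update_strategy_patch OnDelete

# To apply it (assumes the web StatefulSet exists):
# kubectl patch statefulset web -p "$(update_strategy_patch OnDelete)"
```

The helper prints {"spec":{"updateStrategy":{"type":"OnDelete"}}}, which has the same shape as the patch passed to kubectl patch for RollingUpdate.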

Deleting StatefulSets

StatefulSet supports both Non-Cascading and Cascading deletion. In a Non-Cascading Delete, the StatefulSet’s Pods are not deleted when the StatefulSet is deleted. In a Cascading Delete, both the StatefulSet and its Pods are deleted.

Non-Cascading Delete

kubectl delete statefulset web --cascade=false
statefulset.apps "web" deleted
kubectl get pods -l app=nginx
NAME      READY     STATUS    RESTARTS   AGE
web-0     1/1       Running   0          12m
web-1     1/1       Running   0          12m
web-2     1/1       Running   0          10m

Even though web has been deleted, all of the Pods are still "Running and Ready".

Cascading Delete

Recreate the StatefulSet.

kubectl create -f web.yaml
Error from server (AlreadyExists): error when creating "web.yaml": services "nginx" already exists

Ignore the error. It only indicates that an attempt was made to create the nginx Headless Service even though that Service already exists.

Delete the StatefulSet again. This time, omit the --cascade=false parameter.

kubectl delete statefulset web
statefulset.apps "web" deleted

kubectl get pods
No resources found.

Note that, while a cascading delete will delete the StatefulSet and its Pods, it will not delete the Headless Service associated with the StatefulSet. You must delete the nginx Service manually.

kubectl delete service nginx
service "nginx" deleted

Pod Management Policy

For some distributed systems, the StatefulSet ordering guarantees are unnecessary and/or undesirable. These systems require only uniqueness and identity.

OrderedReady Pod Management

OrderedReady pod management is the default for StatefulSets. It tells the StatefulSet controller to respect the ordering guarantees demonstrated above.

Parallel Pod Management

Parallel pod management tells the StatefulSet controller to launch or terminate all Pods in parallel, and not to wait for Pods to become Running and Ready or completely terminated prior to launching or terminating another Pod.

cat << EOF > webp.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  podManagementPolicy: "Parallel"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF

In one terminal, watch the Pods in the StatefulSet.

kubectl get po -l app=nginx -w

In another terminal, create the StatefulSet and Service in the manifest.

kubectl create -f webp.yaml
service "nginx" created
statefulset.apps "web" created

Examine the output of the kubectl get command that you executed in the first terminal.

NAME      READY     STATUS    RESTARTS   AGE
web-0     0/1       Pending   0          0s
web-0     0/1       Pending   0         0s
web-1     0/1       Pending   0         0s
web-1     0/1       Pending   0         0s
web-0     0/1       ContainerCreating   0         0s
web-1     0/1       ContainerCreating   0         0s
web-0     1/1       Running   0         18s
web-1     1/1       Running   0         19s

The StatefulSet controller launched both web-0 and web-1 at the same time.

Keep the second terminal open and, in another terminal window, scale the StatefulSet.

kubectl scale statefulset/web --replicas=4
statefulset.apps "web" scaled

Examine the output of the terminal where the kubectl get command is running.

web-2     0/1       Pending   0         0s
web-3     0/1       Pending   0         0s
web-2     0/1       Pending   0         0s
web-3     0/1       Pending   0         0s
web-2     0/1       ContainerCreating   0         0s
web-3     0/1       ContainerCreating   0         0s
web-2     1/1       Running   0         18s
web-3     1/1       Running   0         18s

The StatefulSet controller launched two new Pods, and it did not wait for the first to become Running and Ready prior to launching the second.

Keep this terminal open, and in another terminal delete the web StatefulSet.

kubectl delete sts web
statefulset.apps "web" deleted

Again, examine the output of the kubectl get command running in the other terminal.

web-3     1/1       Terminating   0         1m
web-2     1/1       Terminating   0         1m
web-1     1/1       Terminating   0         3m
web-0     1/1       Terminating   0         3m
web-0     0/1       Terminating   0         3m
web-2     0/1       Terminating   0         1m
web-1     0/1       Terminating   0         3m
web-3     0/1       Terminating   0         1m
web-2     0/1       Terminating   0         1m
web-2     0/1       Terminating   0         1m
web-0     0/1       Terminating   0         3m
web-0     0/1       Terminating   0         3m
web-1     0/1       Terminating   0         3m
web-1     0/1       Terminating   0         3m
web-3     0/1       Terminating   0         1m
web-3     0/1       Terminating   0         1m

The StatefulSet controller deletes all Pods concurrently; it does not wait for a Pod’s ordinal successor to terminate prior to deleting that Pod.

Close the terminal where the kubectl get command is running and delete the nginx Service.

kubectl delete svc nginx
service "nginx" deleted

Cleaning up

You will need to delete the persistent storage media for the PersistentVolumes used in this tutorial. Follow the necessary steps, based on your environment, storage configuration, and provisioning method, to ensure that all storage is reclaimed.
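
One possible starting point is to delete the PersistentVolumeClaims left behind by the StatefulSet. The sketch below only prints the commands so you can review them first; pvc_cleanup_commands is a hypothetical helper. Whether deleting a claim also frees the underlying storage depends on the reclaim policy of your PersistentVolumes, which this tutorial does not show, so treat this as an assumption to verify in your environment.

```shell
# Hypothetical helper: print (not run) one delete command per claim.
# Usage: pvc_cleanup_commands <claim-template-name> <statefulset-name> <claim-count>
pvc_cleanup_commands() {
  template=$1
  sts_name=$2
  count=$3
  ordinal=0
  while [ "$ordinal" -lt "$count" ]; do
    echo "kubectl delete pvc ${template}-${sts_name}-${ordinal}"
    ordinal=$((ordinal + 1))
  done
}

# Five claims were created when the StatefulSet was scaled to five replicas.
pvc_cleanup_commands www web 5
```

After checking the printed commands against the output of kubectl get pvc, you can run them by hand or pipe the output to sh.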
