Autoscaling a Cluster

Using cluster autoscaler

Enabling or disabling cluster autoscaling may cause the Kubernetes master to restart, which takes several minutes to complete. In clusters whose masters run Kubernetes versions before 1.10.3, any change you make to the cluster autoscaler configuration causes the master to restart. In clusters whose masters run Kubernetes versions 1.10.3 and later, once autoscaling is enabled for at least one node pool, further changes to the cluster autoscaler configuration do not cause a master restart until autoscaling is disabled for the last node pool. However, it might still take up to one minute for changes to propagate after the operation completes.
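For reference, autoscaling can also be enabled (or its limits changed) on an existing node pool with gcloud container clusters update. A minimal sketch, using the cluster name, zone, and default node pool from the example below:

gcloud container clusters update admatic-cluster --enable-autoscaling \
    --min-nodes 1 --max-nodes 5 --zone us-central1-f --node-pool default-pool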

Creating a cluster with autoscaling

The following command creates a cluster of size 3, with node autoscaling based on cluster load that scales the default node pool to a maximum of 5 nodes and a minimum of 1 node:

gcloud container clusters create admatic-cluster --num-nodes 3 \
    --enable-autoscaling --min-nodes 1 --max-nodes 5 --zone us-central1-f

In this command:

  • --enable-autoscaling indicates that autoscaling is enabled.
  • --min-nodes specifies the minimum number of nodes for the default node pool.
  • --max-nodes specifies the maximum number of nodes for the default node pool.
  • --zone specifies the [compute zone] in which the autoscaler should create new nodes.
WARNING: Starting in 1.12, new clusters will have basic authentication disabled by default. Basic authentication can be enabled (or disabled) manually using the `--[no-]enable-basic-auth` flag.
WARNING: Starting in 1.12, new clusters will not have a client certificate issued. You can manually enable (or disable) the issuance of the client certificate using the `--[no-]issue-client-certificate` flag.
WARNING: Currently VPC-native is not the default mode during cluster creation. In the future, this will become the default mode and can be disabled using `--no-enable-ip-alias` flag. Use `--[no-]enable-ip-alias` flag to suppress this warning.
WARNING: Starting in 1.12, default node pools in new clusters will have their legacy Compute Engine instance metadata endpoints disabled by default. To create a cluster with legacy instance metadata endpoints disabled in the default node pool, run `clusters create` with the flag `--metadata disable-legacy-endpoints=true`.
This will enable the autorepair feature for nodes. Please see https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more information on node autorepairs.
WARNING: Starting in Kubernetes v1.10, new clusters will no longer get compute-rw and storage-ro scopes added to what is specified in --scopes (though the latter will remain included in the default --scopes). To use these scopes, add them explicitly to --scopes. To use the new behavior, set container/new_scopes_behavior property (gcloud config set container/new_scopes_behavior true).
Creating cluster admatic-cluster in us-central1-f...done.
Created [https://container.googleapis.com/v1/projects/espblufi-android/zones/us-central1-f/clusters/admatic-cluster].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1-f/admatic-cluster?project=espblufi-android
kubeconfig entry generated for admatic-cluster.
NAME             LOCATION       MASTER_VERSION  MASTER_IP        MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
admatic-cluster  us-central1-f  1.10.9-gke.5    104.154.181.181  n1-standard-1  1.10.9-gke.5  3          RUNNING
kubectl get nodes -o wide
NAME                                             STATUS    ROLES     AGE       VERSION         EXTERNAL-IP     OS-IMAGE                             KERNEL-VERSION   CONTAINER-RUNTIME
gke-admatic-cluster-default-pool-6eeaddf8-2r46   Ready     <none>    4h        v1.10.9-gke.5   35.239.172.38   Container-Optimized OS from Google   4.14.65+         docker://17.3.2
gke-admatic-cluster-default-pool-6eeaddf8-7pv8   Ready     <none>    4h        v1.10.9-gke.5   35.232.9.153    Container-Optimized OS from Google   4.14.65+         docker://17.3.2
gke-admatic-cluster-default-pool-6eeaddf8-wzz8   Ready     <none>    4h        v1.10.9-gke.5   35.226.19.1     Container-Optimized OS from Google   4.14.65+         docker://17.3.2
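
To confirm that autoscaling is active on the default node pool, you can inspect the cluster description; the grep is only a convenience to trim the output, and assumes the autoscaling block appears under each node pool in the describe output:

gcloud container clusters describe admatic-cluster --zone us-central1-f | grep -A 3 autoscaling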

Horizontal Pod Autoscaler Walkthrough

Run & expose php-apache server

To demonstrate the Horizontal Pod Autoscaler, we will use a custom Docker image based on the php-apache image. The Dockerfile has the following content:

FROM php:5-apache
ADD index.php /var/www/html/index.php
RUN chmod a+rx index.php

It defines an index.php page that performs some CPU-intensive computations:

<?php
  $x = 0.0001;
  for ($i = 0; $i <= 1000000; $i++) {
    $x += sqrt($x);
  }
  echo "OK!";
?>
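
The next step uses the prebuilt k8s.gcr.io/hpa-example image, which packages this Dockerfile and page. If you would rather build and host the image yourself, a minimal sketch follows (the registry path is a placeholder you must replace, and the two files above must be in the current directory):

docker build -t gcr.io/YOUR_PROJECT_ID/hpa-example .
docker push gcr.io/YOUR_PROJECT_ID/hpa-example

You would then pass that image to --image in the kubectl run command below.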

First, we will start a deployment running the image and expose it as a service:

kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80
service "php-apache" created
deployment.apps "php-apache" created
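
At this point both the deployment and the matching service should exist; you can confirm with:

kubectl get deployment,service php-apache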

Create Horizontal Pod Autoscaler

Now that the server is running, we will create the autoscaler using kubectl autoscale. The following command will create a Horizontal Pod Autoscaler that maintains between 1 and 10 replicas of the Pods controlled by the php-apache deployment we created in the first step of these instructions. Roughly speaking, HPA will increase and decrease the number of replicas (via the deployment) to maintain an average CPU utilization across all Pods of 50% (since each pod requests 200 milli-cores by kubectl run, this means average CPU usage of 100 milli-cores).

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
deployment.apps "php-apache" autoscaled

We can check the current status of the autoscaler by running:

kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          2m

Please note that the current CPU consumption is 0%, as we are not sending any requests to the server (the TARGETS column shows the average CPU utilization across all the pods controlled by the corresponding deployment, compared against the 50% target).
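
For more detail than kubectl get hpa provides, such as the autoscaler's conditions and recent scaling events, you can also describe it (output omitted here):

kubectl describe hpa php-apache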

Increase load

Now we will see how the autoscaler reacts to increased load. We will start a different pod to act as a client, open a shell inside it, and send an infinite loop of queries to the php-apache service from that shell:

kubectl run -i --tty load-generator --image=busybox /bin/sh
while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

Within a minute or so, we should see the higher CPU load by executing:

kubectl get hpa
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   104%/50%   1         10        1          3m
kubectl get hpa
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   415%/50%   1         10        3          4m

Here, CPU consumption has increased to 415% of the request. As a result, the deployment was resized to 6 replicas:

kubectl get deployment php-apache
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
php-apache   6         6         6            6           12m

It may take a few minutes for the number of replicas to stabilize. Since the amount of load is not controlled in any way, the final number of replicas may differ from this example.
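
If you prefer to follow the scaling activity continuously instead of polling, kubectl get supports a watch flag:

kubectl get hpa php-apache --watch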

kubectl get nodes -o wide
NAME                                             STATUS    ROLES     AGE       VERSION         EXTERNAL-IP     OS-IMAGE                             KERNEL-VERSION   CONTAINER-RUNTIME
gke-admatic-cluster-default-pool-6eeaddf8-16m4   Ready     <none>    1m        v1.10.9-gke.5   35.238.43.180   Container-Optimized OS from Google   4.14.65+         docker://17.3.2
gke-admatic-cluster-default-pool-6eeaddf8-2r46   Ready     <none>    4h        v1.10.9-gke.5   35.239.172.38   Container-Optimized OS from Google   4.14.65+         docker://17.3.2
gke-admatic-cluster-default-pool-6eeaddf8-7pv8   Ready     <none>    4h        v1.10.9-gke.5   35.232.9.153    Container-Optimized OS from Google   4.14.65+         docker://17.3.2
gke-admatic-cluster-default-pool-6eeaddf8-9s1c   Ready     <none>    5m        v1.10.9-gke.5   35.238.189.60   Container-Optimized OS from Google   4.14.65+         docker://17.3.2
gke-admatic-cluster-default-pool-6eeaddf8-wzz8   Ready     <none>    4h        v1.10.9-gke.5   35.226.19.1     Container-Optimized OS from Google   4.14.65+         docker://17.3.2

The cluster has autoscaled to the specified maximum of 5 nodes.
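
To see how the php-apache replicas were spread across the new nodes, you can list the pods together with their node assignments; the run=php-apache label is assumed here because kubectl run applied it to the deployment, so adjust the selector if your pods are labelled differently:

kubectl get pods -l run=php-apache -o wide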

Stop load

In the terminal where we created the container with the busybox image, terminate the load generation by typing <Ctrl> + C.

Then, after a minute or so, we will verify the resulting state:

kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          18m
kubectl get deployment php-apache
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
php-apache   1         1         1            1           20m

Here, CPU utilization dropped to 0%, so the HPA scaled the number of replicas back down to 1.

Scaling the cluster back down to its original node count may also take a few minutes:

kubectl get nodes -o wide
NAME                                             STATUS    ROLES     AGE       VERSION         EXTERNAL-IP     OS-IMAGE                             KERNEL-VERSION   CONTAINER-RUNTIME
gke-admatic-cluster-default-pool-6eeaddf8-2r46   Ready     <none>    6h        v1.10.9-gke.5   35.239.172.38   Container-Optimized OS from Google   4.14.65+         docker://17.3.2
gke-admatic-cluster-default-pool-6eeaddf8-7pv8   Ready     <none>    6h        v1.10.9-gke.5   35.232.9.153    Container-Optimized OS from Google   4.14.65+         docker://17.3.2
gke-admatic-cluster-default-pool-6eeaddf8-wzz8   Ready     <none>    6h        v1.10.9-gke.5   35.226.19.1     Container-Optimized OS from Google   4.14.65+         docker://17.3.2
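
Once you are done experimenting, the example cluster can be deleted to avoid further charges:

gcloud container clusters delete admatic-cluster --zone us-central1-f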

Autoscaling on multiple metrics and custom metrics
