Author: Juan Ignacio Giro

Editor’s note

Many Kubernetes users, especially those at the enterprise level, quickly encounter the need to scale their environments automatically. Fortunately, the Kubernetes Horizontal Pod Autoscaler (HPA) allows you to configure your deployments to scale horizontally in a variety of ways. One of the biggest advantages of Kubernetes autoscaling is that the cluster can track the load capacity of existing pods and calculate whether more pods are needed.

Kubernetes Autoscaling

Kubernetes has two built-in layers of scalability, and coordinating them is the key to fully exploiting Kubernetes autoscaling:

  1. Pod-level autoscaling: the Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA), both of which adjust the resources available to your containers.
  2. Cluster-level autoscaling: the Cluster Autoscaler (CA) manages this plane of scalability by scaling the number of nodes in the cluster up or down as necessary.

Details of Kubernetes Autoscaling

Horizontal Pod Autoscaler (HPA)

The HPA scales the number of Pod replicas in a deployment. Scaling up or down is typically triggered by CPU or memory utilization, but the HPA can also be configured to scale pods based on various external and custom metrics (metrics.k8s.io, external.metrics.k8s.io, and custom.metrics.k8s.io).
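For illustration, a minimal HPA object (using the stable autoscaling/v1 API) that targets 50% average CPU for a Deployment named php-apache, matching the sample app used in test case #1 below; the exact thresholds are assumptions:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:          # the workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 2           # never scale below this
  maxReplicas: 25          # never scale above this
  targetCPUUtilizationPercentage: 50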

Vertical Pod Autoscaler (VPA)

Primarily used for stateful services, the VPA adds CPU or memory to pods as needed (it also works with stateless pods). To apply these changes, the VPA restarts the pod with the updated CPU and memory resources, and it can be configured to trigger in response to an OOM (out of memory) event. When restarting pods, the VPA respects the Pod Disruption Budget (PDB) so that a minimum number of pods stays available, and you can set minimum and maximum bounds for the resources it allocates.
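As an illustrative sketch only (the VPA is installed separately and its API group has changed between releases, so the apiVersion and the my-app names below are assumptions), a VPA with minimum and maximum resource bounds might look like this:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # hypothetical workload name
  updatePolicy:
    updateMode: "Auto"        # VPA restarts pods to apply new requests
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:           # lower bound for VPA recommendations
          cpu: 100m
          memory: 128Mi
        maxAllowed:           # upper bound for VPA recommendations
          cpu: "1"
          memory: 1Gi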

Cluster Autoscaler (CA)

The second layer of automatic scaling involves the CA, which automatically resizes the cluster in the following cases:

  • Pods fail to run and enter the Pending state due to insufficient capacity in the cluster (in which case the CA scales up).
  • Nodes in the cluster have been underutilized for some time and their pods can be migrated to other nodes (in which case the CA scales down).

The CA performs routine checks to determine whether any pods are in the Pending state waiting for additional resources, or whether cluster nodes are underutilized. If more resources are required, the number of cluster nodes is adjusted accordingly: the CA interacts with the cloud provider to request additional nodes or to shut down idle ones, and it ensures that the scaled cluster stays within user-set limits. It works with AWS, Azure, and GCP.

Five steps to use the HPA and CA with Amazon EKS

This article provides a step-by-step guide to installing and running autoscaling with the HPA and the CA on Amazon Elastic Container Service for Kubernetes (Amazon EKS), followed by two sample test cases.

Cluster requirements

  • An Amazon VPC and a dedicated security group that meet the EKS cluster requirements. Alternatively, AWS provides CloudFormation templates (CloudFormation YAML files) that create the VPC and EKS prerequisites for you, avoiding manual VPC creation.
  • An EKS service role to apply to the cluster.

1. Create an AWS EKS cluster (control plane and worker nodes) by following the official guide. Once the worker nodes are launched as part of an Auto Scaling group, they automatically register with the EKS cluster, and you can start deploying Kubernetes applications.

2. Deploy the Metrics Server so that the HPA can scale the number of Pod replicas based on the CPU and memory data provided through the API. The metrics.k8s.io API is typically served by the Metrics Server, which collects CPU and memory metrics from the kubelet Summary API.
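For example (the manifest URL below points at the current upstream Metrics Server releases and is an assumption; check the project documentation for the manifest matching your Kubernetes version):

$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
$ kubectl top nodes    # verifies that metrics are being served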

3. Apply the following IAM policy to the worker node role created by EKS:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            ],
            "Resource": "*"
        }
    ]
}

4. Deploy the Kubernetes Cluster Autoscaler.

Depending on the Linux distribution your nodes use, you may need to update the certificate path in the deployment file. For example, if you are using the Amazon Linux AMI, you need to replace /etc/ssl/certs/ca-certificates.crt with /etc/ssl/certs/ca-bundle.crt.

5. Update the CA deployment YAML file so that the CA can find the AWS Auto Scaling group (the k8s.io/cluster-autoscaler/<CLUSTER NAME> tag should contain the actual cluster name), and update the AWS_REGION environment variable.

Add the following tags to the AWS Auto Scaling group so that the Kubernetes Cluster Autoscaler can discover it automatically:

k8s.io/cluster-autoscaler/enabled
k8s.io/cluster-autoscaler/<CLUSTER NAME>
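For reference, the relevant excerpt of the CA deployment might then look like the sketch below; the --node-group-auto-discovery flag is the documented way to match the tags above, while the region value is an assumption:

command:
  - ./cluster-autoscaler
  - --v=4
  - --cloud-provider=aws
  - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<CLUSTER NAME>
env:
  - name: AWS_REGION
    value: us-east-1    # assumed; set this to your cluster's region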

Kubernetes Autoscaling test case #1

Test the Kubernetes HPA and CA features at the same time

Requirements:

  • A running EKS cluster
  • The Metrics Server is installed
  • The Kubernetes Cluster Autoscaler is installed

1. Deploy a test app and create HPA resources for the app deployment.
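For example, following the classic HPA walkthrough (the image and the kubectl run flags are assumptions from the kubectl version of this era, chosen to be consistent with the php-apache output shown below):

$ kubectl run php-apache --image=k8s.gcr.io/hpa-example \
    --requests=cpu=200m --expose --port=80
$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=2 --max=25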

2. Make requests from different geographical locations to increase the load.
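A simple way to generate load is a busybox pod that issues requests in a loop (a minimal sketch):

$ kubectl run -i --tty load-generator --image=busybox /bin/sh
# then, inside the container:
/ # while true; do wget -q -O- http://php-apache; done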

3. The HPA starts scaling the number of pods as the load increases, according to the limits in the HPA resource. At some point, new pods enter the Pending state while waiting for additional resources:

$ kubectl get nodes -w
NAME                             STATUS    AGE       VERSION
ip-192-168-189-29.ec2.internal   Ready     1h        v1.10.3
ip-192-168-200-20.ec2.internal   Ready     1h        v1.10.3
$ kubectl get pods -o wide -w
NAME                          READY     STATUS    RESTARTS   AGE       IP               NODE
php-apache-8699449574-4mg7w   0/1       Pending   0          17m
php-apache-8699449574-64zkm   1/1       Running   0          1h        192.168.210.90   ip-192-168-200-20
php-apache-8699449574-8nqwk   0/1       Pending   0          17m
php-apache-8699449574-cl8lj   1/1       Running   0          27m       192.168.172.71   ip-192-168-189-29
php-apache-8699449574-cpzdn   1/1       Running   0          17m       192.168.219.71   ip-192-168-200-20
php-apache-8699449574-dn9tb   0/1       Pending   0          17m
...

4. The CA detects pods stuck in the Pending state due to insufficient capacity and adjusts the size of the AWS Auto Scaling group. A new node is added:

$ kubectl get nodes -w
NAME                             STATUS    AGE       VERSION
ip-192-168-189-29.ec2.internal   Ready     2h        v1.10.3
ip-192-168-200-20.ec2.internal   Ready     2h        v1.10.3
ip-192-168-92-187.ec2.internal   Ready     34s       v1.10.3

5. The pending pods are scheduled onto the new node. The average CPU utilization is now below the specified target, so there is no need to schedule additional pods:

$ kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   40%/50%   2         25        20         1h
$ kubectl get pods -o wide -w
NAME                          READY     STATUS    RESTARTS   AGE       IP                NODE
php-apache-8699449574-4mg7w   1/1       Running   0          25m       192.168.74.4      ip-192-168-92-187
php-apache-8699449574-64zkm   1/1       Running   0          1h        192.168.210.90    ip-192-168-200-20
php-apache-8699449574-8nqwk   1/1       Running   0          25m       192.168.127.85    ip-192-168-92-187
php-apache-8699449574-cl8lj   1/1       Running   0          35m       192.168.172.71    ip-192-168-189-29
...

6. Close several of the load-generating terminals to stop some of the load.

7. The average CPU utilization decreases, so the HPA starts reducing the number of pod replicas in the deployment and terminating some pods:

$ kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   47%/50%   2         20        7          1h

$ kubectl get pods -o wide -w
NAME                          READY     STATUS        RESTARTS   AGE       IP                NODE
...
php-apache-8699449574-v5kwf   1/1       Running       0          36m       192.168.250.0     ip-192-168-200-20
php-apache-8699449574-vl4zj   1/1       Running       0          36m       192.168.242.153   ip-192-168-200-20
php-apache-8699449574-8nqwk   1/1       Terminating   0          26m       192.168.127.85    ip-192-168-92-187
php-apache-8699449574-dn9tb   1/1       Terminating   0          26m       192.168.124.108   ip-192-168-92-187
php-apache-8699449574-k5ngv   1/1       Terminating   0          26m       192.168.108.58    ip-192-168-92-187

8. The CA detects that one node is underutilized and that its running pods can be scheduled onto other nodes:

$ kubectl get nodes
NAME                             STATUS    AGE       VERSION
ip-192-168-189-29.ec2.internal   Ready     2h        v1.10.3
ip-192-168-200-20.ec2.internal   Ready     2h        v1.10.3
ip-192-168-92-187.ec2.internal   Ready     2h        v1.10.3

$ kubectl get nodes
NAME                             STATUS    AGE       VERSION
ip-192-168-189-29.ec2.internal   Ready     2h        v1.10.3
ip-192-168-200-20.ec2.internal   Ready     2h        v1.10.3

9. There should be no noticeable timeouts in the load-generating terminals while the cluster scales down.

Kubernetes Autoscaling test case #2

Test whether the CA automatically resizes the cluster when there is not enough CPU capacity to schedule a pod

Requirements:

  • A running AWS EKS cluster
  • The Kubernetes Cluster Autoscaler is installed

1. Create two deployments that each request only a fraction of a vCPU:

$ kubectl run nginx --image=nginx:latest --requests=cpu=200m
$ kubectl run nginx2 --image=nginx:latest --requests=cpu=200m

2. Create a new deployment that requests more resources than the remaining free CPU:

$ kubectl run nginx3 --image=nginx:latest --requests=cpu=1

3. The new pod stays in the Pending state because no resources are available:

$ kubectl get pods -w
NAME                      READY     STATUS    RESTARTS   AGE
nginx-5fcb54784c-lcfht    1/1       Running   0          13m
nginx2-66667bf959-2fmlr   1/1       Running   0          3m
nginx3-564b575974-xcm5t   0/1       Pending   0          41s

Describing the pod shows an event indicating that there is not enough CPU:

$ kubectl describe pod nginx3-564b575974-xcm5t
...
Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  32s (x7 over 1m)  default-scheduler  0/1 nodes are available: 1 Insufficient cpu

4. The CA automatically adjusts the cluster size and adds a new node:

$ kubectl get nodes
NAME                              STATUS    AGE       VERSION
ip-192-168-142-174.ec2.internal   Ready     1m        v1.10.3   <<
ip-192-168-82-136.ec2.internal    Ready     1h        v1.10.3

5. The cluster now has enough resources to run the pods:

$ kubectl get pods
NAME                      READY     STATUS    RESTARTS   AGE
nginx-5fcb54784c-lcfht    1/1       Running   0          48m
nginx2-66667bf959-2fmlr   1/1       Running   0          37m
nginx3-564b575974-xcm5t   1/1       Running   0          35m

6. Delete two of the deployments. After a while, the CA detects that a node in the cluster is underutilized and that the running pods can be relocated to another node. The AWS Auto Scaling group is updated, and the number of nodes decreases by one.
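For example, deleting nginx2 and nginx3 (consistent with the output below, which leaves only the first nginx pod running):

$ kubectl delete deployment nginx2 nginx3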

$ kubectl get nodes
NAME                             STATUS    AGE       VERSION
ip-192-168-82-136.ec2.internal   Ready     1h        v1.10.3

$ kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP               NODE
nginx-5fcb54784c-lcfht   1/1       Running   0          1h        192.168.98.139   ip-192-168-82-136

Steps to clean up the environment:

  1. Delete the custom policy added to the EKS worker node role (see the sketch after this list)
  2. Follow this guide to delete the cluster
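For step 1, a sketch using the AWS CLI, reusing the placeholder role and policy names from step 3 of the installation above:

$ aws iam delete-role-policy --role-name <worker-node-role> \
    --policy-name ClusterAutoscalerPolicy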

References

For more information on Kubernetes autoscaling, read Stefan Prodan's Kubernetes Horizontal Pod Autoscaler with Prometheus Custom Metrics.


About ServiceMesher community

The ServiceMesher community was launched in April 2018 by a group of volunteers who share the same values and ideals.

Community focus areas: containers, microservices, Service Mesh, and Serverless; embracing open source and cloud native; dedicated to promoting the vigorous development of Service Mesh in China.

Community official website: www.servicemesher.com