HPA (Horizontal Pod Autoscaler) scales Pods horizontally: it automatically increases or decreases the number of Pod replicas by monitoring Pod load.

Literally, it consists of two parts:

  • Monitoring Pod load
  • Controlling the number of Pod replicas

So how does it work? The following walks through the Kubernetes 1.17 source code to analyze how HPA is implemented.

Note: The code in this article has been simplified from the source: comments, serialization tags, and so on have been removed, or only the core code has been kept, with new comments added.

Resources

HPA’s resource is the HorizontalPodAutoscaler. In v1 it only supports CPU-based calculations; in v2beta2, memory-based and custom metrics are added.

v1

//staging/src/k8s.io/api/autoscaling/v1/types.go
type HorizontalPodAutoscaler struct {
	metav1.TypeMeta 
	metav1.ObjectMeta 
	Spec HorizontalPodAutoscalerSpec 
	Status HorizontalPodAutoscalerStatus 
}
type HorizontalPodAutoscalerSpec struct {
	ScaleTargetRef CrossVersionObjectReference // Target resource to monitor
	MinReplicas *int32 // Minimum number of replicas
	MaxReplicas int32  // Maximum number of replicas
	TargetCPUUtilizationPercentage *int32  // The target CPU utilization that triggers scaling
}
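As a rough illustration (this is not from the Kubernetes source; the Deployment name, namespace, and all numbers are made up), an autoscaling/v1 object built from these types could look like this:

import (
	autoscalingv1 "k8s.io/api/autoscaling/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newExampleHPA builds a hypothetical v1 HPA: keep the Deployment "my-deployment"
// between 2 and 10 replicas, targeting 80% average CPU utilization.
func newExampleHPA() *autoscalingv1.HorizontalPodAutoscaler {
	minReplicas := int32(2)
	targetCPU := int32(80)
	return &autoscalingv1.HorizontalPodAutoscaler{
		ObjectMeta: metav1.ObjectMeta{Name: "my-hpa", Namespace: "default"},
		Spec: autoscalingv1.HorizontalPodAutoscalerSpec{
			ScaleTargetRef: autoscalingv1.CrossVersionObjectReference{
				APIVersion: "apps/v1",
				Kind:       "Deployment",
				Name:       "my-deployment",
			},
			MinReplicas:                    &minReplicas,
			MaxReplicas:                    10,
			TargetCPUUtilizationPercentage: &targetCPU,
		},
	}
}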

v2

//staging/src/k8s.io/api/autoscaling/v2beta2/types.go
type HorizontalPodAutoscaler struct {
	metav1.TypeMeta 
	metav1.ObjectMeta
	Spec HorizontalPodAutoscalerSpec
	Status HorizontalPodAutoscalerStatus 
}
type HorizontalPodAutoscalerSpec struct {
	ScaleTargetRef CrossVersionObjectReference // Target resource to monitor
	MinReplicas *int32 
	MaxReplicas int32
	Metrics []MetricSpec // List of metrics to scale on; this is where custom metrics come in
}
type MetricSpec struct {
	Type MetricSourceType // Metric source type: Object (based on a Kubernetes object), Pods (per-Pod metrics averaged across Pods), Resource (resource usage such as CPU, as in v1), External (metrics from outside the cluster). The four types correspond to methods on the MetricsClient interface
	Object *ObjectMetricSource  // Metric source of the Object type
	Pods *PodsMetricSource // Metric source of the Pods type
	Resource *ResourceMetricSource  // Metric source of the Resource type
	External *ExternalMetricSource  // Metric source of the External type
}
type ObjectMetricSource struct { 
	DescribedObject CrossVersionObjectReference  // The target object
	Target MetricTarget  // The metric target: a value, an average value, or an average utilization
	Metric MetricIdentifier  // Metric identifier: name and label selector
}
type PodsMetricSource struct { 
	Metric MetricIdentifier 
	Target MetricTarget 
}
type ResourceMetricSource struct {
	Name v1.ResourceName 
	Target MetricTarget 
}
type ExternalMetricSource struct {
	Metric MetricIdentifier
	Target MetricTarget
}
type MetricTarget struct {
	Type MetricTargetType // Type: Utilization, Value, AverageValue
	Value *resource.Quantity
	AverageValue *resource.Quantity 
	AverageUtilization *int32
}
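For illustration only (again not from the source; the 80% target is a made-up value), a Resource-type entry for the Metrics list expressed with these v2beta2 types could look like this:

import (
	autoscalingv2 "k8s.io/api/autoscaling/v2beta2"
	v1 "k8s.io/api/core/v1"
)

// exampleCPUMetricSpec returns a hypothetical Resource-type metric spec:
// scale so that the average CPU utilization across Pods stays around 80%.
func exampleCPUMetricSpec() autoscalingv2.MetricSpec {
	target := int32(80)
	return autoscalingv2.MetricSpec{
		Type: autoscalingv2.ResourceMetricSourceType,
		Resource: &autoscalingv2.ResourceMetricSource{
			Name: v1.ResourceCPU,
			Target: autoscalingv2.MetricTarget{
				Type:               autoscalingv2.UtilizationMetricType,
				AverageUtilization: &target,
			},
		},
	}
}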

The controller: HorizontalController

The HorizontalController is registered in the Controller Manager under the key horizontalpodautoscaling; it is the controller that manages HorizontalPodAutoscaler instances.

//cmd/kube-controller-manager/app/controllermanager.go
func NewControllerInitializers(loopMode ControllerLoopMode) map[string]InitFunc {
	...
	controllers["horizontalpodautoscaling"] = startHPAController
	...
}


Obtaining load metrics

Since the number of Pod replicas is calculated from Pod load, there must be a way to obtain the load data, and that way is the MetricsClient.

MetricsClient has two implementations, REST and legacy: restMetricsClient and HeapsterMetricsClient. The former is a REST implementation that supports custom metrics; the latter is the traditional Heapster client (Heapster has been deprecated since version 1.13).

//cmd/kube-controller-manager/app/autoscaling.go
func startHPAController(ctx ControllerContext) (http.Handler, bool, error) {
	if !ctx.AvailableResources[schema.GroupVersionResource{Group: "autoscaling", Version: "v1", Resource: "horizontalpodautoscalers"}] {
		return nil, false, nil
	}

	if ctx.ComponentConfig.HPAController.HorizontalPodAutoscalerUseRESTClients {
		// use the new-style clients if support for custom metrics is enabled
		return startHPAControllerWithRESTClient(ctx)
	}

	return startHPAControllerWithLegacyClient(ctx)
}
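Whichever implementation is chosen, it satisfies the MetricsClient interface, which exposes one method per metric source type. The following is a paraphrase of pkg/controller/podautoscaler/metrics/interfaces.go; the argument lists are simplified from memory and may not match 1.17 exactly:

// Paraphrased; signatures simplified, not copied verbatim from 1.17.
type MetricsClient interface {
	// Resource metrics (CPU/memory) for all Pods matching the selector
	GetResourceMetric(resource v1.ResourceName, namespace string, selector labels.Selector) (PodMetricsInfo, time.Time, error)
	// Custom per-Pod metrics (the Pods source type)
	GetRawMetric(metricName string, namespace string, selector labels.Selector, metricSelector labels.Selector) (PodMetricsInfo, time.Time, error)
	// Metrics describing a single Kubernetes object (the Object source type)
	GetObjectMetric(metricName string, namespace string, objectRef *autoscaling.CrossVersionObjectReference, metricSelector labels.Selector) (int64, time.Time, error)
	// Metrics from outside the cluster (the External source type)
	GetExternalMetric(metricName string, namespace string, selector labels.Selector) ([]int64, time.Time, error)
}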

Controller logic: HorizontalController#Run()

//pkg/controller/podautoscaler/horizontal.go
func (a *HorizontalController) Run(stopCh <-chan struct{}) {
	defer utilruntime.HandleCrash()
	defer a.queue.ShutDown()

	klog.Infof("Starting HPA controller")
	defer klog.Infof("Shutting down HPA controller")

	// Wait for the informers to finish syncing HorizontalPodAutoscaler related events
	if !cache.WaitForNamedCacheSync("HPA", stopCh, a.hpaListerSynced, a.podListerSynced) {
		return
	}

	// start a single worker (we may wish to start more in the future)
	// Execute the worker logic until the exit command is received
	go wait.Until(a.worker, time.Second, stopCh)

	<-stopCh
}

The core of the worker is to take a key (in the format namespace/name) from the work queue and reconcile it. (Reconcile is at the heart of Kubernetes; I prefer to think of it as "adjusting": the actual state of an instance is adjusted toward its desired state. Here, for each event on an HPA instance, specific logic adjusts the replica count of the target resource.)

//pkg/controller/podautoscaler/horizontal.go
func (a *HorizontalController) worker() {
	for a.processNextWorkItem() {
	}
	klog.Infof("horizontal pod autoscaler controller worker shutting down")
}

func (a *HorizontalController) processNextWorkItem() bool {
	key, quit := a.queue.Get()
	if quit {
		return false
	}
	defer a.queue.Done(key)

	deleted, err := a.reconcileKey(key.(string))
	if err != nil {
		utilruntime.HandleError(err)
	}

	// Re-enqueue the key (rate limited) unless the HPA has been deleted
	if !deleted {
		a.queue.AddRateLimited(key)
	}

	return true
}
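The rate-limited re-enqueue at the end of processNextWorkItem is what makes reconciliation periodic: when the controller is constructed, its queue is built with a fixed-interval rate limiter derived from the sync period. Paraphrased from NewHorizontalController (names recalled from memory, not copied verbatim):

// NewDefaultHPARateLimiter wraps a fixed-interval limiter, so AddRateLimited
// effectively re-queues each HPA key once per resyncPeriod.
hpaController := &HorizontalController{
	...
	queue: workqueue.NewNamedRateLimitingQueue(NewDefaultHPARateLimiter(resyncPeriod), "horizontalpodautoscaler"),
	...
}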

The call stack for reconciling a key: HorizontalController#reconcileKey -> HorizontalController#reconcileAutoscaler -> HorizontalController#computeReplicasForMetrics -> ScaleInterface#Update

The HorizontalPodAutoscaler instance for the key is fetched from the informer cache. The information in the instance is then used to check the Pod load of the target resource and its current replica count, from which the desired number of Pod replicas is derived. Finally, the Scale API is used to adjust the replica count, and the reason for the adjustment, the calculation results, and so on are written to the Conditions of the HorizontalPodAutoscaler instance.
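Condensed into a few lines, and with the caveat that this is a paraphrase rather than the real function body (reconcileSketch is a made-up name; error handling, min/max clamping, conditions, and the actual ScaleTargetRef resolution via the RESTMapper are omitted, and the Scale client signatures are recalled from memory), the flow looks roughly like this:

// A heavily condensed paraphrase of HorizontalController#reconcileAutoscaler.
func (a *HorizontalController) reconcileSketch(hpa *autoscalingv2.HorizontalPodAutoscaler) error {
	// In the real code the GroupResource is resolved from hpa.Spec.ScaleTargetRef via the RESTMapper.
	targetGR := schema.GroupResource{Group: "apps", Resource: "deployments"}

	// 1. Read the current scale of the target resource through the Scale API.
	scale, err := a.scaleNamespacer.Scales(hpa.Namespace).Get(targetGR, hpa.Spec.ScaleTargetRef.Name)
	if err != nil {
		return err
	}

	// 2. Compute the desired replica count from the configured metrics.
	desiredReplicas, _, _, _, err := a.computeReplicasForMetrics(hpa, scale, hpa.Spec.Metrics)
	if err != nil {
		return err
	}

	// 3. Write the new replica count back through the Scale subresource.
	if desiredReplicas != scale.Spec.Replicas {
		scale.Spec.Replicas = desiredReplicas
		_, err = a.scaleNamespacer.Scales(hpa.Namespace).Update(targetGR, scale)
	}
	return err
}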

Calculating the desired number of replicas

For each metric, a recommended replica count is computed, and the largest of these recommendations becomes the final desired replica count.

//pkg/controller/podautoscaler/horizontal.go
func (a *HorizontalController) computeReplicasForMetrics(hpa *autoscalingv2.HorizontalPodAutoscaler, scale *autoscalingv1.Scale,
	metricSpecs []autoscalingv2.MetricSpec) (replicas int32, metric string, statuses []autoscalingv2.MetricStatus, timestamp time.Time, err error) {
	...
	for i, metricSpec := range metricSpecs {
		replicaCountProposal, metricNameProposal, timestampProposal, condition, err := a.computeReplicasForMetric(hpa, metricSpec, specReplicas, statusReplicas, selector, &statuses[i])

		if err != nil {
			if invalidMetricsCount <= 0 {
				invalidMetricCondition = condition
				invalidMetricError = err
			}
			invalidMetricsCount++
		}
		// Keep the largest proposal among all valid metrics
		if err == nil && (replicas == 0 || replicaCountProposal > replicas) {
			timestamp = timestampProposal
			replicas = replicaCountProposal
			metric = metricNameProposal
		}
	}
	...
}
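For Resource and Pods metrics, each per-metric recommendation follows the ratio formula documented for the HPA: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). Below is a minimal sketch of that calculation; the real replica calculator (pkg/controller/podautoscaler/replica_calculator.go) additionally applies a tolerance (default 0.1), skips not-yet-ready Pods, and compensates for Pods with missing metrics:

import "math"

// estimateReplicas is a simplified version of the HPA ratio calculation:
//   desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)
func estimateReplicas(currentReplicas int32, currentValue, targetValue float64) int32 {
	usageRatio := currentValue / targetValue
	return int32(math.Ceil(usageRatio * float64(currentReplicas)))
}

// Example: 4 replicas at 90% average CPU against an 80% target -> ceil(4 * 1.125) = 5.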

A method such as HorizontalController#computeStatusForObjectMetric (note the method name has no trailing "s") then uses the MetricsClient to obtain the value of the specified metric.

The details of this process can be explored further, but the above is enough to understand how HPA is implemented.

This article is also published on the WeChat official account "Cloud Native Refers to North".