Author | Yi Li, Zhang Wei    Source | Alibaba Cloud Native official account
What are the current trends in the Serverless container industry, and where is it being applied? If Kubernetes had been born on the cloud, how should its architecture be designed? What infrastructure do Serverless containers require? Yi Li, product leader of Alibaba Cloud Container Service, and Zhang Wei, product TL of Alibaba Cloud Serverless Kubernetes, share their key thinking on the Serverless container architecture and the considerations behind it.
From Serverless container to Serverless Kubernetes
Serverless containers are products and technologies that allow users to deploy container applications without purchasing and managing servers.
Serverless containers can greatly improve the agility and elasticity of container application deployment and reduce users' computing costs. Users can focus on their business applications instead of the underlying infrastructure, which greatly improves application development efficiency and reduces O&M costs.
Kubernetes has become the industry's de facto standard for container orchestration, and the cloud native application ecosystem built on it (Helm, Istio, Knative, Kubeflow, Spark on K8s, etc.) makes Kubernetes a de facto cloud operating system. On the one hand, Serverless fundamentally removes the management complexity of K8s, so that users are no longer mired in cluster capacity planning, security maintenance, and fault diagnosis; on the other hand, it further unleashes the power of cloud computing, building security, availability, and scalability into the infrastructure and thereby forming differentiated competitiveness.
1. Industry trends
Gartner predicts that by 2023, 70% of AI tasks will be built on computing models such as containers and Serverless. According to AWS's own research, 40% of new ECS (AWS Elastic Container Service) users in 2019 adopted its Serverless container form, ECS on Fargate.
Serverless containers are one evolution direction of the existing Container as a Service model, and they are complementary to fPaaS/FaaS (Function as a Service). FaaS provides event-driven programming: users implement only a function's processing logic, for example transcoding and watermarking a video as soon as a user uploads it. FaaS is efficient and elastic, but it requires users to change their existing development model to adapt. The delivery vehicle of a Serverless container application is the container image, which, together with the scheduling system, flexibly supports many application types: stateless applications, stateful applications, and computing tasks. A large number of existing applications can therefore be deployed on Serverless containers without modification.
Gartner also notes in the report that industry standards for Serverless containers have not yet been established, so cloud vendors have plenty of room to provide unique value-added capabilities through technological innovation. Its recommendations for cloud vendors are as follows:
- Expand the application scenarios and portfolio of Serverless containers, and migrate more common container workloads to Serverless container services.
- Promote the standardization of Serverless containers to alleviate user concerns about cloud vendor lock-in.
2. Typical scenarios and application value
Since Alibaba Cloud ASK/ECI entered public beta in May 2018, we have been very happy to see the value of Serverless containers gradually recognized by users. Typical application scenarios include:
1) Elastic capacity expansion of online services
ASK supports elastic scaling of online services and can scale out 500 application instances within 30 seconds, easily handling both expected and unexpected traffic bursts. For example, during the epidemic, several online education platforms relied on the strong elasticity of ASK/ECI to handle their business peaks with ease.
2) O&M free Serverless AI platform
The intelligent, operation-free AI application platform developed based on ASK allows developers to create their own algorithmic model development environment, while the platform scales flexibly on demand, greatly reducing the complexity of system maintenance and capacity planning.
3) Serverless big data computing
Serverless big data computing platforms built on ASK run data computing applications such as Serverless Spark and Presto, flexibly meeting the elasticity, strong isolation, and maintenance-free requirements of different business departments during rapid enterprise growth.
Serverless container architecture thinking
Unlike standard K8s, Serverless K8s is deeply integrated with the IaaS infrastructure, enabling public cloud vendors to improve scale, efficiency, and capability through technological innovation. At the architectural level, we divide the Serverless container into two layers: container orchestration and the computing resource pool. Below, we take an in-depth look at each layer and the key thinking behind it.
1. How to slim Kubernetes down
Before slimming Kubernetes down, it is worth understanding why it succeeded. Kubernetes' success in container orchestration is not just due to Google's halo and the CNCF's efforts; behind it lies the distillation and refinement of Google Borg's experience in large-scale distributed resource scheduling and automated operations. A few key technical points:
1) Declarative APIs
Kubernetes uses a declarative API, so developers can focus on the application itself rather than on the details of the system implementation. Different resource types, such as Deployment, StatefulSet, and Job, abstract different types of workloads. For Kubernetes, the "level-triggered" implementation behind the declarative API yields a more robust distributed system than an "edge-triggered" approach, because controllers reconcile against the full current state rather than reacting to individual events.
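To illustrate the level-triggered idea, here is a toy, stdlib-only sketch (not actual Kubernetes code): the reconciler compares desired and observed state on every pass instead of reacting to individual events, so lost or duplicated events cannot corrupt the result.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical desired/observed state: the replica count of one app.
type state struct{ replicas int }

// reconcile is level-triggered: it looks only at the current desired and
// observed state, never at individual events, so missed or duplicated
// events cannot corrupt the outcome.
func reconcile(desired, observed state) state {
	for observed.replicas < desired.replicas {
		observed.replicas++ // create a missing replica
		fmt.Println("created replica, now", observed.replicas)
	}
	for observed.replicas > desired.replicas {
		observed.replicas-- // delete an excess replica
		fmt.Println("deleted replica, now", observed.replicas)
	}
	return observed
}

func main() {
	desired := state{replicas: 3}
	observed := state{replicas: 0}
	// Re-running reconcile on every tick converges to the desired state
	// even if an earlier pass was interrupted or an event was lost.
	for i := 0; i < 3; i++ {
		observed = reconcile(desired, observed)
		time.Sleep(10 * time.Millisecond)
	}
}
```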
2) Scalable architecture
All K8s components interact through a consistent, open API. Third-party developers can also provide domain-specific extensions through mechanisms such as CRDs (Custom Resource Definitions) and Operators, which greatly extends the capabilities of K8s.
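For illustration, this is what such an extension type looks like in Go, modeled on the well-known CronTab example from the Kubernetes documentation; the fields are illustrative, and the snippet assumes a go.mod that pulls in k8s.io/apimachinery.

```go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// CronTabSpec is an illustrative spec for a hypothetical CRD.
type CronTabSpec struct {
	CronSpec string `json:"cronSpec"`
	Image    string `json:"image"`
	Replicas int32  `json:"replicas"`
}

// CronTab follows the standard Kubernetes object shape (TypeMeta +
// ObjectMeta + Spec), which is what lets custom resources plug into the
// same declarative, level-triggered machinery as built-in resources.
type CronTab struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec CronTabSpec `json:"spec"`
}
```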
3) Portability
Through a series of abstractions such as LoadBalancer Services, Ingress, CNI, and CSI, K8s shields business applications from the implementation differences of the underlying infrastructure, allowing them to migrate flexibly.
2. Design principles of Serverless Kubernetes
Serverless Kubernetes must be compatible with the Kubernetes ecosystem, provide the core values of K8s, and be deeply integrated with cloud capabilities.
- Users can directly use the Kubernetes declarative API and deploy existing Kubernetes application definitions (Deployment, StatefulSet, Job, Service, and so on) without modification, as shown in the sketch after this list.
- Full compatibility with Kubernetes' extension mechanisms is essential so that Serverless Kubernetes can support more workloads. In addition, Serverless K8s components strictly follow the K8s state-reconciliation control model.
- The capabilities of Kubernetes should be implemented as much as possible on top of cloud capabilities, such as resource scheduling, load balancing, and service discovery. This fundamentally simplifies the design of the container platform, improves scale, and reduces users' O&M complexity. At the same time, these implementations should be transparent to users to guarantee portability, so that existing applications can be deployed smoothly on Serverless K8s, and so that applications can run in a mix of traditional and Serverless containers.
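To make the compatibility point concrete, below is a minimal sketch that creates a standard Deployment with the stock client-go library; nothing in it is ASK-specific, and the kubeconfig path, names, and image are placeholders.

```go
package main

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func int32Ptr(i int32) *int32 { return &i }

func main() {
	// Point this at the cluster's kubeconfig; no special client is needed
	// because the API surface is standard Kubernetes.
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	labels := map[string]string{"app": "demo"}
	deployment := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: "demo"},
		Spec: appsv1.DeploymentSpec{
			Replicas: int32Ptr(2),
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{
						{Name: "demo", Image: "nginx:1.21"},
					},
				},
			},
		},
	}
	// The same Create call works on a node-based or a nodeless cluster.
	_, err = clientset.AppsV1().Deployments("default").
		Create(context.TODO(), deployment, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
}
```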
3. From Node Centric to Nodeless
Traditional Kubernetes adopts a node-centric architecture: the node is the carrier on which Pods run, the Kubernetes scheduler picks an appropriate node from the worker node pool to run each Pod, and the Kubelet on that node handles Pod lifecycle management and automated operations. When the node pool runs out of resources, both the node pool and the container application must be scaled out.
One of the most important ideas behind Serverless Kubernetes is decoupling the container runtime from the specific node runtime environment. Users then no longer need to care about node operations and security, which lowers O&M costs. Container elasticity is also greatly simplified: Pods can be created on demand without capacity planning. Finally, the Serverless container runtime can be backed by the entire elastic computing infrastructure of the cloud, securing both the cost and the scale of elasticity.
When we launched the Serverless Kubernetes project at the end of 2017, we asked ourselves: if Kubernetes had been born on the cloud, how would its architecture be designed? We extended and optimized the existing Kubernetes design and built a cloud-scale Nodeless K8s architecture, internally code-named Viking, because ancient Viking ships were known for their speed and ease of handling.
1) Scheduler
The main job of the traditional K8s scheduler is to pick an appropriate node from a batch of nodes for each Pod while satisfying constraints such as resources and affinity. In the Serverless K8s scenario there is no node concept; resources are bounded only by the underlying elastic computing inventory, so we only need to retain a few basic concepts such as AZ affinity. The scheduler's work is thus greatly simplified and its execution efficiency greatly improved. In addition, we customized and extended the scheduler so that it can perform further orchestration optimizations on Serverless workloads, reducing computing costs while guaranteeing application availability.
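The simplification can be pictured with a toy sketch (hypothetical, and not the actual Viking scheduler): with no node pool, "scheduling" reduces to choosing a zone that satisfies affinity constraints and still has elastic-compute inventory.

```go
package main

import (
	"errors"
	"fmt"
)

// podSpec keeps only the constraints that still matter without nodes:
// zone affinity and resource size (all fields hypothetical).
type podSpec struct {
	PreferredZones []string
	CPUMillis      int
}

// zoneInventory stands in for the elastic-compute inventory per AZ.
var zoneInventory = map[string]int{
	"cn-hangzhou-a": 128000,
	"cn-hangzhou-b": 0, // exhausted
}

// schedule has no node-filtering or node-scoring phases left; it only
// honors AZ affinity and checks inventory, which is why a nodeless
// scheduler is both simpler and faster than a node-centric one.
func schedule(p podSpec) (string, error) {
	for _, zone := range p.PreferredZones {
		if zoneInventory[zone] >= p.CPUMillis {
			return zone, nil
		}
	}
	return "", errors.New("no zone has inventory for this pod")
}

func main() {
	zone, err := schedule(podSpec{
		PreferredZones: []string{"cn-hangzhou-b", "cn-hangzhou-a"},
		CPUMillis:      2000,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("placed pod in", zone)
}
```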
2) Scalability
The scalability of K8s is affected by many factors, one of which is the number of nodes. To remain Kubernetes-compatible, AWS EKS on Fargate adopts a 1:1 Pod-to-Node model (one virtual node runs one Pod), which severely limits cluster scalability; currently a single cluster supports at most about 1,000 Pods. In our view, this approach is unsuitable for large-scale scenarios. In ASK, while maintaining Kubernetes compatibility, we removed the node count as a limit on cluster size, and a single cluster can easily support over 10,000 Pods. Other factors limit the scalability of traditional K8s clusters as well; for example, with kube-proxy deployed on every node to support ClusterIP Services, a change to a single endpoint triggers a cluster-wide change storm. Serverless K8s uses some innovative methods to limit the propagation scope of such changes, and we will continue to optimize in this area.
3) Cloud-based controller implementation
We implement the behavior of kube-proxy, CoreDNS, and the Ingress controller on top of Alibaba Cloud services to reduce system complexity. For example:
- Use Alibaba Cloud's DNS service PrivateZone to dynamically configure DNS address resolution for ECI instances, supporting headless Services.
- Provide load balancing through SLB.
- Implement Ingress routing rules through the layer-7 routing provided by SLB/ALB.
4) Deep optimization for workloads
To take full advantage of Serverless containers, we also need to optimize deeply for the characteristics of specific workloads.
- Knative: Knative is a Serverless application framework in the Kubernetes ecosystem whose serving module supports traffic-based auto scaling, including scaling to zero. Built on Serverless K8s, Alibaba Cloud Knative offers some new differentiated features, such as automatically scaling down to a lowest-cost ECI instance specification, which keeps the cold-start time within the SLA while effectively reducing computing costs. In addition, the Ingress gateway is implemented with SLB/ALB, which reduces system complexity and cost.
- For large-scale computing tasks such as Spark, vertical optimizations improve the creation efficiency of large-scale tasks. These capabilities have been validated in a cross-domain user scenario in Jiangsu.
Serverless Container infrastructure
For Serverless containers, the key demands of users are as follows:
- Lower computing costs: elastic costs lower than ECS pay-as-you-go, and costs for long-running applications close to ECS monthly/yearly subscriptions
- Higher elastic efficiency: ECI scale-out must be much faster than ECS
- Larger elastic scale: unlike traditional ECS node scaling, a single large container application often requires tens of thousands of cores of elastic computing power
- Consistent computing performance: ECI must deliver performance consistent with ECS instances of the same specification
- Lower migration costs: seamless integration with the existing container application ecosystem
- Lower usage costs: fully automated security and O&M capabilities
Key technology choices for ECI are as follows:
- A secure container runtime based on lightweight micro-VMs
For cloud products, security is the first consideration. ECI therefore implements a secure, isolated container runtime based on the Kangaroo cloud native container engine and lightweight micro-VMs. Beyond runtime resource isolation, capabilities such as networking, storage, quotas, and elastic SLOs across different users are also built on Alibaba Cloud infrastructure to achieve strict multi-tenant isolation.
In terms of performance, in addition to the Kangaroo container engine's deep optimization of the OS and container stack, ECI integrates existing Alibaba Cloud infrastructure capabilities into container execution, such as ENI NIC pass-through and direct storage mounting. These capabilities ensure that application performance in ECI is equal to, or even slightly better than, that in the existing ECS environment.
- The Pod as the basic scheduling unit, with a standard, open API
Unlike Azure ACI and AWS Fargate on ECS, ECI settled on the Pod as the Serverless container's basic scheduling and operation unit early in its design, which makes it much easier to combine with an upper-layer Kubernetes orchestration system.
ECI provides Pod lifecycle management capabilities, including Create/Delete/Describe/Logs/Exec/Metrics. An ECI Pod has the same capabilities as a K8s Pod, except that its sandbox is based on a micro-VM rather than cgroups/namespaces. This allows ECI Pods to support a variety of K8s applications, including technologies such as Istio that dynamically inject sidecars.
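That operation set can be pictured as an interface like the following; this is a hypothetical Go rendering for illustration only, and the real ECI OpenAPI differs in naming and wire format.

```go
package eci

import "context"

// PodSpec, PodStatus, LogOptions, and Metrics are illustrative stand-ins
// for the real ECI request/response types.
type (
	PodSpec    struct{ Containers []string }
	PodStatus  struct{ Phase string }
	LogOptions struct{ TailLines int }
	Metrics    struct{ CPUMillis, MemoryBytes int64 }
)

// PodLifecycle mirrors the operation set listed in the text
// (Create/Delete/Describe/Logs/Exec/Metrics); the method shapes are assumed.
type PodLifecycle interface {
	Create(ctx context.Context, spec PodSpec) (id string, err error)
	Delete(ctx context.Context, id string) error
	Describe(ctx context.Context, id string) (PodStatus, error)
	Logs(ctx context.Context, id, container string, opts LogOptions) ([]byte, error)
	Exec(ctx context.Context, id, container string, cmd []string) ([]byte, error)
	Metrics(ctx context.Context, id string) (Metrics, error)
}
```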
In addition, the standardized API shields the concrete implementation of the underlying resource pool and can simultaneously accommodate different forms, architectures, resource pools, and production scheduling implementations. The underlying ECI architecture has been optimized and iterated many times; for example, the creation of Kangaroo secure sandboxes can be offloaded to the MOC card of the X-Dragon (Shenlong) architecture, but such changes are imperceptible to upper-layer applications and cluster orchestration.
The API also needs to be general enough to serve multiple scenarios, so that users can take full advantage of Serverless containers in ASK/ACK, self-built K8s, and hybrid cloud scenarios. This is also an important difference between Alibaba Cloud ECI and competing products.
- ECI and ECS pooled architecture
Through pooling, we fully integrate the computing power of Alibaba Cloud's elastic computing resource pool, including multiple purchasing modes (pay-as-you-go, Spot, RI, Saving Plan, etc.), multiple instance types (GPU/vGPU, new CPU architectures as they launch), and diversified storage capabilities (ESSD, local disks). This gives ECI greater advantages in functionality, cost, and scale, meeting users' strong demands on computing cost and elastic scale.
Challenges of the Serverless container
Creating a Serverless container is a process of creating and assembling basic IaaS resources, a coordinated assembly of computing, storage, and networking. Unlike ECS, however, Serverless containers face a number of unique challenges.
According to Sysdig's 2019 container survey, more than 50% of containers live for less than 5 minutes, so a Serverless container must start within seconds. The startup speed of a Serverless container is affected by the following factors:
- Creation and assembly of underlying virtualized resources
Through end-to-end optimization of the control links, ECI can reduce resource preparation time to the sub-second level.
- Micro VM OS startup time
The Kangaroo container engine is heavily tailored and optimized for container scenarios, greatly reducing OS startup time.
- Image download time
Downloading an image from a Docker registry and decompressing it locally is time-consuming; depending on image size, this typically takes from 30 seconds to several minutes. In traditional Kubernetes, the worker node caches downloaded images locally so that subsequent starts do not repeat the download and decompression. To achieve extreme elasticity at the best cost, ECI and ECS adopt a pooled architecture with compute-storage separation, which means container images cannot be cached on local disks in the traditional way.
To address this, we implemented an innovative solution: a container image is made into a data-disk snapshot. When an ECI instance starts, if a snapshot of the image exists, a read-only data disk is created from the snapshot and automatically mounted as the instance boots, and the container application uses the mounted disk directly as its rootfs. Based on the Pangu 2.0 architecture and the extreme I/O performance of Alibaba Cloud ESSD cloud disks, we can reduce image loading time to under 1 second.
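The decision logic might be sketched as follows; the function and type names are hypothetical, and the real implementation lives inside the ECI control plane rather than in user code.

```go
package imagecache

import (
	"context"
	"errors"
	"fmt"
)

// SnapshotStore abstracts "is there a disk snapshot for this image?";
// in the real system this would be backed by cloud-disk snapshots.
type SnapshotStore interface {
	Lookup(ctx context.Context, imageRef string) (snapshotID string, ok bool)
	CreateReadOnlyDisk(ctx context.Context, snapshotID string) (diskID string, err error)
}

// PrepareRootFS returns a disk to mount as the container rootfs. If a
// snapshot of the image exists, a read-only disk is created from it and
// attached at instance start, skipping download and decompression
// entirely; otherwise the caller falls back to a registry pull.
func PrepareRootFS(ctx context.Context, store SnapshotStore, imageRef string) (string, error) {
	snapID, ok := store.Lookup(ctx, imageRef)
	if !ok {
		return "", errors.New("no snapshot: fall back to pulling " + imageRef)
	}
	diskID, err := store.CreateReadOnlyDisk(ctx, snapID)
	if err != nil {
		return "", fmt.Errorf("create disk from snapshot %s: %w", snapID, err)
	}
	return diskID, nil
}
```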
There is still a lot of room for improvement in this area, and we will continue to work with multiple teams to optimize the startup efficiency of Serverless containers.
In addition, Serverless container scheduling cares more about the deterministic supply of elastic resources than ECS does: Serverless containers emphasize on-demand use, whereas ECS leans toward reserved purchases made in advance. In large-scale container creation scenarios, guaranteeing the elasticity SLO of a single AZ for a single user is a big challenge. Throughout our support for e-commerce promotions, New Year's Eve events, and the recent epidemic response, users attached great importance to whether the cloud platform could provide a deterministic SLO for elastic resource supply. Moreover, with the Serverless K8s upper-layer scheduler cooperating with ECI's elastic supply strategies, we can give users more control over elastic resource supply, balancing elasticity demands across dimensions such as cost, scale, and holding time.
The efficiency of concurrent Serverless container creation is also critical. In highly elastic scenarios, users require 500 Pod replicas to be started within 30 seconds to handle burst traffic, and computing services such as Spark and Presto demand even higher concurrent startup rates.
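The client-side shape of such a burst is easy to picture; the toy sketch below fans out simulated create calls with bounded concurrency (all names and numbers are illustrative).

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// createPod stands in for one ECI create call (latency simulated here).
func createPod(i int) {
	time.Sleep(50 * time.Millisecond)
	fmt.Printf("pod-%03d created\n", i)
}

func main() {
	const replicas = 500
	const maxInFlight = 100 // bound concurrency to respect API quotas

	sem := make(chan struct{}, maxInFlight)
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < replicas; i++ {
		wg.Add(1)
		sem <- struct{}{}
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }()
			createPod(i)
		}(i)
	}
	wg.Wait()
	fmt.Println("all replicas created in", time.Since(start))
}
```

Bounding in-flight requests is the usual compromise between burst speed and control-plane throttling; the real work, of course, happens on the server side.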
To effectively reduce computing costs, Serverless containers must also increase deployment density. Because of micro-VM technology, each ECI instance has an independent OS kernel, and for compatibility with K8s semantics some helper processes also run inside each ECI instance, so each instance currently carries roughly 100 MB of extra overhead. Similarly, each EKS on Fargate Pod carries about 256 MB of system overhead. This overhead reduces Serverless deployment density. We need to sink the common overhead down into the infrastructure, and even offload it through MOC hardware/software co-design.
Looking ahead
Foreseeably, major cloud vendors will continue to invest in Serverless containers to differentiate their container service platforms. As discussed above, cost, compatibility, creation efficiency, and deterministic elastic supply are the core capabilities of Serverless container technology.