Introduction: A cloud native technology stack with Kubernetes at its core is currently the best choice and construction path for building an edge computing platform. However, the cloud native stack is large and its components are complex, so sinking it to the edge brings major challenges as well as major opportunities. For business applications to truly adopt an edge cloud native stack, the concepts, system design and architecture design all have to be worked out together so that the advantages and value of the edge can be fully realized.

Author: Duan Jia, Xinsheng

The history of cloud computing is the history of virtualization. Over the past 20 years, cloud computing and the Internet have driven each other’s rapid development, and central cloud technology has become common infrastructure for society as a whole. With the continued development of the Internet of Things, artificial intelligence and other technologies, and especially of the industrial Internet, central cloud computing has begun to lose some of its shine, and decentralized edge computing is once again attracting high expectations. If central cloud computing is driven by technological innovation, then edge computing must be driven by business value.

So what exactly is edge computing? How is it classified? What is the relationship between edge computing and the central cloud? This article sets out our understanding of, and thinking about, edge computing and cloud native.

Understanding and thinking about edge computing

Definition of edge computing

Currently, edge computing has no precise definition. From the IT (cloud computing) perspective, edge computing is regarded as an extension of central cloud computing. The Edge Computing Consortium defines it as “an open platform at the edge of the network, close to the things or data sources, that integrates core network, computing, storage and application capabilities and provides edge intelligence services nearby, meeting the key requirements of industry digitalization for agile connectivity, real-time services, data optimization, application intelligence, and security and privacy protection”. From the CT (telecom) perspective, edge computing was originally called mobile edge computing (MEC). The European Telecommunications Standards Institute (ETSI) defines MEC as follows: “Mobile edge computing provides an IT service environment and cloud computing capability at the edge of the mobile network, within the radio access network (RAN) and in close proximity to mobile users”.

Edge computing has different definitions, but the core idea is basically the same: edge computing is a new form of distributed computing that builds on the core technologies of cloud computing and runs on edge infrastructure. It provides computing power at the edge, close to end users, and is a kind of on-site cloud computing close to the data source.

With its powerful data centers, central cloud computing provides large-scale pooled and elastically scalable computing, storage, network and other infrastructure services for business applications. It is better suited to non-real-time, long-cycle data and business decision scenarios. Edge computing focuses on business scenarios with real-time data, short cycles and local decisions, such as the currently popular Internet audio and video, virtual reality, IoT, the industrial Internet and even the metaverse. Workloads sink to terminal devices or to locations close to end users to achieve lower network latency and a better user experience.

“Octopus” edge computing

The “edge” is defined relative to centralized computing: it is distributed computing at the periphery. The core goal of edge computing is fast decision making, extending the computing power of the central cloud to the “last mile”. It therefore cannot exist independently of the central cloud. Instead, within an overall cloud-edge-end architecture, centralized management and control decisions coexist with decentralized, autonomous decisions at the edge. This is what we call octopus edge computing.

As shown in the cartoon above, the octopus has a centralized brain holding 40% of its neurons, and the remaining 60% are distributed in its legs, forming a structure with one brain for overall control and coordination and N “cerebellums” for decentralized execution. The one brain is good at global scheduling and at processing and analyzing non-real-time, long-cycle big data. The N cerebellums focus on local, small-scale data processing and suit on-site, real-time, short-cycle intelligent analysis and fast decision making.

Octopus edge computing adopts a distributed, cloud-edge integrated architecture that combines the central cloud with edge computing. After data is collected from massive numbers of terminals, real-time decisions on small-scale local data are made at the edge, while complex, large-scale global decisions are aggregated to the central cloud for in-depth analysis and processing.
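The split between edge and cloud described above can be illustrated with a small sketch. It is not taken from any real platform: the class names, the threshold rule and the summary fields are all hypothetical, chosen only to show local real-time decisions at the edge and aggregated analysis in the cloud.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class EdgeNode:
    """One 'cerebellum': decides locally and buffers readings for the cloud."""
    name: str
    alarm_threshold: float          # hypothetical local decision rule
    buffer: list = field(default_factory=list)

    def handle(self, reading: float) -> str:
        # Real-time, short-cycle decision made on site, with no cloud round trip.
        self.buffer.append(reading)
        return "alarm" if reading > self.alarm_threshold else "ok"

    def flush_summary(self) -> dict:
        # Only a compact summary travels upstream, not every raw reading.
        summary = {
            "node": self.name,
            "count": len(self.buffer),
            "mean": mean(self.buffer) if self.buffer else None,
            "max": max(self.buffer, default=None),
        }
        self.buffer.clear()
        return summary


class CentralCloud:
    """The 'brain': collects summaries and performs global analysis."""

    def __init__(self):
        self.summaries = []

    def ingest(self, summary: dict) -> None:
        self.summaries.append(summary)

    def hottest_node(self):
        # A stand-in for long-cycle, global analysis across all sites.
        candidates = [s for s in self.summaries if s["max"] is not None]
        if not candidates:
            return None
        return max(candidates, key=lambda s: s["max"])["node"]
```

In this sketch each site keeps answering `handle()` calls even if the cloud is slow or unreachable; only `flush_summary()` depends on the uplink.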

The position of edge computing

Edge computing sits between the central cloud and the terminal. It sinks cloud computing capability from the center to the edge, addresses specific business requirements through a cloud-edge collaborative architecture, and minimizes transmission delay, which is its core value. The network path from a terminal to the central cloud runs through the access network (distance 30 km, delay 5 to 10 ms), the convergence network and intercity network (distance 50 to 100 km, delay 15 to 30 ms), and the backbone network (distance 200 km, delay 50 ms) before finally reaching the data center (assuming that all IDCs sit on the backbone network). These delay figures are dial-test statistics taken under normal network congestion, that is, the actual delay perceived by the service side. They are not very precise, but they are sufficient to support architecture decisions.

As cloud computing capability sinks from the center to the edge, the number of nodes increases, the coverage of each node shrinks, and operation and maintenance costs rise rapidly. Given the current state of domestic networks (China has several backbone networks: China Telecom’s CHINANET and CN2, China Unicom’s CNCNET and China Mobile’s CMNET), edge computing can be placed at backbone network nodes, intercity network nodes, convergence network nodes, access network nodes, and tens of thousands of on-site computing nodes. The range is too broad to form a unified standard. That is why we say central cloud computing is defined by technology, while edge computing is defined by network and business requirements.

There are many participants in the edge computing ecosystem. The three key types of service provider (cloud vendors, device manufacturers and operators), along with some new AI service providers, are all extending their existing strengths to win more customers and market space. Device vendors are using the Internet of Things to gradually build specialized, single-purpose clouds. Cloud vendors are moving from centralized public clouds to distributed regional clouds, which are interconnected through cloud networking to form a cloud with wider coverage. In the Internet era, operators were completely shielded by public clouds and thriving mobile applications and could only serve as pipes. In the era of edge computing, however, business and network define the edge, so operators return to center stage and cannot be replaced.

Types of edge computing

(1) Edge computing defined by the network:

By optimizing the network path between terminals and the cloud center, central cloud capability is gradually sunk closer to the terminal so that services can be accessed nearby. From center to edge, there are three types: regional cloud/central cloud, edge cloud/edge computing, and edge computing/local computing.

  • Regional cloud/central cloud: extends central cloud computing services along the backbone network, expanding central cloud capability into regions to achieve full regional coverage. It removes the time spent on the backbone network and optimizes network delay to about 30 ms, but logically it is still a central cloud service.
  • Edge cloud/edge computing: extends central cloud computing services along operators’ network nodes to build small and medium scale cloud or cloud-like service capability, optimizing network delay to about 15 ms. Examples include multi-access edge computing (MEC) and CDN.
  • Edge computing/local computing: mainly on-site devices and service capability close to the terminal. Part of the logic is moved off the terminal so that the edge provides autonomous, intelligent services, while the cloud controls edge resource scheduling, application management and business orchestration. Network delay is optimized to about 5 ms. Examples include multi-function devices and intelligent routers.
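As a rough illustration of how the three tiers above might feed an architecture decision, the sketch below picks a placement tier from a latency budget. The delay values are the approximate figures quoted in this article (about 50 ms over the backbone, 30 ms, 15 ms and 5 ms); the function name and the rule of preferring the most centralized tier that fits are assumptions made for the example, motivated by the observation that operating cost rises as capability moves toward the edge.

```python
# Approximate delays (ms) quoted in the article for each placement tier,
# ordered from most centralized to closest to the terminal.
TIER_DELAY_MS = {
    "central cloud (backbone)": 50,
    "regional cloud": 30,
    "edge cloud (MEC/CDN)": 15,
    "local/on-site edge": 5,
}


def pick_tier(latency_budget_ms: float):
    """Return the most centralized tier whose quoted delay fits the budget.

    Returns None when even the on-site tier is too slow for the budget.
    This selection rule is a hypothetical heuristic, not an industry standard.
    """
    # Dicts preserve insertion order, so tiers are checked center-first.
    for tier, delay_ms in TIER_DELAY_MS.items():
        if delay_ms <= latency_budget_ms:
            return tier
    return None
```

For example, a service that tolerates 20 ms would land on the edge cloud tier, while one that tolerates 100 ms could stay in the central cloud.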

In general, network-defined edge computing is oriented more toward consumer Internet services and new 2C services, sinking the capability and data of the cloud center to the edge in advance. Beyond classic CDN and video and voice services, there is also the metaverse, which has been a hot topic this year.

At present, most consumer-facing Internet services are supported by central cloud computing capability located on the backbone network, with a delay of 30 ms to 50 ms, far smaller than the delay of back-end business processing in the cloud itself. The original intent of sinking computing power to the edge is mainly to spread the pressure of massive requests away from the central cloud and to improve user experience. For the business, this is icing on the cake rather than timely help when it is most needed.

Now for the carrier network. Central cloud computing virtualizes the entire network inside the data center (the intra-cloud network) and has produced many products from it, such as VPC and load balancing. Outside the data center, the carrier network is almost completely shielded: only elastic public IP addresses and Internet egress bandwidth are offered. Central cloud computing and the carrier network are not integrated. As cloud computing evolves from the center to the edge, however, it depends heavily on the network to link the central cloud and the edge. If the central cloud is the brain and edge computing is the intelligent tentacles, then the network is the nerves and arteries. In reality, the network was planned and built before cloud computing developed and was not designed specifically to serve it. Central cloud computing and the carrier network therefore need to be integrated, which is known as cloud-network convergence. Its ultimate goal is network-based scheduling and orchestration of cloud capabilities, and cloud-based rapid definition of network capabilities. We hope that new business requirements and innovation in cloud technology will drive a profound transformation, upgrade and opening up of carrier network architecture.

At present, network capability greatly limits the development of cloud computing, especially in edge computing and Internet of Things projects. Cloud-network convergence and computing power networks remain the exclusive game of operators. The new generation of 5G is a disruptive technological change that has triggered disruptive change across the whole field, but it only solves massive device access and low-latency device access, while the overall back-end support and solutions clearly lag behind. For now, 5G is in the awkward position of looking for business to serve. In the future, 5G will bring greater change and value to the real economy (ports, docks, mines and so on) than to the consumer sector.

(2) Edge computing defined by business:

Beyond consumer-facing Internet edge scenarios, edge computing is oriented more toward scenarios derived from the real economy and the intelligent society.

In real-economy scenarios, for historical reasons there is a large amount of heterogeneous infrastructure at the edge and on site. Building an edge computing platform driven by business demand means not only integrating existing infrastructure resources but also sinking central cloud technology and capability to the edge and the site, so that a large amount of existing business can be operated and controlled from the cloud and massive data can flow into the data lake, supporting the digital transformation of the whole enterprise.

In scenarios derived from the intelligent society, the newer the business, the more sensitive it is to network delay and the larger its data volume. Structured data is gradually giving way to unstructured data, which requires the support of advanced intelligent technologies such as artificial intelligence and neural networks.

At present, new business scenarios that are sensitive to network delay generally adopt a distributed architecture in which the cloud handles overall control and management while devices compute in real time on site, reducing strong dependence on the network. Business-defined edge computing is divided into smart device/professional cloud and industry edge/industry cloud:

  • Smart device/professional cloud: builds on cloud computing capability to provide integrated, competitive solutions around smart devices, comprising the smart devices themselves, cloud services and end-to-end cloud-edge services. Examples include video surveillance clouds and the G7 freight Internet of Things.
  • Industry edge/industry cloud: also builds on cloud computing capability, but centers on industry applications and scenarios to provide product suites and solutions. Examples include logistics clouds and aerospace clouds.

Overall, business-defined edge computing is oriented more toward smart devices and the real economy. Smart devices range from single-function devices such as AGVs, dense storage systems and robotic arms to highly complex devices such as unmanned aerial vehicles (UAVs) and unmanned vehicles. For these, cloud computing not only supports the applications that operate, control and manage the devices, but also, by extending central cloud capability to the edge, solves the problem of centralized, standardized management of such products from the cloud. For the industry edge, cloud computing technology is combined with an abstraction of industry scenarios to build common industry products and solutions. As construction of the industrial Internet accelerates, this is a key direction for the future development of edge computing.

Summary

For large enterprises, cloud-edge scenarios are very complex. Building a central cloud platform and an edge computing platform means dealing not only with business needs but also with many infrastructure problems. The cloud-edge network link has to deal with multiple operators’ backbone networks, the operator networks used by multiple clouds, and cloud-network convergence across multiple clouds. The device-side access network has to deal with 5G network sharing among multiple operators. Many of these problems can only be managed through governance and cannot be fully solved at the technology platform level.

In general, edge computing covers a wide range of scenarios, and the industry as a whole currently lacks classic cases and standards. Edge computing should therefore be rolled out on the basis of real business scenarios and demands, with overall planning and value built up step by step.

Kubernetes moves from center to edge

Kubernetes follows an application-centric technical architecture and philosophy: one technology stack supports any workload and runs on any infrastructure. Downward, it hides infrastructure differences and provides unified scheduling and orchestration of underlying resources. Upward, it standardizes applications through container images and automates the deployment of application workloads. Outward, it breaks through the boundary of central cloud computing, extending cloud capability seamlessly to the edge and the site and making it possible to quickly build integrated cloud-edge infrastructure.

Extending cloud native technology from the center to the edge not only unifies the technical architecture of cloud-edge infrastructure but also allows services to be freely orchestrated and deployed across cloud and edge. Compared with Kubernetes’ revolutionary innovation in the central cloud, its advantages in edge scenarios are obvious, but its disadvantages can also be fatal. Because the edge has special conditions such as limited resources and unstable networks, a different Kubernetes edge solution needs to be chosen for each business scenario.

Kubernetes architecture and the challenges of moving to the edge

Kubernetes is a typical distributed architecture. The Master control node is the “brain” of the cluster, responsible for managing nodes, scheduling Pods and controlling the running state of the cluster. Worker nodes run containers and monitor and report their running status. Edge computing scenarios present the following obvious challenges:

  1. The centralized storage architecture with strongly consistent state is a crowning product of central cloud computing. It delivers business continuity through large-scale scheduling of pooled resources, a model that does not carry over well to scattered edge sites.
  2. The Master control node and Worker nodes synchronize state and tasks in real time through the list-watch mechanism. This generates heavy traffic, and Worker nodes rely entirely on the Master node for persisted data and have no autonomy.
  3. Kubelet carries too much processing logic, including compatibility with multiple container runtime implementations and Device Plugin hardware device drivers, and it occupies up to 700M of resources. This is too heavy for resource-constrained edge nodes, especially low-spec edge devices.

Edge computing has no unified standard, covers a broad scope and involves complex scenarios. The mainline Kubernetes open source community has no adaptation plan for edge scenarios.

Options for running Kubernetes at the edge

For a cloud-edge distributed architecture that spans central cloud computing and edge computing, Kubernetes has to be adapted into an architecture suited to distributed edge deployment, with unified management through multi-cluster management, so that management stays in the central cloud while workloads run at the edge. There are three options overall:

  • Cluster: sink a standard Kubernetes cluster to the edge. The advantages are that no custom Kubernetes development is needed, multiple Kubernetes versions can be supported, and the business gets a truly consistent cloud-edge architecture. The disadvantage is that the management plane consumes a lot of resources. This option suits regional cloud/central cloud, edge computing/local computing and large-scale industry edge scenarios.
  • Single Node: slim Kubernetes down and deploy it on a single-node device. The advantages are the same as for the Cluster option. The disadvantages are that Kubernetes capability is incomplete, the resources it occupies raise device cost, and business applications cannot be guaranteed a consistent cloud-edge architecture for deployment and operation, so the actual problem is not solved.
  • Remote Node (edge node): extend and enhance Kubernetes through secondary development, decoupling it to fit a cloud-edge distributed architecture, with Master management nodes deployed centrally and Worker nodes deployed in a decentralized way.

In addition, consistency is a pain point of edge computing. Adding a cache at the edge enables edge autonomy in the special case of a network disconnection while still keeping data consistent in the normal case. Kubelet is also relatively heavy, but now that Kubernetes has dropped built-in Docker support, it has started to slim down. Meanwhile, hardware iterates quickly, and compared with a small amount of extra hardware cost, keeping Kubernetes native and general-purpose matters more. In fact, I would prefer the Kubernetes community itself to provide an edge adaptation solution and to consider adding a caching mechanism to Kubelet.
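The caching idea above can be sketched in a few lines. This is a minimal illustration, not how Kubelet or any edge project actually implements it: `EdgeCacheProxy`, `CloudUnavailable` and the `fetch_from_cloud` callable are hypothetical names, and the cache here lives in memory only, whereas a real edge node would need to persist it to disk so that it survives a restart.

```python
class CloudUnavailable(Exception):
    """Raised by the (hypothetical) cloud client when the uplink is down."""


class EdgeCacheProxy:
    """Read-through proxy: prefer the cloud, fall back to the last good copy."""

    def __init__(self, fetch_from_cloud):
        # fetch_from_cloud: callable(key) -> value, may raise CloudUnavailable.
        self._fetch = fetch_from_cloud
        self._cache = {}

    def get(self, key):
        """Return (value, source), where source is 'cloud' or 'cache'."""
        try:
            value = self._fetch(key)
        except CloudUnavailable:
            if key in self._cache:
                # Disconnected: serve the last value seen from the cloud.
                return self._cache[key], "cache"
            raise  # nothing cached for this key, so the failure is surfaced
        # Connected: the cloud stays the source of truth and refreshes the cache.
        self._cache[key] = value
        return value, "cloud"
```

While the link is up, every read goes to the cloud, so the edge sees consistent data. The cache is only consulted during a disconnection, which is the trade-off the paragraph above describes.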

Kubernetes edge containers are growing fast

Kubernetes has become the de facto standard for container orchestration and scheduling. For edge computing scenarios, the major domestic public cloud vendors have open sourced their Kubernetes-based edge computing cloud native projects. One example is OpenYurt, contributed to the CNCF by Alibaba Cloud. It uses the Remote Node approach and is the industry’s first open source, non-intrusive edge computing cloud native platform. Following the non-intrusive design concept of “Extending your native Kubernetes to Edge”, it is capable of covering the full range of edge computing scenarios. Huawei, Tencent, Baidu and others have also open sourced their own edge container platforms.

The rapid development of edge containers drives innovation in the field, but to some extent it also makes the choice harder when building an edge computing platform. In terms of technical architecture, these edge container products share the same overall idea: decouple Kubernetes to suit edge scenarios with cloud-edge separation, weak networks and scarce resources. There is no essential difference between them. The same holds for product functionality, which in each case basically covers cloud-edge collaboration, edge autonomy and unitized deployment.

How to build a cloud-edge integrated cloud native platform

At this stage, the best choice for an edge computing platform is to build cloud-edge integrated cloud native infrastructure around the Kubernetes container platform: use unified multi-cluster container management in the cloud to manage scattered clusters in one place, and standardize the specification and configuration of Kubernetes clusters:

  • Standard cluster (large scale): supports large clusters of more than 400 nodes. It consists of 3 ETCD + Master nodes (8c16G each), 5 Prometheus + Ingress nodes (8c16G each) and N Worker nodes. It is mainly for running cloud native applications at large business scale.
  • Standard cluster (medium scale): supports clusters of fewer than 100 nodes. It consists of 3 ETCD + Master + Prometheus nodes (8c16G each) and N Worker nodes. It is mainly for scenarios of medium business scale.
  • Edge native container cluster: cluster management nodes are deployed in the cloud, and edge nodes are deployed independently at business sites to support single-purpose business scenarios, such as protocol parsing applications for IoT physical device access and AI algorithm models for video surveillance analysis.

The optimal container cluster option is selected according to the requirements of the business scenario. The edge container cluster differs considerably from the other options. The other clusters keep the services of a centralized cluster: basic resources are centralized and pooled, and all applications share the resources of the whole cluster. In an edge container cluster, the Master management nodes are deployed centrally and shared, while Worker nodes are scattered across business sites, added on demand, operated and maintained independently, and used exclusively.

At present, it is unlikely that the edge container field will converge on a single open source product in the short term. We therefore suggest integrating edge native container clusters through the standard Kubernetes API at this stage, a neutral approach that is compatible with all edge container products. If one product has to be chosen, we suggest OpenYurt: its design is non-intrusive, and its overall technical architecture and implementation are more elegant.

OpenYurt: Open source practice for intelligent edge computing platform

OpenYurt is a distribution for edge scenarios based on the upstream open source Kubernetes project. It is the industry’s first intelligent edge computing platform built on the cloud native technology stack with “zero” intrusion. It offers comprehensive “cloud, edge and device integration” capabilities and can quickly deliver, operate and manage massive edge computing services and heterogeneous computing power efficiently.

Design principles

OpenYurt adopts the mainstream cloud-edge distributed collaboration architecture, takes “Extending your native Kubernetes to Edge” as its vision, and adheres to the following design principles:

  • “Cloud-edge integration” principle: while keeping the user experience and product capability consistent with the central cloud, sink cloud native capability to the edge through the cloud-edge control channel to support massive numbers of intelligent edge nodes and business applications, moving infrastructure toward an industry-leading cloud native architecture.
  • “Zero intrusion” principle: ensure that the user-facing API is fully consistent with native Kubernetes. By proxying node network traffic, add a new layer of encapsulation and abstraction to the application lifecycle management of Worker nodes, achieving unified management and scheduling of scattered node resources and applications. The project also follows an “Upstream First” open source policy.
  • “Low load” principle: while guaranteeing the platform’s features and reliability and preserving its generality, strictly limit the resources used by all components and follow a minimal, simplified design so as to cover as many edge devices and scenarios as possible.
  • “One stack” principle: OpenYurt not only provides enhanced functions for edge operation and management but also supplies supporting operations tools that convert efficiently, with one command and in either direction, between a native Kubernetes cluster and a Kubernetes cluster with edge computing capability.

Features

OpenYurt builds on Kubernetes’ powerful container orchestration and scheduling capabilities and enhances them for the limited resources and unstable networks of the edge. It extends the native capabilities of the central cloud to scattered edge nodes to provide low-latency services for edge-facing business. It also opens a secure reverse channel for operations and control, providing convenient, efficient and unified cloud-based operations and management of edge devices and applications. Its core functions and features are as follows:

  1. Edge node autonomy: in edge computing scenarios, the cloud-edge control network cannot be guaranteed to stay stable. OpenYurt’s enhancements address the aspects of native Kubernetes that do not suit the edge: native Worker nodes keep no state data locally, depend heavily on data from the Master control node, and rely on a strong state consistency mechanism. As a result, when the cloud-edge network is disrupted, edge workloads are not evicted and the business continues to serve normally, and services can be restored even if an edge node restarts while disconnected. This is the temporary autonomy capability of edge nodes.
  2. Collaborative operations channel: in edge computing scenarios, the cloud and the edge are not on the same network plane, and edge nodes are not exposed to the public network. The central control plane therefore cannot establish an effective network link to edge nodes, and all native Kubernetes operations APIs (logs/exec/metrics) fail. OpenYurt enhances Kubernetes so that, when an edge node is initialized, a reverse channel is established between the central control plane and the edge node. This channel carries the traffic of the native Kubernetes operations APIs (logs/exec/metrics), enabling centralized, unified operations.
  3. Edge unitized workloads: in edge computing scenarios, business generally adopts a cloud-edge collaborative distributed architecture of “centralized management, decentralized operation”. On the management side, the same service must be deployed to nodes in different regions at the same time. On the edge side, Worker nodes are generally scattered over a wide area and are strongly regional, and nodes in different regions are clearly isolated in terms of network connectivity, resource sharing and resource heterogeneity. OpenYurt enhances Kubernetes to provide unitized management and scheduling of edge workloads across three layers: resources, applications and traffic.
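To make the idea of unitized workloads more concrete, the sketch below groups nodes into regional pools and plans the same service separately inside each pool, never placing a replica across pool boundaries. It is a simplified illustration only: the function name, the input shape and the round-robin placement are assumptions for this example and do not reflect OpenYurt’s actual scheduling logic.

```python
from collections import defaultdict


def plan_unitized_deployment(nodes, replicas_per_pool):
    """Plan one copy of a service per regional pool.

    nodes: iterable of (node_name, pool_name) pairs (hypothetical input shape).
    Returns {pool_name: [node chosen for each replica]}.
    """
    pools = defaultdict(list)
    for node_name, pool_name in nodes:
        pools[pool_name].append(node_name)

    plan = {}
    for pool_name, members in pools.items():
        members = sorted(members)
        # Round-robin replicas over this pool's own nodes only, since nodes
        # in other regions are treated as isolated from this one.
        plan[pool_name] = [
            members[i % len(members)] for i in range(replicas_per_pool)
        ]
    return plan
```

With nodes `n1` and `n2` in one region and `n3` in another, asking for two replicas per pool yields two independent placements, one per region, which mirrors the “centralized management, decentralized operation” pattern described above.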

As the OpenYurt open source community brings in more participants and co-develops more optional specialized functions, OpenYurt’s features are gradually being rounded out and its coverage is expanding:

  1. Edge device management: in edge computing scenarios, end-side devices are the real objects the platform serves. Following cloud native principles, OpenYurt abstracts a non-intrusive, extensible standard model for device management that seamlessly combines the Kubernetes workload model with the IoT device management model, covering the “last mile” of the platform’s enablement of business. At present, the EdgeX Foundry open source project is integrated through this standard model, which greatly improves the efficiency of edge device management.
  2. Local resource management: in edge computing scenarios, existing block devices or persistent memory devices on edge nodes are initialized as container storage that can be used conveniently in a cloud native way. Two kinds of local storage are supported: (1) LVM created on block devices or persistent memory devices; (2) QuotaPath created on block devices or persistent memory devices.

OpenYurt design architecture and principle

(1) Design architecture

Native Kubernetes is a centralized distributed architecture. The Master control node manages scheduling and controls the running state of the cluster. Worker nodes run containers and monitor and report running status.

Building on native Kubernetes, OpenYurt decouples the centralized distributed architecture (Cloud Master, Cloud Worker) into centralized management with decentralized edge operation (Cloud Master, Edge Worker) for edge scenarios, forming an octopus-style cloud-edge collaborative distributed architecture with one central brain and multiple decentralized cerebellums. Its core points are as follows:

  1. Sink the metadata that was held in centralized, strongly consistent state storage to edge nodes, and adjust the native Kubernetes scheduling mechanism so that an abnormal status on an autonomous node does not trigger rescheduling. This achieves temporary autonomy for edge nodes.
  2. Preserve the completeness and consistency of Kubernetes capabilities and stay compatible with the existing cloud native ecosystem, sinking as much of the cloud native stack to the edge as possible.
  3. Adapt the central model of large-scale resource pooling, in which multiple applications are scheduled onto shared resources, to small regional resource pools or even single nodes, achieving finer-grained unitized workload orchestration and management in edge scenarios.
  4. To meet the needs of real edge business scenarios, integrate device management, edge AI, streaming data and more seamlessly through the open community, providing out-of-the-box general platform capabilities that enable more edge application scenarios.

(2) Implementation principle

OpenYurt follows the cloud native architecture concept and, for edge computing scenarios, delivers a cloud-edge collaborative distributed architecture with central management and edge operation:

  • For edge node autonomy, the YurtHub component is added to proxy edge-to-cloud requests and to persist the latest metadata on edge nodes through a caching mechanism. In addition, the YurtControllerManager component is added to take over the native Kubernetes scheduling behavior, so that an autonomous edge node in an abnormal state does not trigger rescheduling;
  • For the integrity of Kubernetes capabilities and ecosystem compatibility, the YurtTunnel component is added to build a cloud-to-edge reverse request channel, so that central operation and maintenance tools such as kubectl and Prometheus keep the same capabilities and user experience. Other central capabilities, including the various workloads and Ingress routing, are also sunk to the edge;
  • For edge unitized management, the YurtAppManager component is added together with the NodePool, YurtAppSet (formerly UnitedDeployment), YurtAppDaemon, and ServiceTopology resources, giving unitized management at three levels: edge resources, workloads, and traffic;
  • To provide the platform capabilities that real edge business needs, NodeResourceManager is added to make edge storage convenient to use, and YurtEdgeXManager/YurtDeviceController are introduced to manage edge devices in a cloud native way.
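
The caching idea behind edge autonomy can be illustrated with a minimal sketch. It is not YurtHub's actual code: the class `CachingProxy`, the exception `CloudUnreachable`, and the injected `fetch_from_cloud` callable are hypothetical names, and an in-memory dictionary stands in for the on-node persistent cache described above.

```python
from typing import Callable, Dict, Optional


class CloudUnreachable(Exception):
    """Raised by the (hypothetical) cloud client when the cloud-edge link is down."""


class CachingProxy:
    """Toy edge-side proxy: forwards reads to the cloud and caches the responses."""

    def __init__(self, fetch_from_cloud: Callable[[str], dict]) -> None:
        self._fetch = fetch_from_cloud
        # Assumption: a dict replaces the persistent on-disk cache for brevity.
        self._cache: Dict[str, dict] = {}

    def get(self, key: str) -> Optional[dict]:
        """Return the latest object from the cloud, or the cached copy if offline."""
        try:
            obj = self._fetch(key)
        except CloudUnreachable:
            # Cloud is unreachable: serve the last known state so that
            # node components can keep running with local metadata.
            return self._cache.get(key)
        self._cache[key] = obj
        return obj
```

While the link is up, every read refreshes the cache; when the link drops, components on the node still see the last state they were given, which is what allows workloads to survive a temporary disconnection.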

Core components

All new functions and components of OpenYurt are implemented in Addon and Controller mode. The core mandatory and optional components are as follows:

1. YurtHub (mandatory): Has two operating modes, edge and cloud. It runs as a Static Pod on every cloud and edge node and acts as a sidecar for node traffic, proxying the traffic between components on the node and kube-apiserver. In edge mode, YurtHub caches the returned data to give edge nodes temporary autonomy.

2. YurtTunnel (mandatory): Consists of a server and an agent that together build a mutually authenticated, encrypted cloud-edge reverse tunnel, which forwards requests from the cloud center for the native Kubernetes operation and maintenance APIs (logs/exec/metrics) to the edge. The server is deployed in the cloud center as a Deployment, and the agent is deployed on edge nodes as a DaemonSet.

3. YurtControllerManager (mandatory): The cloud center controller. It takes over the NodeLifeCycle Controller of native Kubernetes so that Pods on autonomous edge nodes are not evicted when the cloud-edge network is abnormal. It also includes YurtCSRController, which approves certificate requests from edge nodes.

4. YurtAppManager (mandatory): Implements unitized management and scheduling of edge workloads. It includes NodePool, which manages node pools; YurtAppSet (formerly UnitedDeployment), a business workload at node pool granularity; and YurtAppDaemon, a DaemonSet workload at node pool granularity. It is deployed in the cloud center as a Deployment.

5. NodeResourceManager (optional): Manages local storage resources on edge nodes. Local resources on the host can be configured dynamically by modifying a ConfigMap. It is deployed on edge nodes as a DaemonSet.

6. YurtEdgeXManager/YurtDeviceController (optional): Manage edge devices in a cloud native way; integration with EdgeX Foundry is currently supported. YurtEdgeXManager is deployed in the cloud center as a Deployment, and YurtDeviceController is deployed on edge nodes as a YurtAppSet, with one set of YurtDeviceController per NodePool.

7. Operation and maintenance management component (optional): To standardize cluster management, the OpenYurt community has launched the YurtCluster Operator, which provides a cloud native declarative cluster API and configuration, automatically deploys and configures the OpenYurt components on top of standard Kubernetes, and manages the full life cycle of OpenYurt clusters. The older yurtctl tool is recommended only for test environments.
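
The node pool granularity that YurtAppManager works at can be sketched as follows. This is only an illustration of the unitized idea, not OpenYurt's API: the functions `group_nodes_by_pool` and `plan_per_pool_workloads`, the `<app>-<pool>` naming scheme, and the pool names in the usage note are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def group_nodes_by_pool(node_pool_of: Dict[str, str]) -> Dict[str, List[str]]:
    """Group node names by the node pool each node belongs to."""
    pools: Dict[str, List[str]] = defaultdict(list)
    for node, pool in sorted(node_pool_of.items()):
        pools[pool].append(node)
    return dict(pools)


def plan_per_pool_workloads(
    app: str,
    replicas_per_pool: Dict[str, int],
    pools: Dict[str, List[str]],
) -> List[dict]:
    """Produce one workload plan per existing pool instead of one global plan."""
    plans = []
    for pool in sorted(replicas_per_pool):
        if pool not in pools:
            # Skip pools that have no registered nodes.
            continue
        plans.append(
            {
                "name": f"{app}-{pool}",  # hypothetical naming scheme
                "pool": pool,
                "replicas": replicas_per_pool[pool],
            }
        )
    return plans
```

For example, with nodes `n1` and `n2` in a pool called `hangzhou` and `n3` in `beijing`, a request for 2 and 1 replicas yields two independent plans, `web-beijing` and `web-hangzhou`, each confined to its own pool.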

Beyond the core functions and the optional specialized functions, OpenYurt continues to pursue cloud-edge integration and pushes the rich ecosystem capabilities of cloud native to the edge as far as possible. It has already implemented edge container storage, the edge daemon workload DaemonSet, the edge network access Ingress Controller, and more. Service Mesh, Kubeflow, Serverless, and other capabilities are planned.

Current challenges

(1) Cloud-edge network

In edge computing scenarios, the most frequently cited problem is a poor and unstable cloud-edge network. In fact, China's basic network has been comprehensively upgraded since 2015, and it improved greatly after the “Snow Bright Project” was completed. The graph above, taken from the 48th China Internet Development Report, shows that fixed Internet access accounts for 91.5% of the total, and wireless Internet access already runs on high-quality 4G and 5G networks.

The real challenge lies in cloud-edge networking. When a public cloud is used, the public cloud hides the data center network and exposes only Internet egress bandwidth. Reaching the cloud from the edge over the Internet usually only requires securing the data in transit, which is not complicated. For privately built IDCs, connecting the cloud and edge networks is not easy, mainly because carrier network offerings are not fully productized and because private IDCs involve layer upon layer of firewalls and other complex products, so professional network engineers are needed to complete the implementation.

(2) List-Watch mechanism and cloud-edge traffic

The List-Watch mechanism is the essence of Kubernetes design. Components obtain the events and data they need by actively listening, which keeps all components loosely coupled and independent of each other while still logically integrated. A List request returns the full data set, and once a Watch fails, a full re-List is required. However, Kubernetes has already optimized data synchronization: the kubelet on a node only watches data related to that node, and kube-proxy watches all Service data, so the data volume is relatively controllable. In addition, the gRPC protocol is adopted, and the text message data is very small compared with business data. The figure above is a monitoring chart from a stress test of a 1,200-node cluster.
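
The List, Watch, and re-List cycle can be made concrete with a small in-memory simulation. It is a sketch under simplifying assumptions, not the Kubernetes client: `ToyStore`, `Informer`, `WatchExpired`, and the `history_limit` parameter are hypothetical, only updates (no deletes) are modeled, and a bounded event history stands in for the server-side window of retained changes.

```python
from typing import Dict, List, Tuple

Event = Tuple[int, str, str]  # (resource_version, key, value)


class WatchExpired(Exception):
    """The requested version is older than the retained event history."""


class ToyStore:
    """Stand-in for the API server: current objects plus a short event history."""

    def __init__(self, history_limit: int = 3) -> None:
        self.objects: Dict[str, str] = {}
        self.version = 0
        self.events: List[Event] = []
        self.history_limit = history_limit

    def apply(self, key: str, value: str) -> None:
        self.version += 1
        self.objects[key] = value
        self.events.append((self.version, key, value))
        self.events = self.events[-self.history_limit:]

    def list_all(self) -> Tuple[Dict[str, str], int]:
        """List: return the full data set and the version it corresponds to."""
        return dict(self.objects), self.version

    def watch(self, since: int) -> List[Event]:
        """Watch: return only the changes after `since`, if still retained."""
        if self.events and self.events[0][0] > since + 1:
            raise WatchExpired(since)
        return [e for e in self.events if e[0] > since]


class Informer:
    """Client-side cache kept in sync by incremental Watch and fallback re-List."""

    def __init__(self, store: ToyStore) -> None:
        self.store = store
        self.cache, self.version = store.list_all()
        self.relists = 0

    def sync(self) -> None:
        try:
            events = self.store.watch(self.version)
        except WatchExpired:
            # The watch cannot resume, so fetch the full data set again.
            self.cache, self.version = self.store.list_all()
            self.relists += 1
            return
        for version, key, value in events:
            self.cache[key] = value
            self.version = version
```

In the normal case `sync` transfers only the new events, which is why steady-state traffic stays small; the expensive full transfer happens only when the client has fallen too far behind, for example after a long cloud-edge disconnection.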

The real challenge lies in delivering base images and application images. Even in the central cloud, various techniques are still being explored to remove the bottleneck of fast image distribution. Edge AI applications in particular usually consist of an inference application plus a model library: the image of the inference application is relatively small, but the model library is very large and also needs to be updated frequently through self-learning.

(3) Edge resources and computing power

Edge resources need to be considered case by case. For edge computing in carrier networks and consumer-facing edge computing, resources are relatively sufficient, and the biggest challenge is resource sharing and isolation. For the edge in physical industries, there is usually a sizeable IDC behind it, so edge resources are ample enough to sink the whole cloud native system. For the edge in intelligent equipment, resources are relatively scarce, but devices are usually connected through an intelligent edge box, which connects to the devices on one side and to the central control service on the other. Judging from the AI edge box in the figure above, overall configurations are improving quickly, and in the long run edge computing power will grow rapidly to meet the needs of more complex and more intelligent scenarios.

(4) Kubelet is heavy and occupies a lot of resources

To address the concern that the kubelet is heavy and occupies a lot of resources, it is necessary to understand in depth how node resources are allocated and used. Generally, node resources are divided into four layers from bottom to top:

1. Resources required for running the operating system and system daemon processes (such as SSH and systemd);

2. Resources needed to run the Kubernetes agents, such as the kubelet, the container runtime, and the node problem detector;

3. Resources available to Pods;

4. Resources reserved for the eviction threshold.

There is no standard for the resource allocation settings at each layer; the configuration needs to be balanced against the situation of the cluster. One formula that can be used as a reference is: Reserved memory = 255 MiB + 11 MiB * MAX_POD_PER_INSTANCE. Assuming 32 Pods per node, the reserved memory is 255 + 11 * 32 = 607 MiB, and up to 90% of the memory can be allocated for business use, which makes the kubelet resource footprint relatively small.
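
The reservation formula can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are illustrative, the 8 GiB node size in the usage note is an assumption (the article does not state one), and the calculation covers only the formula above, not the operating system reservation or the eviction threshold.

```python
def reserved_memory_mib(max_pods: int) -> int:
    """Reserved memory = 255 MiB + 11 MiB * MAX_POD_PER_INSTANCE."""
    if max_pods < 0:
        raise ValueError("max_pods must be non-negative")
    return 255 + 11 * max_pods


def allocatable_fraction(node_memory_mib: int, max_pods: int) -> float:
    """Share of node memory left after the reservation computed above."""
    reserved = reserved_memory_mib(max_pods)
    if reserved >= node_memory_mib:
        raise ValueError("node is too small for the requested Pod count")
    return (node_memory_mib - reserved) / node_memory_mib
```

For 32 Pods the reservation is 607 MiB. On an assumed node with 8 GiB (8192 MiB) of memory, `allocatable_fraction(8192, 32)` is roughly 0.926, consistent with the figure of about 90% quoted above; on smaller nodes the reserved share grows accordingly.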

At the same time, the settings should be adjusted according to the high availability requirements of the business. For edge scenarios, running a large number of Pods on a single node is generally not recommended, because stability comes first.

Cloud-edge collaborative management model for business applications

A distributed business application architecture based on the central cloud differs substantially from a cloud-edge collaborative distributed business application architecture. In the central cloud, the architecture is mostly based on domain-driven design (DDD): a complex business system is divided into relatively independent services that together form a loosely coupled distributed application. In the cloud-edge distributed scenario, more emphasis is placed on centralized management and operation with decentralized operational support: management and operation systems are concentrated in the cloud center for unified control, while applications that support real-time business operations are distributed to the edge for low latency and quick response.

From the perspective of business applications, the finance/operation and planning/management layers are management and operation applications, which need centralized, strong control through unified aggregation in the central cloud; they are not sensitive to latency but have high requirements for security and big data analysis. The three layers of control, sensing/execution, and production are operational support applications, for which the central cloud can also be considered first. If the business scenario is latency sensitive, edge computing can then be considered to achieve decentralized low-latency response.

From the perspective of request response, latency-insensitive workloads (above 50 ms) should first be deployed in the central cloud or implemented with cloud edge products (CDN). If a workload is latency sensitive (below 10 ms) and the carrier backbone network cannot support it at all, building an edge computing platform can be considered, although the business then faces considerable investment in money and personnel.
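
The latency rule of thumb can be written down as a small decision helper. This is a sketch of the reasoning only: the function `suggest_placement` and its return labels are hypothetical, and the article gives no guidance for budgets between 10 ms and 50 ms, so that range is deliberately returned as undecided.

```python
def suggest_placement(latency_budget_ms: float, backbone_meets_budget: bool) -> str:
    """Map a latency budget to a deployment suggestion, following the text above."""
    if latency_budget_ms <= 0:
        raise ValueError("latency budget must be positive")
    if latency_budget_ms > 50:
        # Latency insensitive: prefer the central cloud or cloud edge products (CDN).
        return "central-cloud-or-cdn"
    if latency_budget_ms < 10 and not backbone_meets_budget:
        # Latency sensitive and the carrier backbone cannot deliver it.
        return "edge-platform"
    # Assumption: everything else needs a case-by-case evaluation.
    return "evaluate-case-by-case"
```

A reporting dashboard with a 200 ms budget would land in the central cloud, while a 5 ms equipment control loop that the backbone cannot serve would justify the investment in an edge platform.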

Take physical logistics as an example. In the classic OTW system (OMS order management system, TMS transportation management system, and WMS warehouse management system), O and T are typical management and operation systems, so deploying them in the central cloud is recommended. Aggregating data in the central cloud enables cross-regional business such as order splitting and multimodal transportation. W, the warehouse management system, manages tasks within the four walls of a warehouse. It is an operational support application, and warehouses usually have some automation equipment, so deploying W at the edge can be considered.

Conclusion

Building an edge computing platform on a cloud native technology system with Kubernetes at its core is undoubtedly the best choice and construction path at present. However, the cloud native system is huge and its components are complex, so sinking it to the edge brings great challenges and difficulties as well as great opportunities and room for imagination. For business applications to truly put the edge cloud native system into practice, concept, system design, architecture design, and other aspects must be worked out together, so as to bring the advantages and value of the edge into full play.


This article is the original content of Aliyun and shall not be reproduced without permission.