This article is a write-up of a talk from the Cloud Native Application Best Practices meetup (Hangzhou station). The Hangzhou event invited Wen Ming, VP of the Apache APISIX project; Mo Hongbo, senior engineer of the Youpai Cloud Platform Development Department; Wang Fakang, technical expert at Ant Financial; and Zhang Chao, middleware development engineer at Youzan, to share their experience with cloud native applications. The following is the content of "Evolution of Youzan's Unified Access Layer Architecture" shared by Zhang Chao.
Zhang Chao is a development engineer on the Youzan middleware team and an expert in gateways and Service Mesh. He is passionate about technology and has studied Golang, Nginx, and Ruby in depth.
Hello everyone, I am Zhang Chao from Youzan, a development engineer on the Youzan middleware team. Today I will share the evolution of our access layer architecture.
The Youzan access layer (YZ7) is built on OpenResty and Nginx. It consists of standard Nginx C modules, self-developed Nginx C modules, and Lua-based modules. As the public network gateway for Youzan services, it provides traffic shaping, including rate limiting, security features such as WAF, and request routing; request routing covers standard blue-green release, gray (canary) release, and load balancing. Today's sharing analyzes this in depth from the following three aspects:
- Pain points of the old access layer architecture
- Design analysis of the new architecture
- Summary of the new architecture design
Old access layer architecture pain points
The design and analysis of the new architecture starts from the pain points of the old access layer architecture.
The image above shows a vertical view of the old access layer architecture, which was designed several years ago. At that time it was popular to use Redis for configuration synchronization, and Redis's native master-slave replication protocol was a good fit. The yellow arrows represent configuration synchronization: data is replicated from the Redis master to the Redis slave on each instance, and then the local YZ7 polls the local Redis and reads the data into its own memory.
Why is there a K8sSync Controller at the bottom right? As K8s became popular over the past few years, many applications started to be containerized.
YZ7 is based on OpenResty and its whole stack is written in Lua, but Lua is not part of the K8s ecosystem. To watch services in K8s, you need to know their endpoints in real time. That could be done in Lua, but it would mean reimplementing something like the standard client-go library from scratch, which is not worth the cost. So a K8sSync Controller written in Golang was introduced. It watches the endpoints data it is interested in from K8s and writes it back to the Redis master through YZ7's configuration API; the Redis master then distributes it to every YZ7 instance. A rough sketch of such a controller is shown below.
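The following is a minimal, hypothetical sketch of that kind of controller, assuming client-go informers and an assumed YZ7 configuration API URL (`http://yz7-config-api/endpoints`); it illustrates the idea rather than the actual Youzan implementation.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

// syncEndpoints flattens an Endpoints object into "ip:port" strings and posts
// it to the (assumed) YZ7 configuration API, which in the old architecture
// persisted the data into the Redis master.
func syncEndpoints(ep *corev1.Endpoints) {
	var addrs []string
	for _, subset := range ep.Subsets {
		for _, addr := range subset.Addresses {
			for _, port := range subset.Ports {
				addrs = append(addrs, fmt.Sprintf("%s:%d", addr.IP, port.Port))
			}
		}
	}
	body, _ := json.Marshal(map[string]interface{}{
		"service":   ep.Namespace + "/" + ep.Name,
		"endpoints": addrs,
	})
	resp, err := http.Post("http://yz7-config-api/endpoints", "application/json", bytes.NewReader(body))
	if err != nil {
		log.Printf("push endpoints failed: %v", err)
		return
	}
	resp.Body.Close()
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Watch Endpoints cluster-wide, resyncing every 30s.
	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	informer := factory.Core().V1().Endpoints().Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { syncEndpoints(obj.(*corev1.Endpoints)) },
		UpdateFunc: func(_, obj interface{}) { syncEndpoints(obj.(*corev1.Endpoints)) },
	})

	stop := make(chan struct{})
	factory.Start(stop)
	<-stop
}
```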
Disadvantages of the old access layer architecture
- Redis master single point: neither Redis Cluster nor Sentinel is used, just a simple master-slave setup, so when the master fails, configuration can no longer be delivered.
- When the access layer is deployed across multiple data centers, the Redis master, being a single point, must live in one of them. Synchronizing data from that data center to the Redis slaves in the others depends on the stability of the dedicated lines between data centers; poor stability leads to high configuration synchronization latency.
- When the Redis master fails, the endpoints data of in-cluster K8s services synchronized by the K8sSync Controller can no longer reach the YZ7 instances in real time. If some service Pods are removed, the access layer does not notice immediately, so incoming requests are still routed to Pod IPs that have already gone offline, producing 502 and 504 errors and making services unavailable. Another drawback: for historical reasons the K8sSync Controller is itself a single point; if it fails, K8s service endpoints can no longer be synchronized, which also leads to unavailability and may even cause large-scale failures.
- Configurations carry no attributes, so nothing can be differentiated at the configuration level, including gray release of configuration. "Gray release of configuration" is a term I coined myself; let's park this question for now, it will be explained in detail later.
Design of the new architecture and its three components
Given all the flaws of the old access layer, the next step is to design a new architecture that addresses them. Of course, there are some design principles to follow when designing the new architecture:
- First, solve the basic single point problems to safeguard service availability.
- Components need to be stateless, support gray release and rollback, and be observable.
  - Stateless: the service can scale out and in flexibly, which helps when handling elastic traffic.
  - Gray release: when a component is updated, the update must not affect the whole cluster or all traffic; it must be possible to roll it out gradually, affecting only part of the traffic and part of the instances.
  - Rollback: after an update is released, it can be rolled back independently if it triggers chain reactions.
  - Observability: improve component observability from multiple angles, including logging, metrics, and even OpenTracing, so we can fully understand how components behave online.
- Reduce coupling between components. Each component should have independent functions and be testable and deployable on its own. Even a well-designed architecture raises costs if deployment is complex and testing is troublesome.
Following the above points, the new architecture looks a bit like the control plane/data plane separation in Service Mesh and in APISIX. Above the dotted line in the middle is the control plane, and below it is the data plane. The core component of the control plane is called YZ7-Manager; it connects to K8s on the left and to etcd on the right, with etcd serving as its configuration storage center.
The data plane below the dotted line consists of the YZ7 instances, and each instance has a companion process called YZ7-Agent that handles auxiliary work. YZ7 is the gateway, keeping only its core functionality, and the red arrow from bottom to top is the direction of requests.
Control plane core component: Manager
- Manager is a configuration provider, similar to Istio Pilot (before Istio 1.5, Istio was composed of several components, the most important of which was Pilot). Configuration is stored in etcd, which was chosen for its stability and reliability.
- Manager is stateless and can be scaled horizontally.
- Manager takes over the job of the original K8sSync Controller and watches K8s itself. Because Manager is stateless and horizontally scalable, this solves the K8sSync Controller's single point problem. In the old architecture, YZ7's configuration admin server was much like today's APISIX, where the admin server lives together with the gateway; in the new architecture the admin server is removed from the gateway and exists only in YZ7-Manager on the control plane.
- The last core function is configuration delivery, which pushes data from YZ7-Manager on the control plane down to each data plane instance (a rough sketch of this skeleton follows below).
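The following is a minimal sketch of that skeleton, with illustrative type and field names rather than the actual Youzan implementation: the Manager watches etcd for configuration changes and broadcasts each change to every connected agent; watching K8s endpoints would feed the same broadcast path.

```go
package manager

import (
	"context"
	"sync"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// update is one configuration change to be pushed to the data plane.
type update struct {
	Key   string
	Value []byte
	Del   bool
}

type Manager struct {
	mu     sync.Mutex
	agents map[string]chan update // one push channel per connected agent stream
}

// watchEtcd turns every change under the (assumed) /yz7/config/ prefix into
// an incremental push to all connected agents.
func (m *Manager) watchEtcd(ctx context.Context, cli *clientv3.Client) {
	for resp := range cli.Watch(ctx, "/yz7/config/", clientv3.WithPrefix()) {
		for _, ev := range resp.Events {
			m.broadcast(update{
				Key:   string(ev.Kv.Key),
				Value: ev.Kv.Value,
				Del:   ev.Type == clientv3.EventTypeDelete,
			})
		}
	}
}

func (m *Manager) broadcast(u update) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for _, ch := range m.agents {
		ch <- u // each agent's stream goroutine turns this into a config push
	}
}
```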
Data plane core component: Agent
The core component of the data plane is the Agent, a companion service bound to each access layer instance. Its core function is configuration synchronization, including handling configuration annotations, which is what enables gray release of configuration. It also manages dependencies between configurations: given configurations A and B, A may depend on B, similar to route and upstream in APISIX. The Agent is responsible for managing these dependencies.
Access layer YZ7
We removed the admin server from the original gateway and removed the configuration-related code responsible for retrieving data from Redis, leaving only an HTTP interface through which configuration can be pushed into a YZ7 instance from the outside and kept in shared memory. All the original gateway functions are retained without much modification; only the core functions remain, simplifying the component.
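As a rough illustration of what pushing a configuration into that HTTP interface could look like from the agent side, here is a minimal sketch; the endpoint path, port, and payload shape are assumptions for illustration, not the real YZ7 API.

```go
package agent

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// pushConfig sends one configuration item to the local YZ7 instance, which
// stores it in Nginx shared memory.
func pushConfig(name, kind string, spec interface{}) error {
	body, err := json.Marshal(map[string]interface{}{
		"name": name,
		"kind": kind,
		"spec": spec,
	})
	if err != nil {
		return err
	}
	// "http://127.0.0.1:9091/configs" is an assumed local endpoint.
	resp, err := http.Post("http://127.0.0.1:9091/configs", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("push %s/%s failed: %s", kind, name, resp.Status)
	}
	return nil
}
```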
Design details of the new architecture
After covering the three core components, let’s talk about some of the more important details of the new architecture.
First: from YZ7-Manager on the control plane to YZ7-Agent on the data plane, how should the configuration delivery protocol be designed so that it is efficient and reliable?
Second: between YZ7-Agent and YZ7, should data flow in push mode or pull mode?
Third: how are configuration annotations implemented?
Fourth: how is configuration dependency guaranteed?
With these four questions in mind, we’ll break them down one by one:
Control plane YZ7-Manager to data plane YZ7-Agent
First of all, the protocol must be simple and reliable; otherwise the cost of understanding and development will be high.
Second, the protocol must support active push from the server side. APISIX's configuration takes effect very quickly because etcd supports watch. Kong's configuration takes longer to take effect because Kong sits on PostgreSQL or Cassandra, databases that do not support watch: when data changes on the server, clients can only fetch it by polling. If the polling interval is too long, configuration takes too long to take effect; if it is too short, changes are picked up promptly but resource consumption is higher.
Based on these two points, we designed a protocol based on gRPC with reference to xDS. After the initial connection, the client obtains a full copy of the control plane's data; as the long-lived connection is maintained, subsequent configuration changes on the server are obtained incrementally.
The figure above shows a snippet of the gRPC and xDS-style definitions. gRPC is the core of configuration synchronization, and the two core messages are ConfigRequest and ConfigResponse.
In ConfigRequest, node carries data about the data plane instance, such as its cluster, hostname, IP, and so on. The resource conditions list declares the configuration types the data plane cares about, such as routes, upstreams, or cross-origin configuration. Only by declaring every configuration type of interest in this list and telling the server can the control plane accurately push just those configurations down to the data plane.
ConfigResponse carries the response code and error detail, and puts all of the configuration data into a resources list that is pushed to the client. The transmission model is also quite simple: after the connection is established, the client sends a ConfigRequest, and the server then pushes the full set of configuration data to the client for the first time.
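For reference, here is a minimal Go sketch of what those two messages might look like (roughly the shape protoc-generated types would take); the field names are assumptions based on the description above, not the actual Youzan proto definitions.

```go
package config

// Node identifies the data plane instance that opens the stream.
type Node struct {
	Cluster  string
	Hostname string
	IP       string
}

// ConfigRequest is sent once by the agent right after the gRPC stream is
// established, declaring which configuration kinds it cares about.
type ConfigRequest struct {
	Node               *Node
	ResourceConditions []string // e.g. "route", "upstream", "cors"
}

// Resource is one configuration item.
type Resource struct {
	Kind string
	Name string
	Body []byte // serialized configuration payload
}

// ConfigResponse is pushed by YZ7-Manager: the first push on a stream carries
// the full set of matching resources, later pushes carry only the changes.
type ConfigResponse struct {
	Code        int32
	ErrorDetail string
	Resources   []*Resource
}
```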
An access layer only carries a limited set of configurations, so the configuration volume is not very large (even a few hundred megabytes would already be a lot). A full push therefore does not bring much bandwidth or memory overhead, and full pushes are low-frequency events, so there is no need to worry much about their performance.
As time goes on, new configuration changes appear on the server side: for example, operations adds a new configuration, or a business application is released and its Pods migrate, causing the Pods' endpoints to change. The control plane senses these changes and pushes the data to the client in real time, completing the configuration push from the control plane to the data plane.
This is very similar to the xDS protocol. In xDS, after a DiscoveryRequest is sent to the server, the server pushes data back (if there is any) in a DiscoveryResponse, which carries a nonce; the client then sends a new DiscoveryRequest carrying that nonce to tell the server it is ready for the next sync, which is equivalent to an ACK. We designed a simplified version of xDS without this ACK mechanism.
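To make the transmission model concrete, here is a minimal, self-contained sketch of the agent side of the stream. The `configStream` interface stands in for the long-lived bidirectional stream a generated gRPC client would return, and the request/response shapes mirror the message sketch above; all names are illustrative.

```go
package agent

import "log"

type configRequest struct {
	Hostname           string
	ResourceConditions []string
}

type configResponse struct {
	Code      int32
	Resources [][]byte // serialized configuration items
}

// configStream abstracts the long-lived bidirectional gRPC stream that the
// generated client would return from its stream method.
type configStream interface {
	Send(*configRequest) error
	Recv() (*configResponse, error)
}

// runConfigStream declares what this agent cares about, then applies every
// push it receives: the first response is a full snapshot, later ones are
// incremental changes. On error the caller reconnects with backoff.
func runConfigStream(stream configStream, hostname string, apply func(*configResponse)) error {
	req := &configRequest{
		Hostname:           hostname,
		ResourceConditions: []string{"route", "upstream", "cors"},
	}
	if err := stream.Send(req); err != nil {
		return err
	}
	for {
		resp, err := stream.Recv()
		if err != nil {
			return err
		}
		log.Printf("received %d resources", len(resp.Resources))
		apply(resp)
	}
}
```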
Data plane YZ7-Agent connects to access layer YZ7
From YZ7-Agent to YZ7, i.e. from the data plane agent to the data plane instance, should configuration synchronization use pull or push?
Consider pull first. Its advantage is on-demand loading: configuration is loaded only when it is needed. The disadvantage is that if the configuration provider lacks watch capability like etcd has, data held in memory needs an expiration mechanism, otherwise there is no way to pick up new changes to the same configuration item. But with an expiration policy, configuration takes longer to take effect. For relatively static configurations such as routes and host-level settings this hardly matters, but endpoints changes of containerized services must reach the data plane as soon as possible, otherwise 5xx errors such as 502 and 504 may occur. Therefore, pull mode does not suit the new architecture.
The second option is push mode: YZ7-Agent actively pushes data to YZ7. The advantage is that YZ7 only needs a simple save action, does not need to worry about data expiration, and the two components are less coupled. YZ7 can then be handed over for testing on its own: the needed test data can be pushed in through a few interfaces without additionally deploying YZ7-Agent, which benefits test delivery. The disadvantage of relying on someone else to push is that if the service has just started, or Nginx has just finished a hot update, there is no data in shared memory yet; this must be solved for push mode to work. The agent therefore periodically dumps its data cache to disk, and when a YZ7 instance is hot updated or has just started, the old data is loaded from disk so it can keep working. On top of that, YZ7-Agent is required to do one full push at that moment so the instance immediately catches up to the latest configuration.
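A minimal sketch of those agent-side behaviors might look like the following (illustrative names, not the actual implementation): periodically dump the in-memory cache to disk, reload it after a restart, and replay everything into the local YZ7 when the gateway starts or finishes a hot update.

```go
package agent

import (
	"encoding/json"
	"os"
	"sync"
	"time"
)

type ConfigCache struct {
	mu    sync.RWMutex
	items map[string]json.RawMessage // key: config kind/name, value: payload
}

// dumpLoop writes the cache to disk at a fixed interval, using an atomic
// rename so a half-written file is never picked up after a crash.
func (c *ConfigCache) dumpLoop(path string, interval time.Duration) {
	for range time.Tick(interval) {
		c.mu.RLock()
		data, _ := json.Marshal(c.items)
		c.mu.RUnlock()
		tmp := path + ".tmp"
		if err := os.WriteFile(tmp, data, 0o644); err == nil {
			os.Rename(tmp, path)
		}
	}
}

// load restores the last dumped cache, e.g. right after the agent restarts.
func (c *ConfigCache) load(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	return json.Unmarshal(data, &c.items)
}

// fullPush replays every cached configuration into the local YZ7 instance,
// invoked when YZ7 has just started or completed an Nginx hot update.
func (c *ConfigCache) fullPush(push func(key string, payload json.RawMessage) error) error {
	c.mu.RLock()
	defer c.mu.RUnlock()
	for k, v := range c.items {
		if err := push(k, v); err != nil {
			return err
		}
	}
	return nil
}
```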
Implementation of configuration annotations
Configuration annotations exist to support gray release of configuration. A new configuration can take effect on just one or two instances in the cluster instead of all of them; if the configuration is wrong, applying it everywhere could cause a large-scale failure, so gray release of configuration effectively reduces the blast radius of faults.
In the image above, reading the payload from top to bottom, there is only one server in the configuration data, and annotations holds the annotations. The canary field inside the annotations can be designed as the field that drives gray configuration; in this example it is based on hostname, so the configuration takes effect only on host2 or host3. id, name, and kind identify the configuration, i.e. its name, type, UUID, and so on. K8s declarative configuration works the same way: the concrete configuration goes inside the spec, while metadata such as labels sits outside. The annotations in the figure imitate the annotations of K8s declarative configuration.
Youzan is a SaaS provider with a huge number of domain names, so the configuration is complex and relies on manual operation. To reduce the failure surface caused by human error, gray release of configuration is necessary. The operational flow is simple: first create a configuration on the O&M platform and mark it as gray, which generates the corresponding configuration annotations underneath. After observing that the configuration behaves correctly on the chosen instances, remove the gray annotation so the configuration takes effect on all machines, and every access layer instance will pick it up. If something goes wrong, delete the gray configuration immediately, which avoids any wider chain reaction.
A gray configuration carrying gray annotations is created and distributed to each agent through YZ7-Manager. The agent decides whether the configuration hits the machine it runs on: on a miss the configuration is ignored and not pushed; on a hit it is pushed to the YZ7 instance on that machine, as the sketch below illustrates.
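Here is a minimal sketch of that agent-side hit/miss check; field names such as `canary.hosts` are assumptions based on the description above, not the actual Youzan schema.

```go
package agent

import "encoding/json"

type Annotations struct {
	Canary struct {
		Hosts []string `json:"hosts"` // hostnames the gray config applies to
	} `json:"canary"`
}

type Config struct {
	ID          string          `json:"id"`
	Name        string          `json:"name"`
	Kind        string          `json:"kind"`
	Annotations Annotations     `json:"annotations"`
	Spec        json.RawMessage `json:"spec"`
}

// hits reports whether this configuration should be pushed to the local YZ7.
// A config with no canary annotation applies everywhere (stable config).
func hits(c *Config, localHostname string) bool {
	if len(c.Annotations.Canary.Hosts) == 0 {
		return true
	}
	for _, h := range c.Annotations.Canary.Hosts {
		if h == localHostname {
			return true
		}
	}
	return false
}
```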
When the gray configuration has run normally for a while and should take effect everywhere, the configuration is modified: remove the gray annotation and push it to YZ7-Manager, which then pushes it unchanged to every YZ7 instance. The instance in the lower left corner had already applied the gray version; since the name is the same, the stable version replaces the earlier gray version, and the configuration on all access layer instances becomes identical.
If a problem is found with the configuration, deleting it is just as easy. After the deletion, the agent of the instance that had matched the gray configuration pushes a delete event to its YZ7, which removes the copy from memory; the instances that never matched the gray configuration simply ignore it. The configuration of these YZ7 instances is thus restored to the state before the gray configuration was applied.
Configuration dependency management
Some configurations reference each other. For example, each host can be configured with its own error page, and the error page itself is a separate configuration. The data plane agent therefore has to guarantee the push order of configurations: when configuration A depends on configuration B, A must not be pushed to the access layer instance before B, otherwise there is a time window between A arriving and B arriving, and requests that come in during that window cannot be handled properly. A minimal sketch of dependency-ordered pushing follows.
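The sketch below is illustrative: it assumes each configuration lists the names of the configurations it depends on, and always pushes dependencies before the configurations that reference them (cycle detection is omitted for brevity).

```go
package agent

import "fmt"

type cfg struct {
	Name      string
	DependsOn []string // e.g. a host config depending on an error-page config
	Payload   []byte
}

// pushInOrder pushes every configuration, visiting dependencies first so that
// B is always in place before any A that references it.
func pushInOrder(all map[string]*cfg, push func(*cfg) error) error {
	pushed := make(map[string]bool)
	var visit func(name string) error
	visit = func(name string) error {
		if pushed[name] {
			return nil
		}
		c, ok := all[name]
		if !ok {
			return fmt.Errorf("missing dependency %q", name)
		}
		for _, dep := range c.DependsOn {
			if err := visit(dep); err != nil {
				return err
			}
		}
		if err := push(c); err != nil {
			return err
		}
		pushed[name] = true
		return nil
	}
	for name := range all {
		if err := visit(name); err != nil {
			return err
		}
	}
	return nil
}
```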
Summary of Architecture Design
Going cloud native requires us to learn more from the good components in the cloud native ecosystem in our daily work; K8s, Envoy, and others are excellent examples to study. The new architecture of the Youzan access layer follows the principle of separating the control plane from the data plane, which references the design of Service Mesh; the configuration delivery protocol references Envoy's xDS; and the annotation feature references the declarative definitions of K8s configuration.
On the way to cloud native, we should keep absorbing the functions and new ideas we need to learn into our work, and make the components we build fit better into the cloud native ecosystem; only then is moving to cloud native truly meaningful.