Author | Zou Sheng, infrastructure platform technology expert at Qunar
Background
In recent years, cloud native and container technology have become very popular and increasingly mature, and many enterprises have gradually begun containerization and kept exploring and practicing in the direction of cloud native technology. Following this trend, Qunar took its first step toward cloud native, containerization, at the end of 2020.
Cloud native is a set of architectural guidelines that empower the business, giving applications scalability, portability, and resilience. It is also the foundation of the next generation technology stack and makes the business more agile. By practicing cloud native technologies such as DevOps, microservices, containerization, observability, chaos engineering, ServiceMesh, and Serverless, we can reap the benefits that cloud native brings.
Timeline of Qunar's containerization
The adoption of a new technology in an enterprise is never accomplished overnight, and Qunar's containerization is no exception. It went through four main stages:
- 2014-2015: Engineers on the business lines began trying to solve the difficult problem of setting up joint-debugging environments with Docker and Docker Compose. However, because Docker Compose's orchestration capability was limited, it could not solve real environment problems, so containerization was not adopted at that time.
- 2015-2017: The OPS team moved the ES cluster to the Mesos platform to improve the O&M efficiency of the ELK cluster. Later, as the K8s ecosystem matured, the ES cluster was migrated from Mesos to K8s, and O&M efficiency improved further.
- 2018-2019: As business demands grew, the business placed higher requirements on the delivery speed and quality of the test environment. To solve the delivery efficiency of MySQL (network IO becomes the bottleneck under high concurrency, pushing the delivery time of a single instance to the minute level), we containerized MySQL; with Docker on host, a MySQL instance can be delivered in less than 10 seconds.
- 2020-2021: Qunar decided to embrace cloud native technology to add momentum to the business. With the concerted efforts of all teams, more than 300 P1 and P2 applications have been containerized, and the plan is to containerize all business applications by the end of 2021.
Implementation process and practice
Introduction to the overall containerization scheme
In the process of Qunar's containerization, the Portal platform, middleware, OPS infrastructure, and monitoring of each system were adapted and transformed accordingly. The transformed architecture matrix is shown in the figure below.
- Portal: the entrance of Qunar's PaaS platform, providing CI/CD, resource management, self-service O&M, application portrait, application authorization (DB authorization, payment authorization, inter-application authorization), and other functions.
- O&M tools: observability tools for applications, including Watcher (monitoring and alerting), Bistoury (online debugging of Java applications), QTrace (tracing system), and Loki/ELK (real-time and offline log viewing).
- Middleware: all middleware used by applications, including MQ, the configuration center, the distributed scheduling system Qschedule, Dubbo, the MySQL SDK, etc.
- Virtualization clusters: the underlying K8s and OpenStack clusters.
- Noah: the test environment management platform, supporting mixed deployment of KVM and containerized applications.
CI/CD process transformation
Main transformation points:
- Application portrait: converges application runtime configuration, whitelist configuration, and publishing parameters into a unified declarative configuration for container publishing (see the illustrative sketch after this list).
- Authorization system: all application authorization operations go through the portal and are carried out automatically.
- K8s multi-cluster solution: after investigation and comparison, and after O&M optimization and stress-test evaluation, KubeSphere met our performance requirements, so we finally selected KubeSphere as the multi-cluster solution.
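As a purely illustrative sketch (every field name and value below is hypothetical, not Qunar's actual schema), an application-portrait style declarative publishing configuration might converge runtime, whitelist, and release parameters into one document:

```python
# Hypothetical application-portrait document; all fields here are illustrative
# only and do not reflect the platform's real schema.
app_portrait = {
    "appCode": "demo-app",                  # application identifier
    "runtime": {
        "jdkVersion": "1.8",
        "jvmOptions": "-Xms2g -Xmx2g",
        "healthCheckUrl": "/health/check",  # checkURL used before taking traffic
    },
    "whitelists": {
        "mysql": ["demo_db"],               # DB authorizations applied at startup
        "acl": [8080, 20880],               # ports that need firewall/ACL rules
    },
    "release": {
        "replicas": 4,
        "cpu": "2",
        "memory": "4Gi",
        "strategy": "RollingUpdate",
    },
}
```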
Middleware adaptation and modification
Transformation focus: since it is normal for IP addresses to change frequently after containerization, common components and middleware must adapt to and tolerate this change.
Smooth application migration scheme design
To help the business migrate to containers quickly and smoothly, we developed specifications and automated test validation to achieve this goal.
- Containerization prerequisites: the application is stateless, has no post_offline hook (a script executed after the service goes offline), and performs no warm-up operations in checkURL.
- Test environment verification: automatic SDK upgrade and automatic migration. We help the business automatically upgrade and modify the POM file in the compilation stage to complete the SDK upgrade, then deploy and verify in the test environment. If the upgrade fails, the user is notified and prompted.
- Online verification: first publish online without taking traffic, then run automated test verification; after verification passes, online traffic is switched in.
- Mixed deployment of online KVM and containers: to be safe, containers and KVMs stay online together for a period of time; after the verification period, the KVMs are taken offline gradually.
- Full online publishing: take the KVMs offline after confirming that the service is running properly.
- Observation: observe the KVMs for a period of time; if no problems occur, reclaim the KVM resources.
Problems encountered during containerization
How can the preStart and preOnline custom hook scripts that were used under KVM be supported?
The hook script usage scenarios under KVM are as follows:
preStart hook: the user customizes commands in this script, such as environment preparation.
preOnline hook: the user defines data warm-up and other operations that need to be executed after the application's checkURL passes and before traffic is switched in.
Problem:
K8s natively provides only postStart and preStop hooks, but their execution timing does not match the two KVM hook scenarios above.
Analysis and solution process:
preStart hook: the preStart stage is injected into the entrypoint. During container startup, the entrypoint looks for a custom preStart script and executes it if found; the location of this script is currently a directory agreed on in the code.
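A minimal sketch of such an entrypoint wrapper, assuming a hypothetical script path and start command (the real locations are defined by the platform's own convention):

```python
#!/usr/bin/env python3
"""Entrypoint wrapper sketch: run an optional preStart script, then start the app.
The script path and start command below are assumptions, not the actual convention."""
import os
import subprocess
import sys

PRE_START = "/opt/app/bin/pre_start.sh"          # hypothetical conventional location
APP_CMD = ["java", "-jar", "/opt/app/app.jar"]   # hypothetical start command

if os.path.isfile(PRE_START):
    # Run the user-defined preStart hook (environment preparation, etc.).
    result = subprocess.run(["sh", PRE_START])
    if result.returncode != 0:
        sys.exit(result.returncode)              # fail fast if the hook fails

# Replace the wrapper process with the application itself.
os.execvp(APP_CMD[0], APP_CMD)
```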
preOnline hook: the preOnline script must be executed after checkURL passes, and the application container is a single-process container, so the script cannot be executed inside the application container. postStart is asynchronous and decoupled from the startup of the application container, so in our preliminary solution we chose the postStart hook. The implementation is that the postStart hook keeps polling the health status of the application; once the health check (checkURL) passes, it executes the preOnline script. When the script succeeds, the instance goes online by creating a healthCheck.html file in the application directory; OpenResty and the middleware detect this file and route traffic to the instance.
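A minimal sketch of that preliminary postStart logic, with a hypothetical checkURL, script path, and flag file location:

```python
#!/usr/bin/env python3
"""postStart hook sketch: wait for checkURL to pass, run preOnline, then create
the online flag file. URLs and paths are illustrative assumptions."""
import pathlib
import subprocess
import time
import urllib.request

CHECK_URL = "http://127.0.0.1:8080/health/check"     # hypothetical checkURL
PRE_ONLINE = "/opt/app/bin/pre_online.sh"            # hypothetical hook script
ONLINE_FLAG = pathlib.Path("/opt/app/webapps/healthCheck.html")

def check_url_ok() -> bool:
    try:
        with urllib.request.urlopen(CHECK_URL, timeout=3) as resp:
            return resp.status == 200
    except OSError:
        return False

# 1. Poll until the application's own health check passes.
while not check_url_ok():
    time.sleep(2)

# 2. Run the user-defined preOnline hook (data warm-up, etc.).
if pathlib.Path(PRE_ONLINE).is_file():
    subprocess.run(["sh", PRE_ONLINE], check=True)

# 3. Create the flag file; OpenResty / middleware then route traffic here.
ONLINE_FLAG.write_text("ok")
```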
According to the above scheme, Pod composition design is as follows:
The publishing process cannot read standard input and output
Scenario Introduction:
If the application fails to start during container publishing, we cannot get the real-time standard output stream through the K8s API and have to wait for the timeout threshold set for the release. During this process, publishers are very anxious because they cannot tell what is happening. As shown in the figure below, nothing can be seen in the application's update workflow during deployment.
Problem:
Why can't the standard output be obtained through the K8s API?
Analysis and solution process:
Checking with kubectl logs showed the same thing: the problem lies not in the program itself, but in the mechanics of K8s.
If postStart executes for a long time or hangs, the container also hangs and does not enter the running state. From this, we suspected the culprit was the postStart hook.
Based on this assumption, we removed the postStart hook, and the standard output of the application container could then be retrieved in real time.
Once the problem was identified, the solution was simple: move the functionality implemented in the postStart hook into a Sidecar container. For the Sidecar to create the healthCheck.html file in the application container's directory, a shared volume is used. The new design is as follows:
With the above solution, the standard output of the publishing process, the output of custom hook scripts, Pod events, and so on are all visible in real time, making the publishing process more transparent.
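As a rough illustration of that design (not the actual manifest; all names, images, and paths are assumptions), the application container and the Sidecar can share an emptyDir volume so that the flag file written by the Sidecar appears in the application's directory:

```python
# Illustrative Pod spec in dict form; names, images, and paths are hypothetical.
pod_spec = {
    "volumes": [{"name": "app-share", "emptyDir": {}}],
    "containers": [
        {
            "name": "app",                        # the Java application itself
            "image": "registry.example.com/demo-app:1.0",
            "volumeMounts": [{"name": "app-share", "mountPath": "/opt/app/webapps"}],
        },
        {
            "name": "sidecar",                    # polls checkURL, runs preOnline,
            "image": "registry.example.com/app-sidecar:1.0",  # then writes healthCheck.html
            "volumeMounts": [{"name": "app-share", "mountPath": "/opt/app/webapps"}],
        },
    ],
}
```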
Concurrent image pulls time out
Scenario Introduction:
Our applications are deployed across multiple data centers and multiple clusters. When a new version of an application is released, because there are many application instances, more than 50 images are pulled from Harbor concurrently; some tasks receive an image pull timeout error, which causes the whole release task to fail. The timeout is the kubelet default of 1 minute.
Analysis and solution:
Investigation finally confirmed that Harbor had performance problems with concurrent image pulls. The optimization we adopted is a common P2P scheme: Dragonfly + Harbor.
The authorization interface cannot withstand high concurrency
Scenario Introduction:
If the authorization interface call fails during application publishing, the self-healing mechanism of K8s keeps rebuilding the container and re-authorizing, which generates a large amount of concurrency and finally brings down the authorization service.
Our container authorization scheme is as follows:
- The Pod's init container calls the authorization interfaces at startup to perform the authorization operations, including ACLs and MySQL whitelists.
- When the container is destroyed, the Sidecar container's preStop hook is used to reclaim the permissions.
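A rough sketch of this wiring in dict form (the images, commands, and the auth-client tool named here are purely illustrative assumptions, not the real authorization interface):

```python
# Illustrative Pod spec for the authorization flow; everything named here is hypothetical.
pod_spec = {
    "initContainers": [{
        "name": "authorize",
        "image": "registry.example.com/auth-init:1.0",
        # Calls the authorization service to apply ACL / MySQL whitelists
        # before the application container starts.
        "command": ["/bin/auth-client", "grant", "--app", "demo-app"],
    }],
    "containers": [
        {"name": "app", "image": "registry.example.com/demo-app:1.0"},
        {
            "name": "sidecar",
            "image": "registry.example.com/app-sidecar:1.0",
            "lifecycle": {
                # Reclaims the permissions when the Pod is destroyed.
                "preStop": {
                    "exec": {"command": ["/bin/auth-client", "revoke", "--app", "demo-app"]}
                }
            },
        },
    ],
}
```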
Problem:
The ACL authorization interface involves the firewall and its QPS capacity is relatively low; ACL authorization for a large number of containers slows the service down.
Analysis and solution:
To solve the above problem, limiting and reducing the number of authorization interface calls is the effective approach. We took the following measures (a sketch of the limiting logic follows this list):
- The number of retries in the init container is limited to one.
- The authorization interface applies traffic limiting per application and per IP address; if the limit is hit more than three times, a failure is returned and no authorization operation is performed.
- Common ports involved in ACLs are whitelisted, so applications do not need to perform authorization operations for them.
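A minimal sketch of the per-application / per-IP limiting idea (the window, threshold, and key layout are illustrative assumptions):

```python
import time
from collections import defaultdict

# Illustrative in-memory limiter: at most MAX_ATTEMPTS authorization attempts per
# (app_code, ip) key within WINDOW_SECONDS; further attempts fail fast and never
# reach the authorization backend. Thresholds are hypothetical.
WINDOW_SECONDS = 300
MAX_ATTEMPTS = 3
_attempts = defaultdict(list)   # (app_code, ip) -> list of attempt timestamps

def allow_authorization(app_code: str, ip: str) -> bool:
    now = time.time()
    key = (app_code, ip)
    # Drop attempts that have fallen out of the window.
    _attempts[key] = [t for t in _attempts[key] if now - t < WINDOW_SECONDS]
    if len(_attempts[key]) >= MAX_ATTEMPTS:
        return False            # over the limit: report failure, skip authorization
    _attempts[key].append(now)
    return True
```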
How can Java applications support remote debugging in container scenarios?
Introduction to debugging in the KVM scenario:
During the development of Java applications, remote debugging is an essential capability for developers to quickly locate problems. The Debug process is as follows: the developer clicks to enable Debug on the Noah environment management platform; Noah automatically configures the Debug options for the Java application (-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=127.0.0.1:50005) and restarts the Java application; after that, the developer can configure remote debugging in the IDE and enter Debug mode.
Debug schemes for container scenarios:
Debug mode is enabled by default for Java applications in the test environment, which avoids rebuilding the Pod just to switch on Debug and reduces the operation from the minute level under KVM to the second level now. When a user wants to debug, Noah invokes the K8s exec interface to run socat for port mapping and forwarding, so the developer can connect to the Java application's Debug port through the socat proxy.
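A minimal sketch of that forwarding step using the Kubernetes Python client (the namespace, Pod name, container name, and listening port are illustrative assumptions):

```python
from kubernetes import client, config
from kubernetes.stream import stream

# Illustrative only: run socat inside the Pod so the JDWP port bound to
# 127.0.0.1:50005 becomes reachable through an externally visible port.
config.load_kube_config()
api = client.CoreV1Api()

cmd = [
    "socat",
    "TCP-LISTEN:18000,fork,reuseaddr",   # hypothetical externally visible port
    "TCP:127.0.0.1:50005",               # the JVM's JDWP debug port
]

stream(
    api.connect_get_namespaced_pod_exec,
    name="demo-app-pod",                 # hypothetical Pod name
    namespace="test",                    # hypothetical namespace
    container="app",
    command=cmd,
    stderr=True, stdin=False, stdout=True, tty=False,
)
```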
Problem:
In the container scenario, when a request hits a breakpoint, the Debug function stops working.
Analysis and solution process:
- When the Debug function failed, we received a "Liveness probe failed, kill Pod" event. According to this event, the liveness check failed, so the application container was killed and its process restarted, which broke the Debug session.
- When a request hits a breakpoint, the whole JVM hangs, and any incoming request hangs as well, including checkURL. We tested this behavior in both the KVM scenario and the container scenario, and the results were exactly the same.
- A temporary workaround is to set the breakpoint suspend policy to Suspend Thread so that checkURL is not blocked; the default option in IDEA is Suspend All. However, this is not the optimal solution, because it requires users to configure the suspend policy manually and carries a cognitive and learning cost.
- Back to the original question: why do we encounter this problem in the container scenario but not with KVM? The reason is that in the container scenario K8s provides self-healing: K8s periodically performs the liveness check, and when the number of failures reaches the specified threshold, K8s kills the container and pulls up a new one.
- The K8s liveness probe supports exec, tcpSocket, and httpGet modes by default. We currently use httpGet, which supports only one URL and cannot meet the requirements of this scenario. After group discussion, we finally decided to use the expression (checkURL == 200) || (socat process alive && Java process alive) as the application's liveness check, so that when Debug hits a breakpoint the application container is no longer killed, which solves the problem nicely; a sketch of such a probe script follows.
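A minimal sketch of such a composite exec-style probe (the checkURL path and process match patterns are illustrative assumptions):

```python
#!/usr/bin/env python3
"""Composite liveness check sketch: the container is considered healthy if checkURL
returns 200, or if both the socat and Java processes are alive (which typically
means a Debug session is holding the JVM at a breakpoint)."""
import subprocess
import sys
import urllib.request

CHECK_URL = "http://127.0.0.1:8080/health/check"   # hypothetical checkURL

def check_url_ok() -> bool:
    try:
        with urllib.request.urlopen(CHECK_URL, timeout=3) as resp:
            return resp.status == 200
    except OSError:
        return False

def process_alive(pattern: str) -> bool:
    # pgrep exits with 0 when at least one matching process exists.
    return subprocess.run(["pgrep", "-f", pattern],
                          stdout=subprocess.DEVNULL).returncode == 0

if check_url_ok() or (process_alive("socat") and process_alive("java")):
    sys.exit(0)   # healthy: liveness probe passes
sys.exit(1)       # unhealthy: let K8s restart the container
```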
The above are the problems we encountered during containerization and our solutions. When migrating from KVM to containers, it is important to take user habits and compatibility with historical functionality into account; only then can containerization proceed more smoothly.
Future plans
- Multi-cluster stability management
  - Improve the APM system and make observable data more comprehensive and more widely covered, to improve troubleshooting efficiency.
  - Implement chaos engineering to verify, discover, and eliminate stability blind spots in containerized scenarios.
- Improve resource utilization
  - Elastic scale-out and scale-in based on business metrics.
  - Intelligently adjust requests based on the application's historical data.
- Land the ServiceMesh solution
  - We have developed a mesh solution based on Istio and MOSN and our current infrastructure. It is currently in the testing phase; we believe that once implemented it will make the infrastructure more agile.