Some of the images in this article come from the Internet; they will be removed on request if they infringe any rights.

A bit of history

In 1895, John Scott Haldane proposed that because small warm-blooded animals breathe faster than humans, toxic gases in mines such as carbon monoxide, or asphyxiating gases such as methane, would affect them first.

For example, at the same concentration of carbon monoxide, a mouse shows the effects within a matter of minutes, while it takes about 20 times longer for a human to be affected. For this reason, mice were used as an early-warning species for toxic gases underground from around 1896.

Over time, it was found that canaries were even more sensitive to toxic gases. Records from 1900 onward show that some mines had begun using canaries as an early warning of toxic gases underground.

Later, a special cage was developed for canary gas detection so that the birds could be reused. The cage has its own oxygen supply and an air vent at the front that can be opened and closed with a small window. The vent is left open while the canary is on warning duty. If the canary in the cage is overcome by gas, the window is closed and the cage is filled with oxygen. Unless the poisoning is too severe, the canary may revive.

With the continuous development of science and technology, toxic gas detectors were invented, and this method of detection by living creatures began to fade away. It was not until 1986 that Britain and the United States stopped using canaries as warning creatures altogether.

The goal and logic of this deployment approach are very similar to using a canary for early warning: in both cases, a switch (the vent in the cage, or the traffic routing in a deployment) is used to limit the hazard and to recover from it. I guess people also wanted to commemorate the golden birds that guarded the lives of miners in the 20th century, so the approach was named after the canary.

Now that we know where the name came from, let’s look at what canary deployment really is.

Basic definition

Canary deployment means rolling a change out to a small number of users first, before deploying it to the entire service cluster and making it available to everyone. During this trial, every dimension of the service under test is continuously observed to verify the robustness, availability, and stability of the new version.

When the verification results meet the desired goals, you can gradually deploy the new version to more servers and make it available to more users.
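The gradual widening described above can be sketched as a percentage-based split. This is only a minimal illustration, not a production router: the function names, the user ID format, and the rollout schedule are all assumptions made for this example. Hashing the user ID keeps each user in the same bucket, so a user who already sees the new version keeps seeing it as the percentage grows.

```python
import hashlib

def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100

def serves_canary(user_id: str, canary_percent: int) -> bool:
    """Return True if this user should be routed to the new version."""
    return bucket(user_id) < canary_percent

# Hypothetical rollout schedule: widen the audience step by step.
ROLLOUT_STEPS = [1, 5, 25, 50, 100]

if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]
    for percent in ROLLOUT_STEPS:
        count = sum(serves_canary(u, percent) for u in users)
        print(f"{percent:>3}% target -> {count} of {len(users)} users on the canary")
```

Because the bucket depends only on the user ID, raising the percentage only ever adds users to the canary group; nobody is moved back to the old version unless the percentage is lowered.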

Advantages

  • Near-zero downtime with fast rollback:
    • If verification and testing show that a new version is unsuitable, it is easy to roll back, and the scope of impact stays under control.
  • Tests in real scenarios:
    • Because new versions are deployed directly to production, they can be verified against real traffic. Of course, you need to limit the traffic and users involved to control the scope and impact of the verification.
  • Lower infrastructure costs:
    • Because the canary deployment strategy routes or splits traffic based on rules (e.g., user name, region, age, randomness, etc.), verification can be achieved with a small amount of additional infrastructure. A blue-green deployment strategy, by comparison, requires a second set of infrastructure identical to the production environment, so its deployment cost is significantly higher.
  • Flexible, on-demand verification of versions and features:
    • Request traffic can be routed and split along multiple dimensions according to different features and identifiers, which allows flexible verification at different granularities.
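The rule-based routing mentioned in the list above could look roughly like the following sketch. The `Request` fields, the rule helpers, and the version labels `"canary"` and `"stable"` are hypothetical names chosen for illustration; a real setup would usually express the same rules in a gateway or service mesh rather than in application code.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Set

@dataclass
class Request:
    user_name: str
    region: str

# A rule looks at a request and says whether it belongs to the canary group.
Rule = Callable[[Request], bool]

def by_user(names: Set[str]) -> Rule:
    """Match requests from an explicit list of users (e.g., internal testers)."""
    return lambda req: req.user_name in names

def by_region(regions: Set[str]) -> Rule:
    """Match requests coming from selected regions."""
    return lambda req: req.region in regions

def by_random(fraction: float, rng: random.Random) -> Rule:
    """Match a random fraction of requests, regardless of who sent them."""
    return lambda req: rng.random() < fraction

def choose_version(req: Request, rules: List[Rule]) -> str:
    """Route to the canary if any rule matches, otherwise to the stable version."""
    return "canary" if any(rule(req) for rule in rules) else "stable"

if __name__ == "__main__":
    rules = [by_user({"alice"}), by_region({"eu-west"})]
    print(choose_version(Request("alice", "us-east"), rules))
    print(choose_version(Request("bob", "us-east"), rules))
```

Keeping each dimension as a separate rule makes it easy to combine or remove dimensions without touching the routing function itself.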

Not a silver bullet

Canary deployment can provide strong support for your deployments, but there are no silver bullets in software engineering. It calls for caution in many scenarios:

  • Systems with zero tolerance for errors, such as medical systems and fire-safety systems.
  • Cases where the data structure to be deployed is not backward compatible with the current data structure.
    • Alibaba Cloud MDS can route traffic to a shadow database, but when regular users are involved it still affects real data and the user experience. It is therefore suitable only for testers or internal users.
  • Non-automated canary deployment is both time-consuming and error-prone. We should therefore automate it whenever possible, rather than maintaining traffic-splitting policies and logic by hand.
  • Many other scenarios with strict requirements on the production environment, where canary deployment is likewise not recommended.

Next, let’s look at a few discussions that are not about implementation details.

Canary Release or Canary Deployment?

In everyday communication, people tend to use "release" and "deployment" interchangeably. It is common to see terms such as blue-green deployment, grayscale release (a term often preferred over "canary"), and rolling release listed together, as if they were all just different ways of releasing.

Let’s start with the dictionary.

[Dictionary entries for "release" and "deployment". The entry for "deployment" (disposition, arrangement) gives the example sentence: "The disposition of the artillery is marked out on this map."]

Based on the two dictionary definitions above, I understand the terms as follows:

In a broad sense, a release is the process of making thoughts, opinions, or articles public through newspapers, books, the Internet, or public speeches. In the context of computing, it means putting a particular piece of software in a place that everyone can access, so that people can update or synchronize it themselves. For example: "I packaged a version of the software and released it."

Deployment broadly refers to the arrangement or disposition of people, plans, tasks, and so on. In the context of computing, it means installing or updating specific software in the corresponding environment so that it can serve users. For example: "I have deployed the new version to the test environment; please test it."

Based on the above explanation, I think it is more accurate to use Canary Deployment.

Relationship to A/B testing

When I first learned about canary deployment, many sources confused A/B testing with canary deployment, or even treated the two as the same thing. On closer inspection, the two are related, but not to the point where you can put an equals sign between them.

What they have in common

Their traffic-handling logic is very similar: both decide which version serves a request based on the characteristics of the traffic.

Canary deployment can serve as part of the technical foundation for implementing A/B testing, but don’t confuse the two.

The differences

The purposes of the two are quite different: canary deployment is used to detect problems and regressions, while A/B testing is a method for testing business design hypotheses. Because their goals differ, the ways they test and observe differ as well.

Canary deployments typically rely on technology-oriented observation tools (APM, log monitoring, etc.). Engineers observe the new version of the service being deployed. When the observations are in line with expectations, they proceed to the next step. Otherwise, they fix the problem first, then deploy and observe again until expectations are met.
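As a rough sketch of what "in line with expectations" might mean in practice, the check below compares metrics from the canary against the stable version. The `Metrics` fields and all threshold values are assumptions for this example; in a real system the numbers would come from your APM or log monitoring tool, and the thresholds would be tuned per service.

```python
from dataclasses import dataclass

@dataclass
class Metrics:
    requests: int
    errors: int
    p99_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0

def canary_is_healthy(stable: Metrics, canary: Metrics,
                      max_error_rate_increase: float = 0.01,
                      max_latency_ratio: float = 1.2,
                      min_requests: int = 100) -> bool:
    """Compare the canary against the stable version using illustrative thresholds."""
    if canary.requests < min_requests:
        return False  # not enough traffic observed yet to judge
    if canary.error_rate > stable.error_rate + max_error_rate_increase:
        return False  # the canary fails noticeably more often
    if canary.p99_latency_ms > stable.p99_latency_ms * max_latency_ratio:
        return False  # the canary is noticeably slower
    return True

if __name__ == "__main__":
    stable = Metrics(requests=10_000, errors=50, p99_latency_ms=200.0)
    canary = Metrics(requests=500, errors=3, p99_latency_ms=210.0)
    print("proceed" if canary_is_healthy(stable, canary) else "hold and investigate")
```

Comparing against the stable version rather than against fixed absolute limits means that a bad hour for the whole system does not automatically fail the canary.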

If we looked at the business side in the same way and with the same tools, we would not get accurate results. Even the people doing the observing have different roles. In general, an A/B test requires instrumentation (tracking points) to be added in advance to collect the observation data. The collected business data is then organized and aggregated, and the resulting analysis and statistics help business staff evaluate the new business design.

Finally, from a time perspective, it may take days for a business person to collect enough data for an A/B test to reach a significant result, while an engineer wants a canary deployment to be completed in minutes or hours.
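To illustrate why an A/B test needs so much data, here is a small sketch of one common significance check, the two-proportion z-test. The article does not say which statistical method is used, so the choice of test, the function name, and the sample numbers are all assumptions made for this example.

```python
import math

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates.

    conv_a / conv_b are the number of conversions in each group and
    n_a / n_b are the group sizes (both assumed to be positive).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    # Two-sided tail of the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

if __name__ == "__main__":
    # Same observed rates (10% vs 20%), very different sample sizes.
    print(two_proportion_p_value(1, 10, 2, 10))
    print(two_proportion_p_value(100, 1000, 200, 1000))
```

With ten users per group, a 10% versus 20% conversion rate is far from significant; with a thousand users per group, the same rates are. Collecting that many business events is what takes days.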

After so much discussion unrelated to implementation, let’s look at some of the ways to implement canary deployment.

Implementation

Implementation structure

The key points of canary deployment are as follows:

  • Network traffic splitting
  • Traffic-splitting policy management
  • Propagation of the splitting policy across multiple applications
  • Data compatibility handling

The key points above can be illustrated by the following figure:
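As a rough illustration of the third key point, propagating the routing decision across applications, one common approach is to tag a request once at the edge and carry that tag along on every downstream call. The sketch below assumes a hypothetical header name `x-canary` and hypothetical backend labels; it is not taken from any particular gateway or mesh.

```python
from typing import Dict, Optional

CANARY_HEADER = "x-canary"  # hypothetical header name used for this example

def outgoing_headers(incoming: Dict[str, str],
                     extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build headers for a downstream call, carrying over the routing decision."""
    headers = dict(extra or {})
    if CANARY_HEADER in incoming:
        headers[CANARY_HEADER] = incoming[CANARY_HEADER]
    return headers

def pick_backend(headers: Dict[str, str], backends: Dict[str, str]) -> str:
    """Choose the canary backend if the request is tagged and a canary exists."""
    if headers.get(CANARY_HEADER) == "true" and "canary" in backends:
        return backends["canary"]
    return backends["stable"]

if __name__ == "__main__":
    incoming = {CANARY_HEADER: "true", "accept": "application/json"}
    downstream = outgoing_headers(incoming, {"content-type": "application/json"})
    backends = {"stable": "orders-v1", "canary": "orders-v2"}
    print(pick_backend(downstream, backends))
```

Falling back to the stable backend when a service has no canary version lets a tagged request pass through services that are not part of the current rollout.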

Let’s start with the process and see what the execution of a Canary deployment looks like.

The specific process

Starting with the process, it is easy to see what characteristics canary deployment presents, what capabilities it provides, and so on, at different stages.

  • Normal phase: Only verified versions (1 to N of them) exist in the current environment, and they serve all users.

  • Verification phase: There are 2 to N versions in the current environment. One version is already verified and provides stable service to most users, while the other versions (1 to N-1 of them) are not yet proven and serve only certain users (randomly selected in some scenarios). Meanwhile, the health of the new version is verified continuously to decide whether it can be deployed further or needs to be fixed.

  • Deployment phase: After a version passes verification, it is deployed to the planned portion of the service cluster. If further testing is required, the process returns to the verification phase, and this repeats until the planned deployment ratio has been reached and verified. A rolling deployment approach is often used here.
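The alternation between the deployment and verification phases can be sketched as a simple loop. The step percentages, the log messages, and the `verify` callback are hypothetical; in practice `verify` would query monitoring data, and the deploy step would call your platform's API.

```python
from typing import Callable, List

def run_rollout(steps: List[int], verify: Callable[[int], bool]) -> List[str]:
    """Walk through the rollout steps, verifying after each one.

    Returns a log of what happened. Rolls back to the old version as soon
    as a verification fails.
    """
    log = ["normal: 0% on new version"]
    for percent in steps:
        log.append(f"deploy: {percent}% on new version")
        if not verify(percent):
            log.append("rollback: 0% on new version")
            return log
        log.append(f"verified at {percent}%")
    log.append("normal: new version is the stable version")
    return log

if __name__ == "__main__":
    # Simulate a new version that starts failing once it serves half the traffic.
    for line in run_rollout([10, 50, 100], verify=lambda percent: percent < 50):
        print(line)
```

The loop ends in one of two states: every step has been verified and the new version becomes the stable one, or a step fails and all traffic returns to the old version.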

Implementation approaches

This section mainly covers server-side implementations.

I divide them into the following types based on their implementation:

  • Infrastructure level (IaaS): for example, through the Alibaba Cloud MDS tool
  • Platform level (PaaS): through the K8S Ingress component or Istio
  • Middleware such as Nginx: controlling traffic-forwarding rules directly in Nginx through scripts
  • ...

Because of this variety, the implementation details of a canary deployment are tightly coupled to the application scenario, so specific implementation methods are not covered here. When you use canary deployment, pay attention to the characteristics and caveats mentioned in this article (see "Not a silver bullet"), as well as the execution process and structure described above. I have only scratched the surface here, and I hope you will have more interesting ideas and thoughts to share about canary deployment.

Closing

One day in the 20th century, in a dark, narrow mine: cage in hand, miners wearing headlamps struggle to crawl forward, sweat mixed with black dust running down their faces. The beautiful bird struggles to keep its balance in the swaying cage, not knowing whether sunlight or choking fumes await it.

Acknowledgements

Thanks to my colleagues and friends for their suggestions and ideas while I was writing this article.