Wechat official account: Operation and maintenance development story, author: Jock

In our daily work we frequently release and upgrade applications, especially at Internet companies, where the business evolves quickly. The release strategies we use most often are rolling updates, blue-green deployment, and grayscale release.

  • Rolling update: replace old instances with new ones, batch by batch, until every old instance has been replaced.

  • Blue-green release: two independent environments. The one currently serving users is called the green system; the one carrying the version about to go live is called the blue system. Once testing on the blue system is complete, user traffic is switched to it, the blue system becomes the new green system, and the previous green system can be destroyed.

  • Grayscale release: one cluster runs two versions, stable and grayscale. The grayscale version is exposed only to a subset of users. Once it has been validated, it is promoted to the stable version and the old stable version is taken offline. This is also called a canary release.

In this article, we use the Ingress-nginx Controller to implement grayscale release.


How to implement grayscale release with Ingress-nginx

Ingress-nginx is the Ingress controller officially recommended by Kubernetes. It is implemented on top of Nginx, with a set of Lua plugins that add extra functionality.

To implement grayscale release, Ingress-nginx defines annotations covering different scenarios. The following rules are supported:

  • nginx.ingress.kubernetes.io/canary-by-header: splits traffic based on a request header; suitable for grayscale release and A/B testing. When the request header is set to always, every request is sent to the canary version; when it is set to never, no request is sent to the canary; for any other value, the header is ignored and the request is evaluated against the remaining canary rules in priority order.

  • nginx.ingress.kubernetes.io/canary-by-header-value: the request-header value to match, which tells the ingress to route the request to the service specified in the canary ingress. When the request header equals this value, the request is routed to the canary entry. This rule lets the user customize the header value and must be used together with the previous annotation (canary-by-header).

  • nginx.ingress.kubernetes.io/canary-weight: splits traffic based on a percentage weight; suitable for blue-green deployment. A weight in the range 0-100 routes that percentage of requests to the service specified in the canary ingress. A weight of 0 means the canary rule sends no requests to the canary service; a weight of 100 means every request is sent to it.

  • nginx.ingress.kubernetes.io/canary-by-cookie: splits traffic based on a cookie; suitable for grayscale release and A/B testing. The cookie tells the ingress to route the request to the service specified in the canary ingress. When the cookie is set to always, the request is routed to the canary; when it is set to never, it is not; for any other value, the cookie is ignored and the request is evaluated against the remaining canary rules in priority order.
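
The priority among these rules (header first, then cookie, then weight) can be sketched as a small Go function. This is an illustrative simplification of ingress-nginx's behavior, not the controller's actual code; the names `Decide`, `cfgHeaderValue`, and `weightHit` are made up for the example:

```go
package main

import "fmt"

// Decide sketches the canary rule priority: canary-by-header is checked
// first, then canary-by-cookie, then canary-weight. headerValue and
// cookieValue are what the client sent; cfgHeaderValue is the optional
// canary-by-header-value setting; weightHit stands for "this request fell
// into the canary-weight percentage".
func Decide(headerValue, cfgHeaderValue, cookieValue string, weightHit bool) string {
	if cfgHeaderValue != "" {
		// canary-by-header-value: only an exact match routes to the canary.
		if headerValue == cfgHeaderValue {
			return "canary"
		}
	} else {
		// Plain canary-by-header: "always"/"never" decide; any other
		// value falls through to the next rule.
		switch headerValue {
		case "always":
			return "canary"
		case "never":
			return "stable"
		}
	}
	switch cookieValue { // canary-by-cookie behaves like canary-by-header
	case "always":
		return "canary"
	case "never":
		return "stable"
	}
	if weightHit { // finally, canary-weight
		return "canary"
	}
	return "stable"
}

func main() {
	fmt.Println(Decide("always", "", "", false))         // canary
	fmt.Println(Decide("other", "", "never", true))      // stable
	fmt.Println(Decide("sichuan", "sichuan", "", false)) // canary
}
```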

We implement grayscale release using the annotations above. The idea is as follows:

  1. Deploy two versions of the application in one cluster, one stable and one canary, each with its own Service

  2. Define two ingress configurations: one that serves traffic normally, and one that carries the canary annotations

  3. Once the canary version is verified, promote it to the stable version, take the old version offline, and direct all traffic to the new stable version

Introduction to Release Scenarios

Having introduced ingress-nginx's grayscale release mechanism and our implementation idea, let's discuss which scenarios grayscale release fits.

Weighted publishing scenarios

Suppose version A of an application is already serving users, and after a round of bug fixes version A2 needs to go live. We don't want to switch all traffic to A2 at once; instead we want 10% of the traffic to reach A2 first, and only after A2 proves stable do we move all traffic over and retire the original version A.


To do this, just add the following annotations to the ingress of the canary.

nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-weight: "10"


nginx.ingress.kubernetes.io/canary enables the canary rules, and nginx.ingress.kubernetes.io/canary-weight sets the weight.

Publishing scenarios based on user requests

The weighted scenario is coarse-grained: it diverts a fixed percentage of all users and cannot target specific ones.

Sometimes the requirement is more targeted. For example, users in Guangdong, Beijing, and Sichuan are all served by version A of an application. A new version A2 needs to go live, but we don't want users from every region to reach A2; we want only Sichuan users to access it, and once feedback from Sichuan shows no problems, we open it to the other regions.


For this we need to add the following annotations to the ingress of the canary.

nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-by-header: "region"
nginx.ingress.kubernetes.io/canary-by-header-value: "sichuan"


The two release scenarios above are the main ones; each will be tested in the following sections.

Concrete implementation of grayscale release

I have prepared two images here: one is the stable version and the other is the grayscale canary version.

  • registry.cn-hangzhou.aliyuncs.com/rookieops/go-test:v1

  • registry.cn-hangzhou.aliyuncs.com/rookieops/go-test:v2

Since the two scenarios differ only in their ingress configuration and everything else is the same, we first deploy both versions of the application.

(1) Stable version

apiVersion: apps/v1 
kind: Deployment
metadata:
  name: app-server-stable
spec:
  selector:
    matchLabels:
      app: go-test
      version: stable
  replicas: 1
  template:
    metadata:
      labels:
        app: go-test
        version: stable
    spec:
      containers:
      - name: app-server
        image: registry.cn-hangzhou.aliyuncs.com/rookieops/go-test:v1
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: app-server-stable-svc
spec:
  selector:
    app: go-test
    version: stable
  ports:
  - name: http
    port: 8080


The access effect is as follows:

# curl 10.97.112.137:8080
{"data":"hello world","version":"v1"}


(2) Canary version

apiVersion: apps/v1 
kind: Deployment
metadata:
  name: app-server-canary
spec:
  selector:
    matchLabels:
      app: go-test
      version: canary
  replicas: 1
  template:
    metadata:
      labels:
        app: go-test
        version: canary
    spec:
      containers:
      - name: app-server
        image: registry.cn-hangzhou.aliyuncs.com/rookieops/go-test:v2
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: app-server-canary-svc
spec:
  selector:
    app: go-test
    version: canary
  ports:
  - name: http
    port: 8080


The access effect is as follows:

# curl 10.110.178.174:8080
{"data":"hello SB","version":"v2"}

With both applications deployed, we can now test the weight-based and header-based scenarios.

Weighted publishing scenarios

(1) Configure the ingress of stable

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: app-server-stable-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: joker.coolops.cn
    http:
      paths:
      - path:
        backend:
          serviceName: app-server-stable-svc
          servicePort: 8080


(2) Configure the Canary ingress

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: app-server-canary-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  rules:
  - host: joker.coolops.cn
    http:
      paths:
      - path:
        backend:
          serviceName: app-server-canary-svc
          servicePort: 8080


Then we run an access test, which looks like this:

# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello SB","version":"v2"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}

The ratio is roughly 9:1, in line with the 10% weight.

Publishing scenarios based on user requests

(1) Configure the ingress of stable

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: app-server-stable-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: joker.coolops.cn
    http:
      paths:
      - path:
        backend:
          serviceName: app-server-stable-svc
          servicePort: 8080


(2) Configure the Canary ingress

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: app-server-canary-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "region"
    nginx.ingress.kubernetes.io/canary-by-header-value: "sichuan"
spec:
  rules:
  - host: joker.coolops.cn
    http:
      paths:
      - path:
        backend:
          serviceName: app-server-canary-svc
          servicePort: 8080


When we access without the header, only the stable version of the application is reached, as follows:

# curl joker.coolops.cn
{"data":"hello world","version":"v1"}
# curl joker.coolops.cn
{"data":"hello world","version":"v1"}


If we access with the region: sichuan header, only the canary version of the application is reached, as follows:

# curl joker.coolops.cn -H "region: sichuan"
{"data":"hello SB","version":"v2"}
# curl joker.coolops.cn -H "region: sichuan"
{"data":"hello SB","version":"v2"}


Is the implementation simple?

Now let's consider another question: all of the operations above were manual. How should we automate them? How should the pipeline be designed?

Here’s what I personally think.

Ideas about grayscale release pipeline design

Let's first walk through the steps:

  1. Release the Canary version of the application for testing

  2. After testing completes, promote the canary version to the stable version

  3. Delete the ingress configuration for the Canary version

  4. Delete the old stable version

The whole process is simple, but Kubernetes does not allow changing the labels of an already-deployed Deployment. So can we instead update the stable version's image once the canary version has passed testing? Of course; in that case the switch is just a rolling update.

Then our assembly line can be designed as follows:

One problem with this design is that the waiting time cannot be predicted. If the wait is too long, it not only ties up resources but may also trigger an automatic timeout.

Can we split it into two pipelines instead? The process is as follows:

I prefer the second option: each pipeline runs to completion and exits without holding extra resources.

Before building the pipeline, we need to define a naming convention to make operations easier.

  1. The pipeline name format is as follows:

     • <APP_NAME>-stable

     • <APP_NAME>-canary

  2. The Deployment name format is as follows:

     • <APP_NAME>-stable

     • <APP_NAME>-canary

  3. The Service name format is as follows:

     • <APP_NAME>-stable-svc

     • <APP_NAME>-canary-svc

  4. The ingress name format is as follows:

     • <APP_NAME>-stable-ingress

     • <APP_NAME>-canary-ingress

Once the standard is defined, it is much easier to implement.
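
As a quick sanity check, the convention can be captured in a small Go helper. `NamesFor` is a made-up function for illustration, not part of the actual pipeline code:

```go
package main

import "fmt"

// Names holds the resource names derived for one application and track.
type Names struct {
	Pipeline, Deployment, Service, Ingress string
}

// NamesFor applies the naming convention above; track is "stable" or "canary".
func NamesFor(app, track string) Names {
	base := fmt.Sprintf("%s-%s", app, track)
	return Names{
		Pipeline:   base,
		Deployment: base,
		Service:    base + "-svc",
		Ingress:    base + "-ingress",
	}
}

func main() {
	fmt.Printf("%+v\n", NamesFor("app-server", "canary"))
	// {Pipeline:app-server-canary Deployment:app-server-canary Service:app-server-canary-svc Ingress:app-server-canary-ingress}
}
```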

Code location: gitee.com/coolops/gar…

I define two Jenkinsfiles, one called canary.Jenkinsfile and one called stable.Jenkinsfile, used to deploy the canary and stable versions respectively.

Then we create two pipelines. Among them, joker-gary-devops-canary is used to deploy the canary version and the other is used to deploy the stable version.

Stable versions are now running in the cluster as follows:

# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}

Then a requirement change comes in and we modify the code as follows:

package main

import (
    "net/http"

    "github.com/gin-gonic/gin"
)

func main() {
    g := gin.Default()
    g.GET("/", func(c *gin.Context) {
        c.JSON(http.StatusOK, gin.H{
            "version": "v1",
            "data":    "hello Joker!",
        })
    })
    _ = g.Run(":8080")
}

The canary pipeline runs first; once it completes, the canary version's pod and ingress appear in the cluster, as follows:

# kubectl get po | grep canary
gray-devops-canary-59c88846dc-j2vlc   1/1   Running   0   55s
# kubectl get svc | grep canary
gray-devops-canary-svc   ClusterIP   10.233.18.235   <none>   8080/TCP   3h14m
# kubectl get ingress | grep canary
gray-devops-canary-ingress   joker.coolops.cn   192.168.100.61   80   63s

Check the content of the canary ingress to confirm it is what we need, as follows:

# kubectl get ingress gray-devops-canary-ingress -o yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
  creationTimestamp: "2022-02-15T05:43:32Z"
  generation: 1
  name: gray-devops-canary-ingress
  namespace: default
  resourceVersion: "412247041"
  selfLink: /apis/extensions/v1beta1/namespaces/default/ingresses/gray-devops-canary-ingress
  uid: fe13b38d-1f6f-45fb-8d89-504b4b8288ea
spec:
  rules:
  - host: joker.coolops.cn
    http:
      paths:
      - backend:
          serviceName: gray-devops-canary-svc
          servicePort: 8080
status:
  loadBalancer:
    ingress:
    - ip: 192.168.100.61

You can see it’s exactly what we planned.

Access tests are also fine, as follows:

# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello Joker!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello Joker!","version":"v1"}
# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello world!","version":"v1"}

Now we can release the stable version by running the stable pipeline. After it finishes, only the stable version of the application remains in the cluster, as follows:

# kubectl get po | grep gray
gray-devops-stable-7f977bb6cf-8jzgt   1/1   Running   0   35s
# kubectl get ingress | grep gray
gray-devops-stable-ingress   joker.coolops.cn   192.168.100.61   80   111m

Access by domain name is also expected.

# curl -H "Host: joker.coolops.cn" http://192.168.100.61
{"data":"hello Joker!","version":"v1"}

So far I have basically realized my idea.

Note: the usernames and passwords referenced in the Jenkinsfiles are stored in Jenkins credentials, and the Kubernetes deploy plugin needs to be installed; search for it in the plugin center.

Final thoughts

Above we implemented the basic grayscale release process, though all we did was turn manual steps into automated ones. Do you see any problems?

First, we have to switch between two pipelines to release. Second, release control is not very flexible; for example, we have to manually add canary-version instances.

In fact, I prefer to use Argo Rollouts together with Argo CD for grayscale release. Argo Rollouts provides a dedicated CRD for controlling the release process, which saves a lot of manual work. Argo CD is a GitOps-based tool that makes continuous delivery easy to manage, and it also offers a UI panel. However, this approach requires changing the existing release process and application templates; the changes are not complicated, but they carry some risk and need a degree of testing.


I am Jock, a member of the "Operation and Maintenance Development Story" official-account team, a front-line ops engineer and a cloud-native practitioner. Here you will find not only hard-core technical content but also our thoughts and feelings about technology.