Reduce deployment risk by combining Linkerd and Flagger to automate canary releases based on service metrics.

Linkerd 2.10 (Chinese version)

  • linkerd.hacker-linner.com/

Linkerd 2.10 series

  • Linkerd V2 Service Mesh
  • Tencent Cloud K8S deployment Service Mesh — Linkerd2 & Traefik2 deployment emojivoto application
  • Learn about the basic features of Linkerd 2.10 and step into the era of Service Mesh
  • Linkerd 2.10(Step by Step) — 1. Add your service to Linkerd

Linkerd’s traffic split feature allows you to dynamically shift traffic between services. This can be used to implement low-risk deployment strategies such as blue-green deployments and canary releases.

But simply moving traffic from one version of the service to the next is just the beginning. We can combine traffic splitting with Linkerd’s automatic golden-metrics telemetry and drive traffic decisions based on what we observe. For example, we can gradually move traffic from an old deployment to a new one while continuously monitoring its success rate. If the success rate drops at any point, we can shift traffic back to the original deployment and abort the release. Ideally, our users remain happy and don’t notice a thing!
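To make the mechanism concrete, here is a minimal sketch of what a hand-written SMI TrafficSplit could look like. The names, weights, and apiVersion are illustrative; later in this guide Flagger creates and manages this resource for you.

# Illustrative only: send 90% of traffic for the "podinfo" service to a
# primary backend and 10% to a canary backend.
apiVersion: split.smi-spec.io/v1alpha1
kind: TrafficSplit
metadata:
  name: podinfo
  namespace: test
spec:
  service: podinfo            # apex service that clients address
  backends:
    - service: podinfo-primary
      weight: 900m
    - service: podinfo-canary
      weight: 100m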

In this tutorial, we walk you through how to use Linkerd in conjunction with Flagger, an incremental delivery tool that ties Linkerd’s metrics and traffic splitting into a control loop for a fully automated, metrics-aware Canary deployment.

Prerequisites

  • To use this guide, you need Linkerd and its Viz extension installed on a cluster.

If you have not already done so, follow the Linkerd Installation guide.

  • Flagger’s installation requires kubectl 1.14 or newer (a quick check for both prerequisites is shown below).
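One quick way to verify these prerequisites, assuming the linkerd CLI is already on your PATH:

# Verify the Linkerd control plane and the Viz extension are healthy
linkerd check
linkerd viz check

# Confirm the kubectl client version (Flagger needs 1.14 or newer)
kubectl version --client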

Install Flagger

Linkerd will manage the actual traffic routing, while Flagger will automate the process of creating new Kubernetes resources, watching metrics, and gradually sending users to the new version. To add Flagger to your cluster and configure it for use with Linkerd, run:

kubectl apply -k github.com/fluxcd/flagger/kustomize/linkerd
# customresourcedefinition.apiextensions.k8s.io/alertproviders.flagger.app created
# customresourcedefinition.apiextensions.k8s.io/canaries.flagger.app created
# customresourcedefinition.apiextensions.k8s.io/metrictemplates.flagger.app created
# serviceaccount/flagger created
# clusterrole.rbac.authorization.k8s.io/flagger created
# clusterrolebinding.rbac.authorization.k8s.io/flagger created
# deployment.apps/flagger created

This command adds the following (an optional check of the new CRDs is shown after this list):

  • The Canary CRD, which lets you configure how a release should be rolled out.
  • RBAC that grants Flagger permission to modify all the resources it needs, such as Deployments and Services.
  • A controller configured to interact with the Linkerd control plane.
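To double-check, the CRDs from the output above can be queried directly (optional):

# Optional sanity check: the Flagger CRDs created by the kustomization
kubectl get crd canaries.flagger.app metrictemplates.flagger.app alertproviders.flagger.app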

To watch until everything is up and running, you can use kubectl:

kubectl -n linkerd rollout status deploy/flagger
# Waiting for deployment "flagger" rollout to finish: 0 of 1 updated replicas are available...
# deployment "flagger" successfully rolled out

Set up the demo

The demo consists of three components: a load generator, a deployment, and a frontend. The deployment creates a pod that returns some information, such as its name. You can use the responses to observe the incremental rollout as Flagger orchestrates it. The load generator simply makes it easier to run the rollout, since some kind of active traffic is needed to complete the operation. The topology of these components looks like this:

To add these components to your cluster and include them in the Linkerd data plane, run:

kubectl create ns test && \
  kubectl apply -f https://run.linkerd.io/flagger.yml
# namespace/test created
# deployment.apps/load created
# configmap/frontend created
# deployment.apps/frontend created
# service/frontend created
# deployment.apps/podinfo created
# service/podinfo created

Verify that everything started successfully by running the following command:

kubectl -n test rollout status deploy podinfo
# Waiting for deployment "podinfo" rollout to finish: 0 of 1 updated replicas are available...

# deployment "podinfo" successfully rolled out

Check it out by forwarding the frontend service locally and opening http://localhost:8080 in your browser:

kubectl -n test port-forward svc/frontend 8080

I’ll also add a Traefik IngressRoute to make it easier to see a live demo.

ingress-route.yaml

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: podinfo-dashboard-route
  namespace: test
spec:
  entryPoints:
    - websecure
  tls:
    secretName: hacker-linner-cert-tls
  routes:
    - match: Host(`podinfo.hacker-linner.com`)
      kind: Rule
      services:
        - name: frontend
          port: 8080

You can go directly to podinfo.hacker-linner.com.

Traffic shifting occurs on the client side of the connection, not the server side. Any requests coming from outside the mesh will not be shifted and will always be directed to the primary backend. A Service of type LoadBalancer exhibits this behavior because the source is not part of the mesh. To shift external traffic, add your ingress controller to the mesh.
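For example, one way to do this with Traefik is to turn on Linkerd’s proxy injection for its Deployment. The namespace and Deployment name below are assumptions; adjust them for your environment:

# Assumes Traefik runs as a Deployment named "traefik" in the "traefik"
# namespace (adjust as needed). The annotation asks Linkerd to inject
# its sidecar proxy on the next rollout of the pods.
kubectl -n traefik patch deployment traefik --type merge \
  -p '{"spec":{"template":{"metadata":{"annotations":{"linkerd.io/inject":"enabled"}}}}}'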

Configure the release

Before changing anything, you need to configure how a release should be rolled out on the cluster. This configuration lives in the Canary definition. To apply it to your cluster, run:

cat <<EOF | kubectl apply -f -
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  service:
    port: 9898
  analysis:
    interval: 10s
    threshold: 5
    stepWeight: 10
    maxWeight: 100
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500
      interval: 1m
EOF

The Flagger controller is monitoring these definitions and will create new resources on the cluster. To observe this process, run:

kubectl -n test get ev --watch

A new deployment named podinfo-primary will be created with the same number of replicas that podinfo has, and the original deployment will be scaled down to zero once the new pods are ready. This keeps the Flagger-managed deployment as an implementation detail and preserves your original configuration files and workflows. When you see the following line, everything is set up:

0s          Normal    Synced                   canary/podinfo                          Initialization done! podinfo.test
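As an optional sanity check at this point, you can compare the replica counts of the two deployments:

# podinfo should now be scaled to 0 and podinfo-primary should hold the
# replicas, since Flagger has finished initializing the canary
kubectl -n test get deploy podinfo podinfo-primary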

In addition to the managed deployment, services have been created to coordinate routing traffic between the old and new versions of the application. You can view them with kubectl -n test get svc; the output should look like this:

NAME              TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
frontend          ClusterIP   10.7.251.33   <none>        8080/TCP   96m
podinfo           ClusterIP   10.7.252.86   <none>        9898/TCP   96m
podinfo-canary    ClusterIP   10.7.245.17   <none>        9898/TCP   23m
podinfo-primary   ClusterIP   10.7.249.63   <none>        9898/TCP   23m

At this point, the topology looks something like:

This guide doesn’t cover all of the features Flagger offers. Be sure to read the documentation if you are interested in combining canary releases with HPA, working with custom metrics, or doing other types of releases (such as A/B testing).

Start the rollout

Kubernetes resources have two main parts: the spec and the status. When a controller sees a new spec, it does everything it can to make the status of the current system match it. With a Deployment, any change to the pod spec causes the controller to start a rollout; by default, the Deployment controller orchestrates a rolling update.

In this example, Flagger notices that the spec of the deployment has changed and starts orchestrating canary rollout. To start this process, you can update the image to a new version by running the following command:

kubectl -n test set image deployment/podinfo \
  podinfod=quay.io/stefanprodan/podinfo:1.7.1

Any modification to the pod spec (such as updating an environment variable or an annotation) results in the same behavior as updating the image.
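As an illustration, here is one such alternative trigger; DEMO_TRIGGER is an arbitrary variable name made up for this example:

# Adding (or changing) an environment variable alters the pod spec and
# therefore starts a new canary analysis. DEMO_TRIGGER is made up.
kubectl -n test set env deployment/podinfo DEMO_TRIGGER=rollout-1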

After the update, the canary deployment (podinfo) will be scaled up. Once it is ready, Flagger begins updating the TrafficSplit CRD in steps. With stepWeight set to 10, each increment raises the weight of podinfo by 10. For each cycle, the success rate is observed, and as long as it stays above the 99% threshold, Flagger continues the rollout. To watch the whole process, run:

kubectl -n test get ev --watch

During the update, the resources and traffic will look like this at a high level:

When the update is complete, this graph will change back to the one in the previous section.

You can toggle the image tag between 1.7.1 and 1.7.0 to start a release again.
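For example, to switch back to the previous tag and kick off another automated rollout:

# Roll the image tag back to 1.7.0; Flagger treats this like any other
# pod-spec change and runs the canary analysis again
kubectl -n test set image deployment/podinfo \
  podinfod=quay.io/stefanprodan/podinfo:1.7.0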

Resource

The Canary resource is updated with the current status and progress. You can view it by running:

watch kubectl -n test get canary

Behind the scenes, Flagger splits traffic between the primary and canary backends by updating the TrafficSplit resource. To see how this configuration changes during the rollout, run:

kubectl -n test get trafficsplit podinfo -o yaml

Each increment increases the weight of podinfo-canary and decreases the weight of podinfo-primary. Once the release succeeds, the weight of podinfo-primary is reset to 100 and the underlying canary deployment (podinfo) is scaled back down.
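If you only want to follow the weights rather than the full YAML, a jsonpath query is a handy (optional) shortcut:

# Watch just the backend weights as Flagger shifts traffic
watch "kubectl -n test get trafficsplit podinfo -o jsonpath='{.spec.backends}'"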

Metrics

As traffic shifts from the primary deployment to the canary deployment, Linkerd provides visibility into what is happening at the destination of the requests. The metrics show the backends receiving traffic in real time and measure success rate, latencies, and throughput. From the CLI, you can run:

watch linkerd viz -n test stat deploy --from deploy/load

For something more visual, you can use the dashboard. Start it by running linkerd viz dashboard, then look at the detail page for the podinfo traffic split.
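For reference, the command looks like this:

# Open the Linkerd dashboard in your browser; the & keeps your terminal free
linkerd viz dashboard &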

Browser

Visit http://localhost:8080 again. Refreshing the page will show it toggling between the old and new versions, with different header colors. Alternatively, running curl http://localhost:8080 returns a JSON response similar to the following:

{
  "hostname": "podinfo-primary-74459c7db8-lbtxf"."version": "1.7.0"."revision": "4fc593f42c7cd2e7319c83f6bfd3743c05523883"."color": "blue"."message": "'t from podinfo v1.7.0"."goos": "linux"."goarch": "amd64"."runtime": "go1.11.2"."num_goroutine": "6"."num_cpu": "8"
}

This response will slowly change as the rollout continues.
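A simple way to watch that change from the command line, assuming the port-forward from earlier is still running:

# Poll the frontend once a second and print the reported version;
# assumes "kubectl -n test port-forward svc/frontend 8080" is active
while true; do
  curl -s http://localhost:8080 | grep '"version"'
  sleep 1
done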

Clean up

To clean up, remove the Flagger controller from the cluster and remove the Test namespace by running the following command:

kubectl delete -k github.com/fluxcd/flagger/kustomize/linkerd && \
  kubectl delete ns test