In this guide, you will get hands-on experience with progressive delivery on Kubernetes, using Istio and GitOps.
Introduction
This walkthrough is built around the gitops-istio example repository.
What is GitOps?
GitOps is a way of doing continuous delivery that uses Git as the single source of truth for declarative infrastructure and workloads. For Kubernetes, this means using git push instead of kubectl apply/delete or helm install/upgrade.
In this workshop, you will use GitHub to host the configuration repository and Flux as the GitOps delivery solution.
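As a concrete illustration of the workflow, a change reaches the cluster through a commit instead of a direct kubectl or helm call; the manifest path below is only a placeholder:
# edit a manifest in the config repository, then publish the change
git add apps/backend/deployment.yaml
git commit -m "scale backend to 3 replicas"
git push origin main
# Flux notices the new commit and applies it to the cluster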
What is progressive delivery?
Progressive delivery is an umbrella term for advanced deployment patterns such as canary releases, feature flags, and A/B testing. Progressive delivery reduces the risk of introducing a new software version in production by giving application developers and SRE teams fine-grained control over the blast radius.
In this workshop, you will use Flagger and Prometheus to automate canary releases and A/B testing for your applications.
Prerequisites
You will need a Kubernetes cluster v1.16 or newer with LoadBalancer support. For testing purposes, you can use Minikube with 2 CPUs and 4GB of memory.
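If you go the Minikube route, something along these lines should satisfy the sizing above; run minikube tunnel in a second terminal so LoadBalancer services get an address:
minikube start --cpus 2 --memory 4096
# in a separate terminal, provide LoadBalancer support
minikube tunnel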
Install the flux CLI with Homebrew:
brew install fluxcd/tap/flux
Binaries for macOS AMD64/ARM64, Linux AMD64/ARM, and Windows are available for download on the flux2 release page.
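As an alternative to downloading binaries manually, the Flux project also publishes an install script; review any script before piping it to a shell:
curl -s https://fluxcd.io/install.sh | sudo bash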
Verify that your cluster meets the prerequisites:
flux check --pre
Install jq and yq with Homebrew:
brew install jq yq
Fork the repository and clone it:
git clone https://github.com/<YOUR-USERNAME>/gitops-istio
cd gitops-istio
Cluster bootstrap
Using the flux bootstrap command, you can install Flux on a Kubernetes cluster and configure it to manage itself from a Git repository. If the Flux components are already present on the cluster, the bootstrap command will perform an upgrade if needed.
Bootstrap Flux by specifying the URL of your repository fork:
flux bootstrap git \
--author-email=<YOUR-EMAIL> \
--url=ssh://[email protected]/<YOUR-USERNAME>/gitops-istio \
--branch=main \
--path=clusters/my-cluster
The command above requires ssh-agent; if you are using Windows, see the Flux bootstrap GitHub documentation.
At bootstrap, Flux generates an SSH key and prints the public key. To sync your cluster state with Git, you need to copy the public key and create a deploy key with write access on your GitHub repository. On GitHub, go to Settings > Deploy keys, click Add deploy key, check Allow write access, paste the Flux public key, and click Add key.
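If you missed the key in the bootstrap output, it should also be recoverable from the secret Flux created, assuming the default flux-system namespace and secret name:
# print the SSH public key stored by flux bootstrap (default naming assumed)
kubectl -n flux-system get secret flux-system -o jsonpath='{.data.identity\.pub}' | base64 -d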
When Flux accesses your repository, it will:
- install the Istio operator
- wait for the Istio control plane to be ready
- install Flagger, Prometheus, and Grafana
- create the Istio public gateway
- create the prod namespace
- create the load tester deployment
- create the frontend deployment and canary
- create the backend deployment and canary
When bootstrapping a cluster with Istio, it is important to control the apply order. For the application pods to get the Istio sidecar injected, the Istio control plane must be up and running before the applications are deployed.
With Flux v2 you can specify the order of execution by defining dependencies between objects. For example, in clusters/my-cluster/apps.yaml we tell Flux that the reconciliation of apps depends on istio-system:
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 30m0s
  dependsOn:
    - name: istio-system
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps
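For context, the istio-system Kustomization that apps depends on would be defined alongside it; a minimal sketch, assuming the Istio manifests live under ./istio/system and the operator runs in the istio-operator namespace, might look like this (the health check target is illustrative, not copied from the repository):
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: istio-system
  namespace: flux-system
spec:
  interval: 30m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./istio/system
  # apps will not reconcile until this health check passes (names are illustrative)
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: istio-operator
      namespace: istio-operator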
First watch Flux install Istio, then the demo apps:
watch flux get kustomizations
You can trace the Flux reconciliation logs with the following command:
flux logs --all-namespaces --follow --tail=10
Istio customization and upgrades
You can customize the Istio installation with the IstioOperator resource in istio/system/profile.yaml:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-default
  namespace: istio-system
spec:
  profile: demo
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: 10m
            memory: 100Mi
After modifying the Istio settings, you can push the changes to Git and Flux will apply them on the cluster. The Istio operator will reconfigure the Istio control plane according to your changes.
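For example, a tweak to the pilot resource requests goes through the same Git workflow as any other change in this guide:
# edit the IstioOperator profile, then commit and push
vim istio/system/profile.yaml
git add istio/system/profile.yaml && \
git commit -m "Bump pilot resource requests" && \
git push origin main
# optionally trigger the sync immediately
flux reconcile source git flux-system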
When a new Istio version is available, the update-istio GitHub Actions workflow opens a pull request containing the manifest updates needed to upgrade the Istio operator. The new Istio version is tested on Kubernetes Kind by the e2e workflow, and when the PR is merged into the main branch, Flux upgrades Istio on the cluster.
Application bootstrap
When Flux syncs the Git repository with your cluster, it creates the frontend/backend deployments, HPAs, and canary objects. Flagger uses the canary definitions to create a series of objects: Kubernetes deployments, ClusterIP services, and Istio destination rules and virtual services. These objects expose the applications on the mesh and drive the canary analysis and promotion.
# applied by Flux
deployment.apps/frontend
horizontalpodautoscaler.autoscaling/frontend
canary.flagger.app/frontend
# generated by Flagger
deployment.apps/frontend-primary
horizontalpodautoscaler.autoscaling/frontend-primary
service/frontend
service/frontend-canary
service/frontend-primary
destinationrule.networking.istio.io/frontend-canary
destinationrule.networking.istio.io/frontend-primary
virtualservice.networking.istio.io/frontend
Check that Flagger has successfully initialized the canaries:
kubectl -n prod get canaries
NAME STATUS WEIGHT
backend Initialized 0
frontend Initialized 0
When the frontend-primary deployment comes online, Flagger routes all traffic to the primary pods and scales the frontend deployment to zero.
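You can confirm this by checking the replica counts of the two deployments; frontend should sit at zero while frontend-primary serves the traffic:
kubectl -n prod get deploy frontend frontend-primary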
Use the following command to find the address of the ingress gateway:
kubectl -n istio-system get svc istio-ingressgateway -ojson | jq .status.loadBalancer.ingress
Open your browser and navigate to the ingress address; you will see the frontend UI.
Canary releases
Flagger implements a control loop that gradually shifts traffic to the canary while measuring key performance indicators such as HTTP request success rate, request average duration, and pod health. Based on the analysis of the KPIs, the canary is promoted or aborted, and the analysis result is published to Slack.
Canary analysis is triggered by changes to any of the following objects:
- Deployment PodSpec (container image, command, ports, environment variables, etc.)
- ConfigMaps and Secrets mounted as volumes or mapped to environment variables
For workloads that are not receiving constant traffic, Flagger can be configured with a webhook that, when called, starts a load test for the target workload. The canary configuration can be found in apps/backend/canary.yaml.
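The exact definition lives in that file; as a rough sketch of what a Flagger canary with a load-test webhook looks like (names, the port, and the loadtester URL below are illustrative assumptions, not copied from the repository):
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: backend
  namespace: prod
spec:
  # the deployment this canary manages
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  service:
    port: 9898
  analysis:
    interval: 30s
    threshold: 10
    maxWeight: 50
    stepWeight: 5
    webhooks:
      # generate traffic during the analysis so Istio telemetry has data
      - name: load-test
        url: http://flagger-loadtester.prod/
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://backend-canary.prod:9898/"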
Pull changes from GitHub:
git pull origin main
To trigger a canary deployment for the backend app, bump the container image tag:
yq e '.images[0].newTag="5.0.1"' -i ./apps/backend/kustomization.yaml
Commit and push changes:
git add -A && \
git commit -m "Backend 5.0.1." " && \
git push origin main
Tell Flux to pull the change or wait a minute for Flux to detect the change:
flux reconcile source git flux-system
Watch Flux reconcile your cluster with the latest commit:
watch flux get kustomizations
A few seconds later, Flagger detects that the deployment revision has changed and starts a new rollout:
$ kubectl -n prod describe canary backend
Events:
New revision detected! Scaling up backend.prod
Starting canary analysis for backend.prod
Pre-rollout check conformance-test passed
Advance backend.prod canary weight 5
...
Advance backend.prod canary weight 50
Copying backend.prod template spec to backend-primary.prod
Promotion completed! Scaling down backend.prod
During the analysis, you can monitor the canary's progress in Grafana. Access the dashboard through port forwarding:
kubectl -n istio-system port-forward svc/flagger-grafana 3000:80
The Istio dashboard URL is http://localhost:3000/d/flagger-istio/istio-canary?refresh=10s&orgId=1&var-namespace=prod&var-primary=backend-primary&var-canary=backend
Note that Flagger will restart the analysis if new changes are applied to the deployment during the canary analysis.
A/B testing
In addition to weighted routing, Flagger can be configured to route traffic to the canary based on HTTP match conditions. In an A/B testing scenario, you use HTTP headers or cookies to target a specific segment of your users. This is particularly useful for frontend applications that require session affinity.
You can enable A/B testing by specifying the HTTP match conditions and the number of iterations:
analysis:
  # schedule interval (default 60s)
  interval: 10s
  # max number of failed metric checks before rollback
  threshold: 10
  # total number of iterations
  iterations: 12
  # canary match condition
  match:
    - headers:
        user-agent:
          regex: ".*Firefox.*"
    - headers:
        cookie:
          regex: "^(.*?;)?(type=insider)(;.*)?$"
The above configuration runs the analysis for two minutes (12 iterations at a 10-second interval), targeting Firefox users and users that have the insider cookie. The frontend configuration can be found in apps/frontend/canary.yaml.
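While the analysis runs, you can reach the canary yourself by matching one of the conditions above, for example by sending the insider cookie (the same technique used in the rollback test later in this guide):
# served by the canary (matches the cookie condition)
curl -b 'type=insider' http://<INGRESS-IP>/
# served by the primary (matches neither condition)
curl http://<INGRESS-IP>/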
Trigger a deployment by updating the frontend container image:
yq e '.images[0].newTag="5.0.1"' -i ./apps/frontend/kustomization.yaml
git add -A && \
git commit -m "Frontend 5.0.1." " && \
git push origin main
flux reconcile source git flux-system
Flagger detects that the deployment revision has changed and starts the A/B test:
$ kubectl -n istio-system logs deploy/flagger -f | jq .msg
New revision detected! Scaling up frontend.prod
Waiting for frontend.prod rollout to finish: 0 of 1 updated replicas are available
Pre-rollout check conformance-test passed
Advance frontend.prod canary iteration 1/10
...
Advance frontend.prod canary iteration 10/10
Copying frontend.prod template spec to frontend-primary.prod
Waiting for frontend-primary.prod rollout to finish: 1 of 2 updated replicas are available
Promotion completed! Scaling down frontend.prod
You can monitor all canaries with:
$ watch kubectl get canaries --all-namespaces
NAMESPACE NAME STATUS WEIGHT
prod frontend Progressing 100
prod backend Succeeded 0
Rollback based on Istio metrics
Flagger uses the metrics provided by Istio telemetry to validate the canary workload. The frontend app analysis defines two metric checks:
metrics:
  - name: error-rate
    templateRef:
      name: error-rate
      namespace: istio-system
    thresholdRange:
      max: 1
    interval: 30s
  - name: latency
    templateRef:
      name: latency
      namespace: istio-system
    thresholdRange:
      max: 500
    interval: 30s
The Prometheus queries used for the error rate and latency checks are located in flagger-metrics.yaml.
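That file in the repository is authoritative; as a hedged sketch, an error-rate MetricTemplate backed by the in-cluster Prometheus typically has this shape (the address and the query are simplified illustrations, not copied from the repo):
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: error-rate
  namespace: istio-system
spec:
  provider:
    type: prometheus
    address: http://prometheus.istio-system:9090
  # percentage of requests that returned a 5xx response (simplified)
  query: |
    100 - sum(
      rate(istio_requests_total{
        reporter="destination",
        destination_workload_namespace="{{ namespace }}",
        destination_workload="{{ target }}",
        response_code!~"5.*"
      }[{{ interval }}])
    )
    /
    sum(
      rate(istio_requests_total{
        reporter="destination",
        destination_workload_namespace="{{ namespace }}",
        destination_workload="{{ target }}"
      }[{{ interval }}])
    )
    * 100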
During Canary analysis, you can generate HTTP 500 errors and high latency to test Flagger’s rollback.
Generate HTTP 500 errors:
watch curl -b 'type=insider' http://<INGRESS-IP>/status/500
Generate high latency:
watch curl -b 'type=insider' http://<INGRESS-IP>/delay/1
When the number of failed checks reaches the canary analysis threshold, traffic is routed back to the primary, the canary is scaled to zero, and the rollout is marked as failed.
$ kubectl -n istio-system logs deploy/flagger -f | jq .msg
New revision detected! Scaling up frontend.prod
Pre-rollout check conformance-test passed
Advance frontend.prod canary iteration 1/10
Halt frontend.prod advancement error-rate 31 > 1
Halt frontend.prod advancement latency 2000 > 500
...
Rolling back frontend.prod failed checks threshold reached 10
Canary failed! Scaling down frontend.prod
You can extend your analysis with custom metric checks for Prometheus, Datadog, and Amazon CloudWatch.
See the documentation for configuring Canary analysis alerts for Slack, MS Teams, Discord, or Rocket.