In this guide, you will get hands-on experience with progressive delivery on Kubernetes, using Istio and GitOps.
Introduction
This walkthrough is built around the gitops-istio example repository.
What is GitOps?
GitOps is a way of doing continuous delivery that uses Git as the single source of truth for declarative infrastructure and workloads. For Kubernetes, this means using git push instead of kubectl apply/delete or helm install/upgrade.
In this workshop, you will use GitHub to host the configuration repository and Flux as the GitOps delivery solution.
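As a concrete illustration of the workflow, a change reaches the cluster through a commit instead of a direct kubectl or helm call; the manifest path below is only a placeholder:
# edit a manifest in the config repository, then publish the change
git add apps/backend/deployment.yaml
git commit -m "scale backend to 3 replicas"
git push origin main
# Flux notices the new commit and applies it to the cluster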
What is progressive delivery?
Progressive delivery is an umbrella term for advanced deployment patterns such as canary releases, feature flags, and A/B testing. Progressive delivery reduces the risk of introducing a new software version in production by giving application developers and SRE teams fine-grained control over the blast radius.
In this workshop, you will use Flagger and Prometheus to automate canary releases and A/B testing for your applications.
Prerequisites
You will need a Kubernetes cluster v1.16 or newer with LoadBalancer support. For testing purposes, you can use Minikube with 2 CPUs and 4GB of memory.
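If you go the Minikube route, something along these lines should satisfy the sizing above; run minikube tunnel in a second terminal so LoadBalancer services get an address:
minikube start --cpus 2 --memory 4096
# in a separate terminal, provide LoadBalancer support
minikube tunnel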
Install the flux CLI with Homebrew:
brew install fluxcd/tap/flux
Binaries for macOS AMD64/ARM64, Linux AMD64/ARM, and Windows are available for download on the flux2 release page.
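As an alternative to downloading binaries manually, the Flux project also publishes an install script; review any script before piping it to a shell:
curl -s https://fluxcd.io/install.sh | sudo bash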
Verify that your cluster meets the prerequisites:
flux check --pre
Install jq and yq with Homebrew:
brew install jq yq
Fork the repository and clone it:
git clone https://github.com/<YOUR-USERNAME>/gitops-istio
cd gitops-istio
Cluster bootstrap
Using the flux bootstrap command, you can install Flux on a Kubernetes cluster and configure it to manage itself from a Git repository. If the Flux components are already present on the cluster, the bootstrap command will perform an upgrade if needed.
Bootstrap Flux by specifying the URL of your repository fork:
flux bootstrap git \
--author-email=<YOUR-EMAIL> \
--url=ssh://[email protected]/<YOUR-USERNAME>/gitops-istio \
--branch=main \
--path=clusters/my-cluster
The command above requires ssh-agent; if you are using Windows, see the Flux bootstrap GitHub documentation.
At bootstrap, Flux generates an SSH key and prints the public key. To sync your cluster state with Git, you need to copy the public key and create a deploy key with write access on your GitHub repository. On GitHub, go to Settings > Deploy keys, click Add deploy key, check Allow write access, paste the Flux public key, and click Add key.
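If you missed the key in the bootstrap output, it should also be recoverable from the secret Flux created, assuming the default flux-system namespace and secret name:
# print the SSH public key stored by flux bootstrap (default naming assumed)
kubectl -n flux-system get secret flux-system -o jsonpath='{.data.identity\.pub}' | base64 -d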
When Flux accesses your repository, it will:
- install the Istio operator
- wait for the Istio control plane to be ready
- install Flagger, Prometheus, and Grafana
- create the Istio public gateway
- create the prod namespace
- create the load tester deployment
- create the frontend deployment and canary
- create the backend deployment and canary
When bootstrapping a cluster with Istio, it is important to control the apply order. For the application pods to get the Istio sidecar injected, the Istio control plane must be up and running before the applications are deployed.
With Flux v2 you can specify the order of execution by defining dependencies between objects. For example, in clusters/my-cluster/apps.yaml we tell Flux that the reconciliation of apps depends on istio-system:
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 30m0s
  dependsOn:
    - name: istio-system
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps
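For context, the istio-system Kustomization that apps depends on would be defined alongside it; a minimal sketch, assuming the Istio manifests live under ./istio/system and the operator runs in the istio-operator namespace, might look like this (the health check target is illustrative, not copied from the repository):
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: istio-system
  namespace: flux-system
spec:
  interval: 30m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./istio/system
  # apps will not reconcile until this health check passes (names are illustrative)
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: istio-operator
      namespace: istio-operator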
First watch Flux install Istio, then the demo apps:
watch flux get kustomizations
You can trace the Flux reconciliation logs with the following command:
flux logs --all-namespaces --follow --tail=10
Istio customization and upgrades
You can customize the Istio installation with the IstioOperator resource in istio/system/profile.yaml:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-default
  namespace: istio-system
spec:
  profile: demo
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: 10m
            memory: 100Mi
After modifying the Istio settings, you can push the changes to Git and Flux will apply them on the cluster. The Istio operator will reconfigure the Istio control plane according to your changes.
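For example, a tweak to the pilot resource requests goes through the same Git workflow as any other change in this guide:
# edit the IstioOperator profile, then commit and push
vim istio/system/profile.yaml
git add istio/system/profile.yaml && \
git commit -m "Bump pilot resource requests" && \
git push origin main
# optionally trigger the sync immediately
flux reconcile source git flux-system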
When a new Istio version is available, the update-istio GitHub Actions workflow opens a pull request containing the manifest updates needed to upgrade the Istio operator. The new Istio version is tested on Kubernetes Kind by the e2e workflow, and when the PR is merged into the main branch, Flux upgrades Istio on the cluster.
Application bootstrap
When Flux syncs the Git repository with your cluster, it creates the frontend/backend deployments, HPAs, and canary objects. Flagger uses the canary definitions to create a series of objects: Kubernetes deployments, ClusterIP services, and Istio destination rules and virtual services. These objects expose the applications on the mesh and drive the canary analysis and promotion.
# applied by Flux
deployment.apps/frontend
horizontalpodautoscaler.autoscaling/frontend
canary.flagger.app/frontend
# generated by Flagger
deployment.apps/frontend-primary
horizontalpodautoscaler.autoscaling/frontend-primary
service/frontend
service/frontend-canary
service/frontend-primary
destinationrule.networking.istio.io/frontend-canary
destinationrule.networking.istio.io/frontend-primary
virtualservice.networking.istio.io/frontend
Check that Flagger has successfully initialized the canaries:
kubectl -n prod get canaries
NAME STATUS WEIGHT
backend Initialized 0
frontend Initialized 0
When the frontend-primary deployment comes online, Flagger routes all traffic to the primary pods and scales the frontend deployment to zero.
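You can confirm this by checking the replica counts of the two deployments; frontend should sit at zero while frontend-primary serves the traffic:
kubectl -n prod get deploy frontend frontend-primary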
Use the following command to find the address of the ingress gateway:
kubectl -n istio-system get svc istio-ingressgateway -ojson | jq .status.loadBalancer.ingress
Open your browser and navigate to the ingress address; you will see the frontend UI.
Canary releases
Flagger implements a control loop that gradually shifts traffic to the canary while measuring key performance indicators such as HTTP request success rate, request average duration, and pod health. Based on the analysis of the KPIs, the canary is promoted or aborted, and the analysis result is published to Slack.
Canary analysis is triggered by changes to any of the following objects:
- Deployment PodSpec (container image, command, ports, environment variables, etc.)
- ConfigMaps and Secrets mounted as volumes or mapped to environment variables
For workloads that are not receiving constant traffic, Flagger can be configured with a webhook that, when called, starts a load test for the target workload. The canary configuration can be found in apps/backend/canary.yaml.
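The exact definition lives in that file; as a rough sketch of what a Flagger canary with a load-test webhook looks like (names, the port, and the loadtester URL below are illustrative assumptions, not copied from the repository):
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: backend
  namespace: prod
spec:
  # the deployment this canary manages
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  service:
    port: 9898
  analysis:
    interval: 30s
    threshold: 10
    maxWeight: 50
    stepWeight: 5
    webhooks:
      # generate traffic during the analysis so Istio telemetry has data
      - name: load-test
        url: http://flagger-loadtester.prod/
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://backend-canary.prod:9898/"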
Pull changes from GitHub:
git pull origin main
To trigger a canary deployment for the backend app, bump the container image tag:
yq e '.images[0].newTag="5.0.1"' -i ./apps/backend/kustomization.yaml
Commit and push changes:
git add -A && \
git commit -m "Backend 5.0.1." " && \
git push origin main
Tell Flux to pull the change or wait a minute for Flux to detect the change:
flux reconcile source git flux-system
Watch Flux reconcile your cluster with the latest commit:
watch flux get kustomizations
A few seconds later, Flagger detects that the deployment revision has changed and starts a new rollout:
$ kubectl -n prod describe canary backend
Events:
New revision detected! Scaling up backend.prod
Starting canary analysis for backend.prod
Pre-rollout check conformance-test passed
Advance backend.prod canary weight 5
...
Advance backend.prod canary weight 50
Copying backend.prod template spec to backend-primary.prod
Promotion completed! Scaling down backend.prod
During the analysis, you can monitor the canary's progress in Grafana. Access the dashboard through port forwarding:
kubectl -n istio-system port-forward svc/flagger-grafana 3000:80
The Istio dashboard URL is http://localhost:3000/d/flagger-istio/istio-canary?refresh=10s&orgId=1&var-namespace=prod&var-primary=backend-primary&var-canary=backend
Note that Flagger will restart the analysis if new changes are applied to the deployment during the canary analysis.
A/B testing
In addition to weighted routing, Flagger can be configured to route traffic to the canary based on HTTP match conditions. In an A/B testing scenario, you use HTTP headers or cookies to target a specific segment of your users. This is particularly useful for frontend applications that require session affinity.
You can enable A/B testing by specifying the HTTP match conditions and the number of iterations:
analysis:
  # schedule interval (default 60s)
  interval: 10s
  # max number of failed metric checks before rollback
  threshold: 10
  # total number of iterations
  iterations: 12
  # canary match condition
  match:
    - headers:
        user-agent:
          regex: ".*Firefox.*"
    - headers:
        cookie:
          regex: "^(.*?;)?(type=insider)(;.*)?$"
The above configuration runs the analysis for two minutes (12 iterations at a 10-second interval), targeting Firefox users and users that have the insider cookie. The frontend configuration can be found in apps/frontend/canary.yaml.
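While the analysis runs, you can reach the canary yourself by matching one of the conditions above, for example by sending the insider cookie (the same technique used in the rollback test later in this guide):
# served by the canary (matches the cookie condition)
curl -b 'type=insider' http://<INGRESS-IP>/
# served by the primary (matches neither condition)
curl http://<INGRESS-IP>/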
Trigger a deployment by updating the frontend container image:
yq e '.images[0].newTag="5.0.1"' -i ./apps/frontend/kustomization.yaml
git add -A && \
git commit -m "Frontend 5.0.1." " && \
git push origin main
flux reconcile source git flux-system
Flagger detects that the deployment revision has changed and starts the A/B test:
$ kubectl -n istio-system logs deploy/flagger -f | jq .msg
New revision detected! Scaling up frontend.prod
Waiting for frontend.prod rollout to finish: 0 of 1 updated replicas are available
Pre-rollout check conformance-test passed
Advance frontend.prod canary iteration 1/10
...
Advance frontend.prod canary iteration 10/10
Copying frontend.prod template spec to frontend-primary.prod
Waiting for frontend-primary.prod rollout to finish: 1 of 2 updated replicas are available
Promotion completed! Scaling down frontend.prod
You can monitor all canaries with:
$ watch kubectl get canaries --all-namespaces
NAMESPACE NAME STATUS WEIGHT
prod frontend Progressing 100
prod backend Succeeded 0
Rollback based on Istio metrics
Flagger uses the metrics provided by Istio telemetry to validate the canary workload. The frontend app analysis defines two metric checks:
metrics:
  - name: error-rate
    templateRef:
      name: error-rate
      namespace: istio-system
    thresholdRange:
      max: 1
    interval: 30s
  - name: latency
    templateRef:
      name: latency
      namespace: istio-system
    thresholdRange:
      max: 500
    interval: 30s
The Prometheus queries used for the error rate and latency checks are located in flagger-metrics.yaml.
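That file in the repository is authoritative; as a hedged sketch, an error-rate MetricTemplate backed by the in-cluster Prometheus typically has this shape (the address and the query are simplified illustrations, not copied from the repo):
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: error-rate
  namespace: istio-system
spec:
  provider:
    type: prometheus
    address: http://prometheus.istio-system:9090
  # percentage of requests that returned a 5xx response (simplified)
  query: |
    100 - sum(
      rate(istio_requests_total{
        reporter="destination",
        destination_workload_namespace="{{ namespace }}",
        destination_workload="{{ target }}",
        response_code!~"5.*"
      }[{{ interval }}])
    )
    /
    sum(
      rate(istio_requests_total{
        reporter="destination",
        destination_workload_namespace="{{ namespace }}",
        destination_workload="{{ target }}"
      }[{{ interval }}])
    )
    * 100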
During Canary analysis, you can generate HTTP 500 errors and high latency to test Flagger’s rollback.
Generate HTTP 500 errors:
watch curl -b 'type=insider' http://<INGRESS-IP>/status/500
Generate high latency:
watch curl -b 'type=insider' http://<INGRESS-IP>/delay/1
When the number of failed checks reaches the canary analysis threshold, traffic is routed back to the primary, the canary is scaled to zero, and the rollout is marked as failed.
$ kubectl -n istio-system logs deploy/flagger -f | jq .msg
New revision detected! Scaling up frontend.prod
Pre-rollout check conformance-test passed
Advance frontend.prod canary iteration 1/10
Halt frontend.prod advancement error-rate 31 > 1
Halt frontend.prod advancement latency 2000 > 500
...
Rolling back frontend.prod failed checks threshold reached 10
Canary failed! Scaling down frontend.prod
You can extend your analysis with custom metric checks for Prometheus, Datadog, and Amazon CloudWatch.
See the documentation for configuring Canary analysis alerts for Slack, MS Teams, Discord, or Rocket.