Preface

The Continuous Deployment (CD) pipeline has become an essential tool for building and testing new software. It also covers setting up the test environment, executing and validating the tests, and tearing the environment down again on an ongoing, repeatable basis. We typically add automated testing tools to run functional, load, integration, and other tests, and to verify product quality both before and after a release is pushed into production.

With Chaos Engineering, we can add fault injection tests to the existing automated test suite. The CI/CD pipeline gains a new step that runs chaos experiments, so that every code change is checked for reliability before it reaches users. In the continuous deployment pipeline we therefore use automated fault injection to test reliability, detect problems early, and reduce production incidents.

In this article, we will add a new stage to a Jenkins pipeline that uses Gremlin to inject a controlled amount of failure into the test system. You will learn:

  • How to deploy Jenkins instances using Docker.
  • How to create an API key in Gremlin.
  • How to launch an attack using the Gremlin API.

1. Prerequisites

Before starting this tutorial, you need the following:

  • Docker, for easy deployment of Jenkins from a container image.
  • Gremlin, deployed on the host where the chaos experiments will run. This can be the same host as Jenkins, but ideally it is the host on which the application under test is deployed.
  • A Gremlin account (register for one if you do not have it yet).

1.1 Start and run Jenkins

In this step, we create a Jenkins instance using the official Docker image. If you already have a Jenkins environment, skip ahead: 1.2 stores the Gremlin credentials and 1.3 creates the chaos deployment pipeline.

On the command line, type the following to initialize the Jenkins instance using Docker.

docker run --publish 8080:8080 --publish 50000:50000 --name jenkins jenkins/jenkins:lts-alpine

Open http://localhost:8080 in your browser and make sure Jenkins is running.

If this is the first time you have set up Jenkins, enter the initial administrator password (Jenkins prints it in the container log) and choose which plugins to install. The default selection will work. Then create an administrator user and log in with that account.

1.2 Add Gremlin API to Jenkins

In this step, enter the Gremlin API key and team ID in the Jenkins instance.

The Gremlin API key is associated with the Gremlin user account and allows Jenkins to authenticate to Gremlin without a user name or password.

The team ID is associated with the Gremlin team and allows Jenkins to run attacks, locate hosts, and perform other actions within the Gremlin team.

To get your team ID, log in to the Gremlin web application. Click the user icon in the upper right corner, and then click Team Settings. The team ID is shown on the Configuration tab.

Copy the team ID; you will need it in the next step.

Next, create an API key. Click the user icon in the upper right corner, and then click Account Settings. Open the API Keys tab and click New API Key. Enter a name for the key (for example, “Jenkins”) and an optional description, then click Save. Copy the key from the window that appears.

Now that we have the team ID and the API key, enter them into Jenkins. Open the new-credentials page directly:

http://localhost:8080/credentials/store/system/domain/_/newCredentials

Alternatively, from the Jenkins dashboard go to Manage Jenkins > Manage Credentials > Global and click Add Credentials. Set Kind to Secret text and Scope to Global. Paste the Gremlin API key into the Secret field, and enter gremlin-api-key as the ID. Click OK to save.

Repeat this step for the team ID: select Secret text, paste the team ID into the Secret field, and enter gremlin-team-id as the ID. Click OK to save. The pipeline in 1.3 looks up both credentials by these exact IDs.

1.3 Create a Chaos Deployment Pipeline

In this step, we will create a Jenkins pipeline. The pipeline runs a CPU attack, which consumes CPU capacity on the target host for a set period of time. The target is one of the hosts on which we installed Gremlin.

In a typical CI/CD pipeline, the pipeline code would include the following steps:

  • Provision a test environment
  • Deploy the application
  • Deploy the Gremlin agent to the environment
  • Run the attack

Here, we’ll skip the first three steps and just show you how to run an attack using the Gremlin API.

When automating chaos experiments, we recommend starting in the development/test environment and extending them gradually to the production environment. Running automated experiments against production deployments helps capture production-specific reliability issues.

On the Jenkins main screen, click New Item. Enter a name, such as “Chaos Pipeline”, select Pipeline, and then click OK. Scroll down to the Pipeline section and enter the following code:

pipeline {
    // No global agent: only the chaos stage requests an executor.
    agent none
    environment {
        ATTACK_ID = ''
        // Secret-text credentials stored in Jenkins in 1.2
        GREMLIN_API_KEY = credentials('gremlin-api-key')
        GREMLIN_TEAM_ID = credentials('gremlin-team-id')
    }
    parameters {
        string(name: 'TARGET_IDENTIFIER', defaultValue: 'gremlin-demo-lab-host', description: 'Host to target')
        string(name: 'CPU_LENGTH', defaultValue: '30', description: 'Duration of CPU attack')
        string(name: 'CPU_CORE', defaultValue: '1', description: 'Number of cores to impact')
        string(name: 'CPU_CAPACITY', defaultValue: '100', description: 'The percentage of total CPU capacity to consume')
    }
    stages {
        stage('Initialize test environment') {
            steps {
                echo "[Add commands to create a test environment.]"
            }
        }
        stage('Install application to test environment') {
            steps {
                echo "[Add commands to deploy your application to your test environment.]"
            }
        }
        stage('Run chaos experiment') {
            agent any
            steps {
                script {
                    // Start a CPU attack through the Gremlin API; the response body is captured as the attack ID.
                    ATTACK_ID = sh (
                        script: "curl -s -H 'Content-Type: application/json; charset=utf-8' " +
                                "-H 'Authorization: Key ${GREMLIN_API_KEY}' " +
                                "https://api.gremlin.com/v1/attacks/new?teamId=${GREMLIN_TEAM_ID} " +
                                "--data '{ \"command\": { \"type\": \"cpu\", \"args\": [\"-c\", \"$CPU_CORE\", \"-l\", \"$CPU_LENGTH\", \"-p\", \"$CPU_CAPACITY\"] }," +
                                "\"target\": { \"type\": \"Exact\", \"hosts\" : { \"ids\": [\"$TARGET_IDENTIFIER\"] } } }' " +
                                "--compressed",
                        returnStdout: true
                    ).trim()
                    echo "View your experiment at https://app.gremlin.com/attacks/${ATTACK_ID}"
                }
            }
        }
    }
}

Let’s take a closer look at this script.

First, the environment section reads the Gremlin API key and team ID from the Jenkins credentials store. The parameters section defines the parameters of the attack. TARGET_IDENTIFIER is the host name of the attack target; here we use gremlin-demo-lab-host. You can find the list of hosts in the Gremlin web application under Clients > Hosts.

Next comes the stages section. The first two stages are placeholders where you add the commands that provision the test environment and deploy the application. The third stage, “Run chaos experiment”, calls the Gremlin API to start the attack. Notice the script field, which contains the full call to the Gremlin API.

You can replace this field with any Gremlin API call of your choice, whether it launches a different type of attack, a Scenario, or targets a Kubernetes resource or a Service.

For now, replace the default value of TARGET_IDENTIFIER with the host name of your attack target. You can optionally adjust the CPU attack parameters CPU_LENGTH, CPU_CORE, and CPU_CAPACITY:

  • CPU_LENGTH is how long the attack will run (in seconds)
  • CPU_CORE is the number of CPU cores affected
  • CPU_CAPACITY is the percentage of CPU capacity consumed
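
To see how these parameters end up in the request, the following minimal shell sketch builds the same JSON body that the pipeline passes to curl with --data. The function name build_cpu_attack_payload is hypothetical, and the sketch assumes the values contain no characters that need JSON escaping.

```shell
# Hypothetical helper: prints the JSON body for a CPU attack.
# Arguments: CPU_CORE, CPU_LENGTH (seconds), CPU_CAPACITY (percent), TARGET_IDENTIFIER.
# Assumption: values are plain strings; no JSON escaping is performed.
build_cpu_attack_payload() {
  printf '{ "command": { "type": "cpu", "args": ["-c", "%s", "-l", "%s", "-p", "%s"] }, "target": { "type": "Exact", "hosts": { "ids": ["%s"] } } }\n' \
    "$1" "$2" "$3" "$4"
}

# Same values as the pipeline defaults.
build_cpu_attack_payload 1 30 100 gremlin-demo-lab-host
```

Printing the body like this before running the build is a quick way to check that a changed parameter lands in the expected position of the args list.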

Next, run the demo script by selecting “Build with Parameters” and then “Build”. Jenkins will quickly complete the first two stages, then call the Gremlin API and launch an attack.
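
A parameterized build can also be started over HTTP, because Jenkins exposes a buildWithParameters endpoint for jobs that define parameters. The sketch below only assembles the URL. The helper name jenkins_build_url and the JENKINS_USER and JENKINS_TOKEN variables in the comment are hypothetical, and only spaces in the job name are percent-encoded.

```shell
# Hypothetical helper: prints the buildWithParameters URL for a job.
# Arguments: Jenkins base URL, job name, then zero or more NAME=value pairs.
# Assumption: only spaces in the job name need encoding; values are URL-safe.
jenkins_build_url() {
  base=$1
  job=$2
  shift 2
  encoded_job=$(printf '%s' "$job" | sed 's/ /%20/g')
  query=""
  for pair in "$@"; do
    # Join the pairs with "&", without a leading separator.
    query="${query:+$query&}$pair"
  done
  printf '%s/job/%s/buildWithParameters%s\n' "$base" "$encoded_job" "${query:+?$query}"
}

jenkins_build_url http://localhost:8080 "Chaos Pipeline" CPU_LENGTH=60 CPU_CORE=2
# prints: http://localhost:8080/job/Chaos%20Pipeline/buildWithParameters?CPU_LENGTH=60&CPU_CORE=2

# To trigger the build, the URL could be passed to curl, for example:
#   curl -X POST --user "$JENKINS_USER:$JENKINS_TOKEN" "$(jenkins_build_url ...)"
```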

The Stage view looks like this:

Note: if the build fails with groovy.lang.MissingPropertyException: No such property: CPU_CORE for class: groovy.lang.Binding, try running the build again.

Click on the build number to open the console output and you’ll see the following:

Started by user Admin
Running in Durability level: MAX_SURVIVABILITY
[Pipeline] Start of Pipeline
[Pipeline] withCredentials
Masking supported pattern matches of $GREMLIN_API_KEY
[Pipeline] {
[Pipeline] withEnv
[Pipeline] {
[Pipeline] stage
[Pipeline] { (Initialize test environment)
[Pipeline] echo
[Add commands to create a test environment.]
[Pipeline] }
[Pipeline] // stage
[Pipeline] stage
[Pipeline] { (Install application to test environment)
[Pipeline] echo
[Add commands to deploy your application to your test environment.]
[Pipeline] }
[Pipeline] // stage
[Pipeline] stage
[Pipeline] { (Run chaos experiment)
[Pipeline] node
Running on Jenkins in /var/jenkins_home/workspace/Chaos Pipeline
[Pipeline] {
[Pipeline] script
[Pipeline] {
[Pipeline] sh
Warning: A secret was passed to "sh" using Groovy String interpolation, which is insecure.
Affected argument(s) used the following variable(s): [GREMLIN_API_KEY]
See https://jenkins.io/redirect/groovy-string-interpolation for details.
+ curl -s -H 'Content-Type: application/json' -H 'Authorization: Key ****' https://api.gremlin.com/v1/attacks/new --data '{ "command": { "type": "cpu", "args": ["-c", "1", "-l", "30", "-p", "100"] },"target": { "type": "Exact", "hosts" : { "ids": ["gremlin-demo-lab-host"] } } }' --compressed
[Pipeline] echo
View your experiment at https://app.gremlin.com/attacks/User requires privilege for target team: TEAM_DEFAULT
[Pipeline] }
[Pipeline] // script
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withCredentials
[Pipeline] End of Pipeline
Finished: SUCCESS

Congratulations! Chaos experiments are now integrated into the CI/CD pipeline.
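
One detail in the sample console output deserves attention: the text printed after /attacks/ is the API error message "User requires privilege for target team: TEAM_DEFAULT" rather than an attack ID, yet the build still finishes with SUCCESS. curl exits with status 0 on HTTP error responses unless a flag such as --fail is given, so the sh step does not fail on its own. A minimal sketch of a check that could run on the captured response is shown below. The helper name is_attack_id is hypothetical, and the sketch assumes that a successful call returns a UUID-shaped attack ID, which this article does not confirm.

```shell
# Hypothetical helper: succeeds only if the response looks like a UUID.
# Assumption: a successful Gremlin call returns a UUID-shaped attack ID.
is_attack_id() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# The response captured in the sample console output.
response='User requires privilege for target team: TEAM_DEFAULT'

if is_attack_id "$response"; then
  echo "Attack started: $response"
else
  # In a pipeline, a non-zero exit here would mark the stage as failed.
  echo "Gremlin API call failed: $response" >&2
fi
```

With a check like this in the chaos stage, a rejected request would surface as a failed build instead of a misleading experiment link.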

Conclusion

This article is just the first step in enabling chaos engineering in a CI/CD pipeline.

From here, we can extend the chaos engineering practice further: run Scenarios instead of individual attacks, verify that experiments complete, use status checks to stop an experiment automatically when the system becomes unstable, or run experiments at the same time as integration or load tests. If you have automated load or functional tests, run them together with the chaos experiments to confirm that the system performs reliably under pressure.

The same principles apply to other continuous deployment tools, such as Spinnaker, GitLab, or CircleCI.

Source: Chaos Engineering Practice

Author: Ruan An

Disclaimer: This article was republished on the IDCF public account (devopshub) with the author's authorization. It is shared as quality content with the technical community on the Sifou platform. If the original author has any concerns, please contact the editor to have it removed.
