Today we will try out Chaos Toolkit, an open source tool for chaos engineering.
Its goal is to provide a free, open, community-driven toolset and API.
Official source link: github.com/chaostoolki…
To understand this tool, you need to know the main points of the principles of chaos engineering, as follows:
Remember the first point here, establishing the steady-state hypothesis.
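To make the steady-state idea concrete, here is a minimal Python sketch of the order in which an experiment checks its hypothesis: once before injecting a fault and once after. It is a simplified model written for this article, not Chaos Toolkit's own code; the function name `run_experiment`, the status strings, and the toy "pods" state are all assumptions chosen for illustration.

```python
from typing import Callable, List

Probe = Callable[[], bool]    # returns True when the system looks normal
Action = Callable[[], None]   # injects a fault into the system


def run_experiment(probes: List[Probe], actions: List[Action]) -> str:
    """Simplified model of an experiment run (not Chaos Toolkit's own code)."""
    # 1. Check the steady-state hypothesis before touching the system.
    if not all(probe() for probe in probes):
        return "failed"  # the system was not in its normal state to begin with
    # 2. Apply the experimental method (the fault injection).
    for action in actions:
        action()
    # 3. Check the hypothesis again to see whether the system deviated.
    if not all(probe() for probe in probes):
        return "deviated"
    return "completed"


if __name__ == "__main__":
    # Hypothetical toy system: three replicas, one is killed and brought back.
    state = {"healthy_pods": 3}

    def pods_healthy() -> bool:
        return state["healthy_pods"] == 3

    def kill_and_recover() -> None:
        state["healthy_pods"] -= 1  # inject the failure
        state["healthy_pods"] += 1  # the platform restores the pod

    print(run_experiment([pods_healthy], [kill_and_recover]))
```

If the probe fails before any action runs, the sketch stops early, because an experiment on a system that is already unhealthy tells you nothing about the fault you wanted to inject.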
Before running the tool, let’s take a look at its architecture.
ChaosToolkit operates your system under test via Drivers.
Its features include the following:
Let’s set up the tool and try it out.
Environment: CentOS 7.8, k8s 1.19.5, and an example application.
Install Python 3:
sudo yum install python3 python3-venv
Install pipenv:
gaolou@GaoMacPro ~ % pip3 install pipenv
Install Chaos Toolkit:
pip3 install -U chaostoolkit
pip3 install -U chaostoolkit-kubernetes
pip3 install -U chaostoolkit-reporting
If you need to operate on other platforms, you can also install the corresponding extensions.
Create and activate a virtual environment:
python3 -m venv .bundler
source .bundler/bin/activate
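A quick way to confirm that the packages landed in the active environment is to ask Python whether their modules can be found. The sketch below is only an illustration: the helper `missing_packages` is hypothetical, `chaosk8s` is the module name that shows up in the discover log, and the module names `chaostoolkit` and `chaosreport` for the other two packages are my assumption.

```python
from importlib.util import find_spec
from typing import Dict, List

# pip package name -> top-level module name (module names are assumptions,
# except chaosk8s, which appears in the discover log).
MODULES: Dict[str, str] = {
    "chaostoolkit": "chaostoolkit",
    "chaostoolkit-kubernetes": "chaosk8s",
    "chaostoolkit-reporting": "chaosreport",
}


def missing_packages(modules: Dict[str, str]) -> List[str]:
    """Return the pip package names whose module cannot be located."""
    return [pkg for pkg, mod in modules.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = missing_packages(MODULES)
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All Chaos Toolkit packages were found.")
```

Run it with the same interpreter that the virtual environment activates; otherwise it reports on a different set of site-packages.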
The installation above was performed on the k8s master machine. If you are not installing on a k8s machine, you can configure the corresponding k8s context instead. For the specific steps, see: chaostoolkit.org/drivers/kub…
chaos discover
Testing starts with the discover command. Chaos Toolkit generates a discovery.json file based on the contents of ~/.kube/config; the file contains the full set of actions and probes that can be run against k8s. The result is as follows:
(.bundler) [root@s5 chaostoolkit_scenarios]# chaos discover chaostoolkit-kubernetes
[2021-06-23 12:18:07 INFO] Attempting to download and install package ‘chaostoolkit-kubernetes’
[2021-06-23 12:18:08 INFO] Package downloaded and installed in current environment
[2021-06-23 12:18:09 INFO] Discovering capabilities from chaostoolkit-kubernetes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.deployment.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.deployment.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.node.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.node.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.pod.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.pod.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.replicaset.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.service.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.service.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.statefulset.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.statefulset.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.crd.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.crd.probes
[2021-06-23 12:18:09 INFO] Discovery outcome saved in ./discovery.json
(.bundler) [root@s5 chaostoolkit_scenarios]#
chaos init: generating the experiment
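Before running the wizard, it can help to look at what discovery.json actually holds. The following sketch groups the discovered activities by type. It assumes the file has a top-level `activities` list whose entries carry `type` and `name` keys; that layout, the helper `group_activities`, and the small `SAMPLE` fallback are assumptions for illustration, so check them against your own file.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Tiny stand-in used when no discovery.json is present (illustrative only).
SAMPLE = {
    "activities": [
        {"type": "probe", "name": "count_pods", "mod": "chaosk8s.pod.probes"},
        {"type": "action", "name": "delete_pods", "mod": "chaosk8s.pod.actions"},
        {"type": "action", "name": "terminate_pods", "mod": "chaosk8s.pod.actions"},
    ]
}


def group_activities(discovery: dict) -> Dict[str, List[str]]:
    """Group activity names by their declared type (probe or action)."""
    grouped = defaultdict(list)
    for activity in discovery.get("activities", []):
        grouped[activity.get("type", "unknown")].append(activity.get("name", "?"))
    return {kind: sorted(names) for kind, names in grouped.items()}


if __name__ == "__main__":
    path = Path("discovery.json")
    discovery = json.loads(path.read_text()) if path.exists() else SAMPLE
    for kind, names in group_activities(discovery).items():
        print(f"{kind}: {len(names)} found, e.g. {names[0]}")
```

The names printed here are the same ones the wizard offers when it asks you to pick an activity.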
Execute the initialization command to create a chaos experiment as prompted.
(.bundler) [root@s5 chaostoolkit_scenarios]# chaos init
You are about to create an experiment. This wizard will walk you through each step so that you can build the best experiment for your needs.
An experiment is made up of three elements:
- a steady-state hypothesis [OPTIONAL]
- an experimental method
- a set of rollback activities [OPTIONAL]
Only the method is required. Also your experiment will not run unless you define at least one activity (probe or action) within it
Experiment’s title: E2
A steady state hypothesis defines what ‘normality’ looks like in your system
The steady state hypothesis is a collection of conditions that are used, at the beginning of an experiment, to decide if the system is in a recognised ‘normal’ state. The steady state conditions are then used again when your experiment is complete to detect where your system may have deviated in an interesting, weakness-detecting way
Initially you may not know what your steady state hypothesis is and so instead you might create an experiment without one
This is why the stead state hypothesis is optional.
Do you want to define a steady state hypothesis now? [y/N]: y # Create the steady-state hypothesis. Note that this is an important concept in chaos engineering, yet most other chaos tools do not expose it
You may now define probes that will determine the steady-state of your system.
Add an activity
- all_microservices_healthy
- deployment_is_fully_available
- deployment_is_not_fully_available
- microservice_available_and_healthy
- microservice_is_not_available
- read_microservices_logs
- service_endpoint_is_initialized
- count_pods
- pod_is_not_available
- pods_in_conditions
- pods_in_phase
- pods_not_in_phase
- read_pod_logs
- statefulset_fully_available
- statefulset_not_fully_available
- get_cluster_custom_object
- get_custom_object
- list_cluster_custom_objects
- list_custom_objects
Activity (0 to escape): 1 # Select the probe for the steady-state hypothesis; in short, this defines the expected result
!!!!!!!!! DEPRECATED!!!
- kill_microservice
- remove_service_endpoint
Do you want to use this probe? [y/N]: y # Confirm use of the probe selected above
A steady-state probe requires a tolerance value, within which your system is in a reognised normal
state.
What is the tolerance for this probe? : normal
You now need to fill the arguments for this activity. Default values will be shown between brackets. You may simply press return to use it or not set any value.
Argument’s value for ‘ns’ [default]: chaosnamespace
Do you want to select another activity? [y/N]: y
Add an activity
- all_microservices_healthy
- deployment_is_fully_available
- deployment_is_not_fully_available
- kill_microservice
- microservice_available_and_healthy
- microservice_is_not_available
- read_microservices_logs
- service_endpoint_is_initialized
- count_pods
- pod_is_not_available
- pods_in_conditions
- pods_in_phase
- pods_not_in_phase
- read_pod_logs
- statefulset_fully_available
- statefulset_not_fully_available
- get_cluster_custom_object
- get_custom_object
- list_cluster_custom_objects
- list_custom_objects
Activity (0 to escape): 1 # Select a specific activity
!!!!!!!!! DEPRECATED!!!
Do you want to use this probe? [y/N]: y # Confirm use of the activity selected above
You now need to fill the arguments for this activity. Default values will be shown between brackets. You may simply press return to use it or not set any value.
Argument’s value for ‘ns’ [default]:
Do you want to select another activity? [y/N]: N # Whether to add another activity; I won’t add one here
An experiment’s method contains actions and probes. Actions vary real-world events in your system to determine if your steady-state hypothesis is maintained when those events occur.
An experimental method can also contain probes to gather additional information about your system as your method is executed.
Do you want to define an experimental method? [y/N]: y # Choose a specific method for the experiment
Add an activity
- kill_microservice
- remove_service_endpoint
- scale_microservice
- start_microservice
- all_microservices_healthy
- deployment_is_fully_available
- deployment_is_not_fully_available
- microservice_available_and_healthy
- microservice_is_not_available
- read_microservices_logs
- service_endpoint_is_initialized
- create_deployment
- delete_deployment
- scale_deployment
- deployment_available_and_healthy
- deployment_fully_available
- deployment_not_fully_available
- cordon_node
- create_node
- delete_nodes
- drain_nodes
- uncordon_node
- get_nodes
- delete_pods
- exec_in_pods
- terminate_pods
- count_pods
- pod_is_not_available
- pods_in_conditions
- pods_in_phase
- pods_not_in_phase
- read_pod_logs
- delete_replica_set
- create_service_endpoint
- delete_service
- service_is_initialized
- create_statefulset
- remove_statefulset
- scale_statefulset
- statefulset_fully_available
- statefulset_not_fully_available
- create_cluster_custom_object
- create_custom_object
- delete_cluster_custom_object
- delete_custom_object
- patch_cluster_custom_object
- patch_custom_object
- replace_cluster_custom_object
- replace_custom_object
- get_cluster_custom_object
- get_custom_object
- list_cluster_custom_objects
- list_custom_objects
Activity (0 to escape): 24 # Here I select the 24th activity, delete_pods, which deletes pods
!!!!!!!!! DEPRECATED!!!
Do you want to use this action? [y/N]: y # Confirm the selection
You now need to fill the arguments for this activity. Default values will be shown between brackets. You may simply press return to use it or not set any value.
Argument’s value for ‘name’: DeleteRedisPOD
Argument’s value for ‘ns’ [default]:
Argument’s value for ‘label_selector’ [name in ({name})]: app=redis # Enter the label of the target objects so that they can be found
Do you want to select another activity? [y/N]: N # Whether to add another action; I won’t add one here
An experiment may optionally define a set of remedial actions that are used to rollback the system to a given state.
Do you want to add some rollbacks now? [y/N]: N # The experiment deletes the Redis pods, and k8s brings them back up automatically, so no rollback is needed
Experiment created and saved in ‘./experiment.json’ # The generated experiment file
(.bundler) [root@s5 chaostoolkit_scenarios]#
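For readers who prefer to write the experiment by hand instead of answering the wizard, here is a Python sketch that builds a roughly equivalent experiment and prints it as JSON. It is a hand-written approximation, not the exact file the wizard produced: the helper `build_experiment` is hypothetical, the tolerance value `True` is my assumption (the probe reports a boolean), and I pass the namespace to both activities for clarity. The titles E2 and H2, the activity names, and the module paths come from the transcripts in this article.

```python
import json


def build_experiment(namespace: str, label_selector: str) -> dict:
    """Build an experiment: check pod health, delete pods, check again."""
    return {
        "version": "1.0.0",
        "title": "E2",
        "description": "N/A",
        "steady-state-hypothesis": {
            "title": "H2",
            "probes": [
                {
                    "type": "probe",
                    "name": "all_microservices_healthy",
                    "tolerance": True,  # assumption: the probe returns a boolean
                    "provider": {
                        "type": "python",
                        "module": "chaosk8s.probes",
                        "func": "all_microservices_healthy",
                        "arguments": {"ns": namespace},
                    },
                }
            ],
        },
        "method": [
            {
                "type": "action",
                "name": "delete_pods",
                "provider": {
                    "type": "python",
                    "module": "chaosk8s.pod.actions",
                    "func": "delete_pods",
                    "arguments": {"ns": namespace, "label_selector": label_selector},
                },
            }
        ],
        "rollbacks": [],  # none needed: k8s recreates the deleted pods
    }


if __name__ == "__main__":
    experiment = build_experiment("chaosnamespace", "app=redis")
    print(json.dumps(experiment, indent=2))
```

Redirecting the output to a file gives you something you can hand to `chaos run`, after checking it against the experiment format your installed version expects.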
chaos run: running the experiment
(.bundler) [root@s5 chaostoolkit_scenarios]# chaos run experiment.json
[2021-06-28 23:03:23 INFO] Validating the experiment’s syntax
[2021-06-28 23:03:24 INFO] Experiment looks valid
[2021-06-28 23:03:24 INFO] Running experiment: E2
[2021-06-28 23:03:24 INFO] Steady-state strategy: default
[2021-06-28 23:03:24 INFO] Rollbacks strategy: default
[2021-06-28 23:03:24 INFO] Steady state hypothesis: H2
[2021-06-28 23:03:24 INFO] Probe: all_microservices_healthy
[2021-06-28 23:03:24 WARNING] all_microservices_healthy function is DEPRECATED and will be removed in the next releases, please use all_pods_healthy instead
[2021-06-28 23:03:24 INFO] Steady state hypothesis is met!
[2021-06-28 23:03:24 INFO] Playing your experiment’s method now…
[2021-06-28 23:03:24 INFO] Action: delete_pods
[2021-06-28 23:03:24 INFO] Steady state hypothesis: H2
[2021-06-28 23:03:24 INFO] Probe: all_microservices_healthy
[2021-06-28 23:03:24 WARNING] all_microservices_healthy function is DEPRECATED and will be removed in the next releases, please use all_pods_healthy instead
[2021-06-28 23:03:24 INFO] Steady state hypothesis is met!
[2021-06-28 23:03:24 INFO] Let’s rollback…
[2021-06-28 23:03:24 INFO] No declared rollbacks, let’s move on.
[2021-06-28 23:03:24 INFO] Experiment ended with status: completed
(.bundler) [root@s5 chaostoolkit_scenarios]#
Check the results. Before the test:
[root@s5 ~]# kubectl get pods -n chaosnamespace -o wide
NAME                            READY   STATUS    RESTARTS   AGE    IP              NODE   NOMINATED NODE   READINESS GATES
………………………
redis-master-b96c9795b-nqzmr    1/1     Running   0          3d9h   10.100.220.84   s6
redis-slave-6b8d456947-6r42k    1/1     Running   0          3d9h   10.100.220.86   s6
redis-slave-6b8d456947-z55m5    1/1     Running   0          3d9h   10.100.53.206   s7
After the test:
[root@s5 ~]# kubectl get pods -n chaosnamespace -o wide
NAME                            READY   STATUS              RESTARTS   AGE    IP              NODE   NOMINATED NODE   READINESS GATES
………………………
redis-master-b96c9795b-92rc6    0/1     ContainerCreating   0          3s                     s6
redis-master-b96c9795b-nqzmr    0/1     Terminating         0          3d9h   10.100.220.84   s6
redis-slave-6b8d456947-5m2xt    0/1     ContainerCreating   0          2s                     s6
redis-slave-6b8d456947-6r42k    1/1     Terminating         0          3d9h   10.100.220.86   s6
redis-slave-6b8d456947-fj4xc    0/1     ContainerCreating   0          3s                     s7
redis-slave-6b8d456947-z55m5    1/1     Terminating         0          3d9h   10.100.53.206   s7
Once the pods have fully started:
[root@s5 ~]# kubectl get pods -n chaosnamespace -o wide
NAME                            READY   STATUS    RESTARTS   AGE     IP              NODE   NOMINATED NODE   READINESS GATES
………………………
redis-master-b96c9795b-92rc6    1/1     Running   0          5m43s   10.100.220.89   s6
redis-slave-6b8d456947-5m2xt    1/1     Running   0          5m42s   10.100.220.90   s6
redis-slave-6b8d456947-fj4xc    1/1     Running   0          5m43s   10.100.53.211   s7
[root@s5 ~]#
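Reading these tables by eye works for three pods, but the same check can be scripted. The sketch below parses the text printed by `kubectl get pods` and returns the pods that are not yet Running with all containers ready. The helper name `unready_pods` is hypothetical, and the parsing assumes the default column order shown above (NAME, READY, STATUS first).

```python
from typing import List


def unready_pods(kubectl_output: str) -> List[str]:
    """Return names of pods that are not Running with all containers ready."""
    not_ready = []
    for line in kubectl_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a pod row.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, _, total = ready.partition("/")
        if status != "Running" or ready_count != total:
            not_ready.append(name)
    return not_ready


if __name__ == "__main__":
    # Rows taken from the output captured right after the experiment.
    during = """\
NAME                           READY  STATUS             RESTARTS  AGE
redis-master-b96c9795b-92rc6   0/1    ContainerCreating  0         3s
redis-master-b96c9795b-nqzmr   0/1    Terminating        0         3d9h
redis-slave-6b8d456947-z55m5   1/1    Terminating        0         3d9h
"""
    for pod in unready_pods(during):
        print("not ready:", pod)
```

A pod that shows 1/1 but is Terminating is still reported, since it is on its way out and should not count toward the steady state.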
As the results above show, the experiment ran successfully: several Redis pods were killed and then brought back up by k8s.
Today we wrote only this one experiment; you can follow the same steps to generate others.