Author: Ding Yuan, RadonDB Test Principal
Responsible for quality and performance testing and iterative verification of RadonDB cloud databases and containerized databases, with in-depth research on performance and high-availability solutions for cloud and containerized databases.
Following up on “Chaos Engineering Tool ChaosBlade Operator Series Introduction”, this installment uses the ChaosBlade Operator to test application scenarios for Node-class resources, including:
- CPU load scenario
- Network delay scenario
- Network packet loss scenario
- Kill the specified process
- Stop the specified process
| Experimental Environment
Test Object
The test object is a RadonDB MySQL containerized database deployed on the KubeSphere platform.
For details about how to deploy RadonDB MySQL, see Deploying the RadonDB MySQL Cluster in KubeSphere.
Environment Parameters
Cluster name | Host type | CPU | Memory | Total disk | Node count | Replica count | Shard count |
---|---|---|---|---|---|---|---|
KubeSphere | High availability | 8C | 16G | 500GB | 4 | – | – |
RadonDB MySQL | – | 4C | 16G | Pod: 50G, DataDir: 10G | 3 | 2 | 1 |
After the test environment is deployed, you can perform verification in the following five scenarios.
1. CPU load scenario
1.1 Test Objectives
Inject an 80% CPU load on a specified node and verify the result.
1.2 Starting the Test
Set the test parameter values in the YAML file.
apiVersion: chaosblade.io/v1alpha1
kind: ChaosBlade
metadata:
  name: cpu-load
spec:
  experiments:
  - scope: node
    target: cpu
    action: fullload
    desc: "increase node cpu load by names"
    matchers:
    - name: names
      value:
      - "worker-s001"      # name of the target node
    - name: cpu-percent
      value:
      - "80"               # node CPU load percentage
    - name: ip
      value:
      - "192.168.0.20"     # IP address of the target node
Select a node and change the names value in node_cpu_load.yaml.
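The start commands for this scenario are not shown in the article; assuming the manifest above is saved as node_cpu_load.yaml and keeps the metadata name cpu-load, it can presumably be applied and inspected the same way as the later scenarios:
$ kubectl apply -f node_cpu_load.yaml
$ kubectl get blade cpu-load -o json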
1.3 Test and Verification
On the target node, run the top command; you can see that the node's CPU load reaches 80%.
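As a non-interactive alternative (a convenience not shown in the original, assuming a standard Linux node with procps available), top can be run in batch mode and filtered to the CPU summary line:
$ top -bn1 | grep "Cpu(s)"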
2. Network delay scenario
2.1 Test Preparations
Log in to the node and run the ifconfig command to view the NIC information. The default NIC name used in this test is eth0.
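If ifconfig is not available on the node, an equivalent check (not part of the original steps) can be done with the ip tool:
$ ip -br link show     # list all NICs in brief form
$ ip addr show eth0    # confirm the address bound to eth0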
2.2 Test Objectives
Add a 3000 ms access delay to the specified node worker-s001, with the delay fluctuating by up to 1000 ms.
2.3 Starting the Test
Select a node and change the names value in delay_node_network_by_names.yaml to worker-s001.
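The content of delay_node_network_by_names.yaml is not reproduced in the article; a sketch of what it might look like, following the CPU-load manifest above and ChaosBlade's network delay parameters (interface, time, offset), is:
apiVersion: chaosblade.io/v1alpha1
kind: ChaosBlade
metadata:
  name: delay-node-network-by-names
spec:
  experiments:
  - scope: node
    target: network
    action: delay
    desc: "node network delay"
    matchers:
    - name: names
      value:
      - "worker-s001"   # target node name
    - name: interface
      value:
      - "eth0"          # NIC identified in 2.1
    - name: time
      value:
      - "3000"          # delay in milliseconds
    - name: offset
      value:
      - "1000"          # delay fluctuation in milliseconds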
Start testing.
kubectl apply -f delay_node_network_by_names.yaml
View experimental status.
kubectl get blade delay-node-network-by-names -o json
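If only the experiment state is needed, a jsonpath query keeps the output short (assuming the ChaosBlade CRD reports experiment results under status.expStatuses, as in the upstream chaosblade-operator):
$ kubectl get blade delay-node-network-by-names -o jsonpath='{.status.expStatuses[*].state}'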
2.4 Test and Verification
Access the Guestbook from the node.
$ time echo "" | telnet 192.168.0.18
echo "" 0.00s user 0.00s system 35% cpu 0.003 total
telnet 192.168.1.129 32436 0.01s user 0.00s system 0% cpu 3.248 total
The total time of about 3.2 s reflects the injected 3000 ms delay.
Stop the test. You can delete the experiment configuration or delete the blade resource directly.
kubectl delete -f delay_node_network_by_names.yaml
kubectl delete blade delay-node-network-by-names
3. Network packet loss scenario
3.1 Test Objective
Inject 100% packet loss on the specified node.
3.2 Starting the Test
Select a node and change the names value in loss_node_network_by_names.yaml.
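As with the delay scenario, the manifest itself is not reproduced here; a sketch based on ChaosBlade's network loss parameters (interface, percent) might look like:
apiVersion: chaosblade.io/v1alpha1
kind: ChaosBlade
metadata:
  name: loss-node-network-by-names
spec:
  experiments:
  - scope: node
    target: network
    action: loss
    desc: "node network loss"
    matchers:
    - name: names
      value:
      - "worker-s001"   # target node name
    - name: interface
      value:
      - "eth0"          # NIC to drop packets on
    - name: percent
      value:
      - "100"           # packet loss percentage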
Run the following command to start the test.
$ kubectl apply -f loss_node_network_by_names.yaml
Run the following command to view the experiment status.
kubectl get blade loss-node-network-by-names -o json
3.3 Test and Verification
The port used here is the Guestbook NodePort. Requests sent to the node under experiment receive no response, while access through nodes not under experiment works normally.
Obtain the node IP address.
$ kubectl get node -o wide
Access the Guestbook through the experimental node: it is inaccessible.
$ telnet 192.168.0.20
Access the Guestbook through a non-experimental node: access is normal.
$ telnet 192.168.0.18
In addition, you can access the address directly from the browser and verify the test results.
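For a scripted check instead of the browser (not part of the original steps; replace <nodePort> with the Guestbook NodePort reported by kubectl get svc), a curl with a connection timeout behaves the same way: it times out against the experimental node and succeeds against the others.
$ curl --connect-timeout 5 http://192.168.0.20:<nodePort>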
Stop the test. You can delete the experiment configuration or delete the blade resource directly.
kubectl delete -f loss_node_network_by_names.yaml
kubectl delete blade loss-node-network-by-names
4. Kill the specified process
4.1 Test Objective
Kill the MySQL process on the specified node.
4.2 Starting the Test
Select a node and change the names value in kill_node_process_by_names.yaml.
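The kill_node_process_by_names.yaml manifest is not reproduced in the article; a sketch based on ChaosBlade's process kill action (a process matcher naming the target process) might look like:
apiVersion: chaosblade.io/v1alpha1
kind: ChaosBlade
metadata:
  name: kill-node-process-by-names
spec:
  experiments:
  - scope: node
    target: process
    action: kill
    desc: "kill node process"
    matchers:
    - name: names
      value:
      - "worker-s001"   # target node name
    - name: process
      value:
      - "mysql"         # name of the process to kill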
Run the following command to start the test.
$ kubectl apply -f kill_node_process_by_names.yaml
Run the following command to view the experiment status.
kubectl get blade kill-node-process-by-names -o json
4.3 Test and Verification
Log in to the experimental node.
$ ssh 192.168.0.18
Check the MySQL process ID.
$ ps -ef | grep mysql
root 10913 10040 0 14:10 pts/0 00:00:00 grep --color=auto mysql
Run the command again; you can see that the process ID has changed.
$ ps -ef | grep mysql
The change in the MySQL process ID indicates that the process was killed and then restarted.
Stop the test. You can delete the experiment configuration or delete the blade resource directly.
kubectl delete -f kill_node_process_by_names.yaml
kubectl delete blade kill-node-process-by-names
5. Stop the specified process
5.1 Test Objectives
Suspend the MySQL process on the specified node.
5.2 Starting the Test
Select a node and change the names value in stop_node_process_by_names.yaml.
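The stop_node_process_by_names.yaml manifest is likewise not reproduced; a sketch, differing from the kill manifest only in the action and name, might look like:
apiVersion: chaosblade.io/v1alpha1
kind: ChaosBlade
metadata:
  name: stop-node-process-by-names
spec:
  experiments:
  - scope: node
    target: process
    action: stop
    desc: "stop node process"
    matchers:
    - name: names
      value:
      - "worker-s001"   # target node name
    - name: process
      value:
      - "mysql"         # name of the process to suspend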
Run the following command to start the test.
$ kubectl apply -f stop_node_process_by_names.yaml
Run the following command to view the experiment status.
kubectl get blade stop-node-process-by-names -o json
5.3 Test and Verification
Log in to the experimental node.
$ ssh 192.168.0.18
Check the MySQL process.
$ ps -ef | grep mysql
root 10913 10040 0 14:10 pts/0 00:00:00 grep --color=auto mysql
Run the command again while the experiment is running.
$ ps -ef | grep mysql
Unlike the kill scenario, the MySQL process ID does not change: the process is suspended for the duration of the experiment (its state appears as T in the STAT column of ps aux) and resumes once the experiment is stopped.
Stop the test. You can delete the experiment configuration or delete the blade resource directly.
kubectl delete -f stop_node_process_by_names.yaml
kubectl delete blade stop-node-process-by-names
| Epilogue
Conducting chaos engineering experiments on KubeSphere Node resources with the ChaosBlade Operator leads to the following conclusions:
For Node-level resources, ChaosBlade still completes complex experiments with simple configuration and operation. Node-level faults of various kinds can be freely combined to verify the stability and availability of a Kubernetes cluster. And when a real fault does occur, because these fault situations have already been simulated, the source of the fault can be located quickly and handled calmly instead of in a panic.
Next up
The next installment will use the deployed ChaosBlade Operator tool to test and verify various scenarios for Pod-class resources.