Background
Performance testing is routine for databases. Whenever you optimize query engine rules or adjust storage engine parameters, you need performance tests to check how the system is affected in different scenarios.
Even with the same code and the same parameter configuration, results differ greatly across machine resource configurations and business scenarios. This document records our internal stress testing practice as a reference.
The operating system used in this document is CentOS 7.8 on x86.
Each Nebula machine has 4 CPU cores, 16 GB of memory, SSD disks, and a 10-gigabit network.
Tools
- nebula-ansible: deploys the Nebula services
- nebula-importer: imports data into the Nebula cluster
- k6-plugin: a k6 load-testing plugin that uses the Go client to send requests to the Nebula cluster
- nebula-bench: integrates LDBC dataset generation, data import, and stress testing
- ldbc_snb_datagen_hadoop: the LDBC data generation tool
Overview
The data is the LDBC dataset generated automatically by ldbc_snb_datagen. The overall process is shown below.
The deployment topology uses 1 machine as the load machine and 3 machines as the Nebula cluster.
To facilitate monitoring, the load machine also runs:
- Prometheus
- InfluxDB
- Grafana
- node-exporter
The Nebula machines also run:
- node-exporter
- process-exporter
Specific steps
Deploy Nebula using nebula-ansible
- Initialize the user and set up SSH access
- Log in to 192.168.8.60, 192.168.8.61, 192.168.8.62, and 192.168.8.63, create the vesoft user, add it to sudoers, and set NOPASSWD.
- Log in to 192.168.8.60 and set up passwordless SSH to the other machines
ssh-keygen
ssh-copy-id vesoft@192.168.8.61
ssh-copy-id vesoft@192.168.8.62
ssh-copy-id vesoft@192.168.8.63
- Download nebula-ansible, install Ansible, and modify the Ansible configuration
sudo yum install ansible -y
git clone https://github.com/vesoft-inc/nebula-ansible
cd nebula-ansible/
# The default CDN is the international CDN; switch to the China CDN
sed -i 's/oss-cdn.nebula-graph.io/oss-cdn.nebula-graph.com.cn/g' group_vars/all.yml
Example inventory.ini
[all:vars]
# GA or nightly
install_source_type = GA
nebula_version = 2.0.1
os_version = el7
arc = x86_64
pkg = rpm
packages_dir = {{ playbook_dir }}/packages
deploy_dir = /home/vesoft/nebula
data_dir = {{ deploy_dir }}/data
# ssh user
ansible_ssh_user = vesoft
force_download = False
[metad]
192.168.8.[61:63]
[graphd]
192.168.8.[61:63]
[storaged]
192.168.8.[61:63]
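Ansible inventories accept inclusive numeric ranges such as 192.168.8.[61:63] in host entries. The sketch below shows which hosts such a pattern stands for. The helper name expand_hosts is hypothetical, and it is a simplification that handles only a single plain numeric range (no step, zero padding, or alphabetic ranges); it is not Ansible's own parser.

```python
import re
from typing import List


def expand_hosts(pattern: str) -> List[str]:
    """Expand one inclusive numeric range like 'a.b.c.[1:3]' into host names.

    Hypothetical helper for illustration; Ansible's real parser supports more.
    """
    match = re.search(r"\[(\d+):(\d+)\]", pattern)
    if match is None:
        return [pattern]  # plain host, nothing to expand
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    # Ansible ranges are inclusive on both ends, hence end + 1.
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1)]


if __name__ == "__main__":
    for host in expand_hosts("192.168.8.[61:63]"):
        print(host)
```

Running it against the pattern used for the metad, graphd, and storaged groups lists the three cluster machines, which is a quick way to double-check an inventory before running the playbooks.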
- Install and start nebula
ansible-playbook install.yml
ansible-playbook start.yml
Deploy monitoring
For ease of deployment, Docker and docker-compose must be installed on the machines.
Log in to the load machine 192.168.8.60
git clone https://github.com/vesoft-inc/nebula-bench.git
cd nebula-bench
cp -r third/promethues ~/.
cp -r third/exporter ~/.
cd ~/exporter/ && docker-compose up -d
cd ~/promethues
# Modify the exporter addresses of the monitored nodes
# vi prometheus.yml
docker-compose up -d
# Copy the exporter directory to 192.168.8.61, 192.168.8.62, and 192.168.8.63, then start docker-compose there
Configure Grafana's data source and dashboards; see github.com/vesoft-inc/… for details.
Generate the LDBC dataset
cd nebula-bench
sudo yum install -y git \
    make \
    file \
    libev \
    libev-devel \
    gcc \
    wget \
    python3 \
    python3-devel \
    maven
pip3 install --user -r requirements.txt
# Generates sf1 by default: 1 GB of data, 3 million+ vertices, 17 million+ edges
python3 run.py data
# Move the generated data
mv target/data/test_data/ ./sf1
Import data
cd nebula-bench
# Modify .env
cp env .env
vi .env
Here is an example .env:
DATA_FOLDER=sf1
NEBULA_SPACE=sf1
NEBULA_USER=root
NEBULA_PASSWORD=nebula
NEBULA_ADDRESS=192.168.8.61:9669,192.168.8.62:9669,192.168.8.63:9669
#NEBULA_MAX_CONNECTION=100
INFLUXDB_URL=http://192.168.8.60:8086/k6
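To sanity-check a .env file before importing, a small script can parse it and list the graphd endpoints. This is a minimal sketch and not how nebula-bench itself reads the file: it assumes NEBULA_ADDRESS is a comma-separated list of host:port pairs, and the function names parse_env and split_addresses are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def split_addresses(raw: str) -> List[Tuple[str, int]]:
    """Split 'host:port,host:port' into (host, port) pairs (assumed format)."""
    pairs = []
    for item in raw.split(","):
        host, port = item.strip().rsplit(":", 1)
        pairs.append((host, int(port)))
    return pairs


# Subset of the example .env, inlined so the sketch is self-contained.
SAMPLE = """\
DATA_FOLDER=sf1
NEBULA_ADDRESS=192.168.8.61:9669,192.168.8.62:9669,192.168.8.63:9669
#NEBULA_MAX_CONNECTION=100
INFLUXDB_URL=http://192.168.8.60:8086/k6
"""

if __name__ == "__main__":
    env = parse_env(SAMPLE)
    for host, port in split_addresses(env["NEBULA_ADDRESS"]):
        print(host, port)
```

To check a real file, pass the contents of .env to parse_env instead of SAMPLE.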
# Compile nebula-importer and k6
./scripts/setup.sh
# import data
python3 run.py nebula importer
During the import, keep an eye on network bandwidth and disk I/O writes.
Run the stress test
python3 run.py stress run
JS files are rendered automatically from the code in the scenarios directory, and k6 then stress-tests every scenario.
After execution, the JS files and the stress test results are in the output folder.
latency is the latency reported by the server, and responseTime is the time measured by the client from calling execute to receiving the response (unit: µs).
[vesoft@qa-60 nebula-bench]$ more output/result_Go1Step.json
{
    "metrics": {
        "data_sent": {
            "count": 0,
            "rate": 0
        },
        "checks": {
            "passes": 1667632,
            "fails": 0,
            "value": 1
        },
        "data_received": {
            "count": 0,
            "rate": 0
        },
        "iteration_duration": {
            "min": 0.610039,
            "avg": 3.589942336582023,
            "med": 2.9560145,
            "max": 1004.232905,
            "p(90)": 6.351617299999998,
            "p(95)": 7.997563949999995,
            "p(99)": 12.121579809999997
        },
        "latency": {
            "min": 308,
            "avg": 2266.528722763775,
            "med": 1867,
            "p(90)": 3980,
            "p(95)": 5060,
            "p(99)": 7999
        },
        "responseTime": {
            "max": 94030,
            "p(90)": 6177,
            "p(95)": 7778,
            "p(99)": 11616,
            "min": 502,
            "avg": 3437.376111156418,
            "med": 2831
        },
        "iterations": {
            "count": 1667632,
            "rate": 27331.94978169588
        },
        "vus": {
            "max": 100,
            "value": 100,
            "min": 0
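Because latency and responseTime are reported in microseconds while iteration_duration is in milliseconds, it can help to convert them before comparing. The following is a minimal sketch of that conversion; the function name summarize is hypothetical, and the inlined sample holds only a few values copied from the result above.

```python
import json
from typing import Dict, Iterable


def summarize(result: dict, names: Iterable[str] = ("latency", "responseTime")) -> Dict[str, Dict[str, float]]:
    """Return {metric: {stat: milliseconds}} for microsecond-valued metrics."""
    metrics = result["metrics"]
    return {
        name: {stat: round(value / 1000, 3) for stat, value in metrics[name].items()}
        for name in names
        if name in metrics
    }


# A few values copied from the Go1Step result; the real file has more fields.
SAMPLE = json.loads("""
{"metrics": {
  "latency":      {"min": 308, "med": 1867, "p(99)": 7999},
  "responseTime": {"min": 502, "med": 2831, "p(99)": 11616}
}}
""")

if __name__ == "__main__":
    for name, stats in summarize(SAMPLE).items():
        print(name, stats)
```

To use it on a real run, load the exported file with json.load and pass the resulting dictionary to summarize.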
[vesoft@qa-60 nebula-bench]$ head -300 output/output_Go1Step.csv | grep -v USE
timestamp,nGQL,latency,responseTime,isSucceed,rows,errorMsg
1628147822,GO 1 STEP FROM 4398046516514 OVER KNOWS, 1217153 (6),true,1,
1628147822,GO 1 STEP FROM 2199023262994 OVER KNOWS,1388,1829,true,94,
1628147822,GO 1 STEP FROM 1129 OVER KNOWS,1488,2875,true,12,
1628147822,GO 1 STEP FROM 6597069771578 OVER KNOWS,1139,1647,true,30,
1628147822,GO 1 STEP FROM 2199023261211 OVER KNOWS,1399,2096,true,6,
1628147822,GO 1 STEP FROM 2199023256684 OVER KNOWS,1377,2202,true,4,
1628147822,GO 1 STEP FROM 4398046515995 OVER KNOWS,1487,2017,true,39,
1628147822,GO 1 STEP FROM 10995116278700 OVER KNOWS,true,3,
1628147822,GO 1 STEP FROM 933 OVER KNOWS,1130,true,5,
1628147822,GO 1 STEP FROM 6597069771971 OVER KNOWS,true,60,
1628147822,GO 1 STEP FROM 10995116279952 OVER KNOWS,1221,1758,true,3,
1628147822,GO 1 STEP FROM 8796093031179 OVER KNOWS,true,13,
1628147822,GO 1 STEP FROM 10995116279792 OVER KNOWS,1115,1858,true,6,
1628147822,GO 1 STEP FROM 6597069777326 OVER KNOWS,1223,2016,true,4,
1628147822,GO 1 STEP FROM 8796093028089 OVER KNOWS,1361,2054,true,13,
1628147822,GO 1 STEP FROM 6597069777454 OVER KNOWS,1219,2116,true,2,
1628147822,GO 1 STEP FROM 13194139536109 OVER KNOWS,true,2,
1628147822,GO 1 STEP FROM 10027 OVER KNOWS,2212,3016,true,83,
1628147822,GO 1 STEP FROM 13194139544176 OVER KNOWS,855,1478,true,29,
1628147822,GO 1 STEP FROM 10995116280047 OVER KNOWS,1874,2211,true,12,
1628147822,GO 1 STEP FROM 15393162797860 OVER KNOWS,714,1684,true,5,
1628147822,GO 1 STEP FROM 6597069770517 OVER KNOWS,true,7,
1628147822,GO 1 STEP FROM 17592186050570 OVER KNOWS,768,1630,true,26,
1628147822,GO 1 STEP FROM 8853 OVER KNOWS,2773,3509,true,14,
1628147822,GO 1 STEP FROM 19791209307908 OVER KNOWS,1022,1556,true,6,
1628147822,GO 1 STEP FROM 13194139544258 OVER KNOWS,true,91,
1628147822,GO 1 STEP FROM 10995116285325 OVER KNOWS,1901,2556,true,0,
1628147822,GO 1 STEP FROM 6597069774931 OVER KNOWS,true,152,
1628147822,GO 1 STEP FROM 8796093025056 OVER KNOWS,2007,2728,true,29,
1628147822,GO 1 STEP FROM 21990232560726 OVER KNOWS,1639,2364,true,9,
1628147822,GO 1 STEP FROM 8796093030318 OVER KNOWS,2145,2851,true,6,
1628147822,GO 1 STEP FROM 21990232556027 OVER KNOWS,1784,2554,true,5,
1628147822,GO 1 STEP FROM 15393162796879 OVER KNOWS,2621,3184,true,71,
1628147822,GO 1 STEP FROM 17592186051113 OVER KNOWS,2052,2990,true,5,
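The gap between responseTime and latency in each row is the time spent outside the server (network plus client). A rough sketch of computing it from the CSV rows follows. The function name parse_row is hypothetical, the three sample rows are copied from the output above, and the parser assumes errorMsg contains no commas; it splits from the right so that commas inside the nGQL statement do no harm.

```python
from statistics import mean


def parse_row(line: str) -> dict:
    """Parse one output row; the last five fields are split from the right."""
    head, latency, response_time, succeeded, rows, error = line.rstrip("\n").rsplit(",", 5)
    timestamp, ngql = head.split(",", 1)
    return {
        "timestamp": int(timestamp),
        "nGQL": ngql,
        "latency": int(latency),            # server side, microseconds
        "responseTime": int(response_time),  # client side, microseconds
        "isSucceed": succeeded == "true",
        "rows": int(rows),
        "errorMsg": error,
    }


# Three rows copied from the Go1Step output.
SAMPLE = [
    "1628147822,GO 1 STEP FROM 2199023262994 OVER KNOWS,1388,1829,true,94,",
    "1628147822,GO 1 STEP FROM 2199023261211 OVER KNOWS,1399,2096,true,6,",
    "1628147822,GO 1 STEP FROM 2199023256684 OVER KNOWS,1377,2202,true,4,",
]

if __name__ == "__main__":
    parsed = [parse_row(line) for line in SAMPLE]
    overhead = [row["responseTime"] - row["latency"] for row in parsed]
    print("mean client-side overhead (us):", mean(overhead))
```

A large or growing overhead relative to latency suggests the bottleneck is on the load machine or the network rather than in the Nebula cluster.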
You can also test a single scenario and repeatedly adjust configuration parameters to compare results.
Concurrent reads
Run the 2-hop GO scenario with 50 concurrent users for 300 seconds
python3 run.py stress run -scenario go.Go2Step -vu 50 -d 300
INFO[0302] 2021/08/06 03:55:27 [INFO] Finish init the pool

     ✓ IsSucceed

     █ setup

     █ teardown

     checks...............: 100.00% ✓ 1559930 ✗ 0
     data_received........: 0 B 0 B/s
     data_sent............: 0 B 0 B/s
     iteration_duration...: min=687.47µs avg=9.6ms med=8.04ms max=1.03s p(90)=18.41ms p(95)=22.58ms p(99)=31.87ms
     iterations...........: 1559930 5181.432199/s
     latency..............: min=398 avg=6847.850345 med=5736 max=222542 p(90)=13046 p(95)=16217 p(99)=23448
     responseTime.........: min=603 avg=9460.857877 med=7904 max=226992 p(90)=18262 p(95)=22429 p(99)=31726.71
     vus..................: 50 min=0 max=50
     vus_max..............: 50 min=50 max=50
Meanwhile, you can watch the monitoring metrics.
checks verifies whether each request executed successfully. If a request fails, its error message is saved in the CSV file.
awk -F ',' '{print $NF}' output/output_Go2Step.csv|sort |uniq -c
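The same tally can be done in Python when awk is not at hand. This is a sketch that mirrors the awk command by counting the last comma-separated field of each line; the function name count_last_field is hypothetical, and the sample rows (including the error text) are made up for illustration.

```python
from collections import Counter
from typing import Iterable


def count_last_field(lines: Iterable[str]) -> Counter:
    """Count values of the last comma-separated field, like awk's $NF."""
    return Counter(
        line.rstrip("\n").rsplit(",", 1)[-1]
        for line in lines
        if line.strip()  # ignore blank lines
    )


# Made-up rows in the output CSV layout; an empty last field means success.
SAMPLE = [
    "timestamp,nGQL,latency,responseTime,isSucceed,rows,errorMsg",
    "1628147822,GO 2 STEPS FROM 933 OVER KNOWS,1130,1500,true,5,",
    "1628147822,GO 2 STEPS FROM 1129 OVER KNOWS,1488,2875,false,0,hypothetical error",
]

if __name__ == "__main__":
    for message, count in count_last_field(SAMPLE).most_common():
        print(count, repr(message))
```

For a real run, open output/output_Go2Step.csv and pass the file object to count_last_field. As with the awk version, the header line shows up once as errorMsg.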
Run the 2-hop GO scenario with 200 concurrent users for 300 seconds
python3 run.py stress run -scenario go.Go2Step -vu 200 -d 300
INFO[0302] 2021/08/06 04:02:34 [INFO] Finish init the pool

     ✓ IsSucceed

     █ setup

     █ teardown

     checks...............: 100.00% ✓ 1866850 ✗ 0
     data_received........: 0 B 0 B/s
     data_sent............: 0 B 0 B/s
     iteration_duration...: min=724.77µs avg=32.12ms med=25.56ms max=1.03s p(90)=63.07ms p(95)=84.52ms p(99)=123.92ms
     iterations...........: 1866850 6200.23481/s
     latency..............: min=395 avg=25280.893558 med=20411 max=312781 p(90)=48673 p(95)=64758 p(99)=97993.53
     responseTime.........: min=627 avg=31970.234329 med=25400 max=340299 p(90)=62907 p(95)=84361.55 p(99)=123750
     vus..................: 200 min=0 max=200
     vus_max..............: 200 min=200 max=200
k6 monitoring data in Grafana
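One way to read the two Go2Step runs together is a closed-loop estimate: each virtual user issues its next request only after the previous one returns, so throughput is roughly the number of virtual users divided by the average iteration duration. The sketch below applies that estimate to the numbers reported above. It is a back-of-the-envelope check under that assumption, not something k6 or nebula-bench computes, and the function name expected_rate is hypothetical.

```python
def expected_rate(vus: int, avg_iteration_ms: float) -> float:
    """Closed-loop throughput estimate in iterations per second."""
    return vus / (avg_iteration_ms / 1000.0)


RUNS = [
    # (virtual users, avg iteration_duration in ms, iterations/s reported by k6)
    (50, 9.6, 5181.432199),
    (200, 32.12, 6200.23481),
]

if __name__ == "__main__":
    for vus, avg_ms, reported in RUNS:
        estimate = expected_rate(vus, avg_ms)
        print(f"{vus} VUs: estimated {estimate:.0f}/s, reported {reported:.0f}/s")
```

Going from 50 to 200 virtual users raises the reported throughput by only about 20% while the average iteration time more than triples, which hints that the cluster is already close to saturation at 50 concurrent users on this hardware.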
Concurrent writes
Run the insert scenario with 200 concurrent users for 300 seconds; the default batchSize is 100
python3 run.py stress run -scenario insert.InsertPersonScenario -vu 200 -d 300
You can manually modify the JS file to adjust batchSize:
sed -i 's/batchSize = 100/batchSize = 300/g' output/InsertPersonScenario.js
# Run k6 manually
scripts/k6 run output/InsertPersonScenario.js -u 400 -d 30s --summary-trend-stats "min,avg,med,max,p(90),p(95),p(99)" --summary-export output/result_InsertPersonScenario.json --out influxdb=http://192.168.8.60:8086/k6
With batchSize set to 300 and 400 concurrent users, some requests fail.
INFO[0032] 2021/08/06 04:03:49 [INFO] Finish init the pool

     ✗ IsSucceed
      ↳ 96% -- ✓ 31257 / ✗ 1103

     █ setup

     █ teardown

     checks...............: 96.59% ✓ 31257 ✗ 1103
     data_received........: 0 B 0 B/s
     data_sent............: 0 B 0 B/s
     iteration_duration...: min=12.56ms avg=360.11ms med=319.12ms max=2.07s p(90)=590.31ms p(95)=696.69ms p(99)=958.32ms
     iterations...........: 32360 1028.339207/s
     latency..............: min=4642 avg=206931.543016 med=206162 max=915671 p(90)=320397.4 p(95)=355798.7 p(99)=459521.39
     responseTime.........: min=6272 avg=250383.122188 med=239297.5 max=1497159 p(90)=384190.5 p(95)=443439.6 p(99)=631460.92
     vus..................: 400 min=0 max=400
     vus_max..............: 400 min=400 max=400
awk -F ',' '{print $NF}' output/output_InsertPersonScenario.csv|sort |uniq -c
31660
1103 error: E_CONSENSUS_ERROR(-16)."
1 errorMsg
The failures are E_CONSENSUS_ERROR, which probably means Raft's appendLog buffer overflowed.
Conclusion
- Using LDBC as the standard dataset gives more standardized data characteristics. You can generate more data, for example one billion vertices, while the data structure stays the same.
- Using k6 as the load tool: a single binary is more convenient than JMeter, and because k6 is built on Golang goroutines, it uses fewer resources than JMeter.
- With these tools, you can simulate various scenarios or tune Nebula's parameters to make better use of server resources.
The complete guide to the Nebula graph database: docs.nebula-graph.com.cn/site/pdf/Ne…