This is a hands-on post by community user Fan Fan on Nebula performance testing and on tuning the data import process. The “I” in the following paragraphs refers to the author.

0. Summary

While researching Nebula and later performance-testing it for our use case, I consulted the Nebula official team several times; thanks to them for their support.

I have sorted out my test process here and hope it gives you some inspiration. If you have better suggestions, please don’t hesitate to comment!

1. Deploy the Nebula cluster

Four physical machines (1, 2, 3, and 4) are prepared. Each machine is configured with a 96-core CPU, 512 GB of memory, and SSD disks. Machine distribution:

  • 1: Meta, storage
  • 2: storage
  • 3: storage
  • 4: graphd

I won’t go into the installation process; I installed via RPM. I also installed nebula-importer 2.0 and nebula-bench 2.0.

2. Import data

The data imported this time has 7 tag (vertex) types and 15 edge types. The volume is not large and the structure is very simple: about 34 million records in total (3,400W), but the data does need to be pre-processed into that many separate vertex and edge tables in advance.
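To make the structure concrete, here is a hypothetical sketch of the DDL for a few of the tags and edge types that appear in the test query in section 3. The property types are my assumption, and the real schema has 7 tags and 15 edge types in total.

    # hypothetical schema sketch, not the author's actual DDL
    CREATE TAG email(value1 string);
    CREATE TAG id(isblack int);
    CREATE TAG phone();
    CREATE EDGE emailid();
    CREATE EDGE phoneid();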

Create the space with the vid length set to 100, replica_factor = 3, and partition_num = 100.
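For reference, with these settings the space creation looks roughly like the following nGQL. This is a sketch: the space name test is a placeholder, and in Nebula 2.0 the vid length is expressed as a fixed-string vid type.

    CREATE SPACE test (partition_num = 100, replica_factor = 3, vid_type = FIXED_STRING(100));
    USE test;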

Nebula Importer data import optimization

Running nebula-importer directly with its default settings only reached about 30,000 records/s (3W/s), which is too slow. Looking at the import config file, the only parameters worth tuning are concurrency, channelBufferSize, and batchSize.

I first adjusted them more or less by trial and error, but the effect was not obvious, so I posted on the forum to ask for advice (the thread: “nebula-importer 2.0 import is slow”). The settings I ended up with:

    concurrency: 96              # number of CPU cores
    channelBufferSize: 20000
    batchSize: 2500

The speed rose to about 70,000–80,000 records/s (7–8W/s), which is a lot faster. But if these values are pushed much higher, graphd will crash, so the idea is to make the parameters as large as possible while staying within what the cluster can handle.
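For context, here is roughly where those three knobs live in a nebula-importer v2 YAML config. This is only a sketch based on the importer’s v2 config layout; the space name, connection address, file path, and schema mapping are placeholders, not the author’s actual file.

    # nebula-importer v2 config (sketch; placeholder values)
    version: v2
    clientSettings:
      concurrency: 96             # worker goroutines, set to the CPU core count
      channelBufferSize: 20000    # buffered records per import channel
      space: test
      connection:
        user: root
        password: nebula
        address: 192.168.0.4:9669
    files:
      - path: ./email_vertex.csv
        batchSize: 2500           # records batched into one INSERT statement
        type: csv
        csv:
          withHeader: false
        schema:
          type: vertex
          # ... tag/property mapping omitted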

Then I checked the disk and network and found that, surprisingly, the machines were actually on mechanical disks and an ordinary network... After switching to SSDs and a 10 GbE network, the speed more than doubled, to about 170,000 records/s (17W/s). Hardware clearly matters a lot.

Next I looked at the vid and partition_num. The vid length (100) is on the long side and I wanted to shorten it, but the actual data really is that long, so it could not be changed. As for partition_num, the official guideline is 2 to 10 times the number of disks, so I changed it to 15, and it did make a difference: the speed reached 250,000 records/s (25W/s). At this point I was satisfied. Further tuning might still squeeze out more, but the requirement was already met, so I stopped here.
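Since partition_num cannot be modified on an existing space in Nebula 2.0, lowering it means dropping and re-creating the space (and importing the data again), roughly like this sketch with the placeholder space name test:

    DROP SPACE IF EXISTS test;
    CREATE SPACE test (partition_num = 15, replica_factor = 3, vid_type = FIXED_STRING(100));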

Summary

  • Set concurrency to the number of CPU cores; make channelBufferSize and batchSize as large as possible, but not beyond what the cluster can bear.
  • Use SSDs and a 10 GbE network.
  • Choose a reasonable partition_num for the space; don’t make it too large.
  • I suspect the vid length, the number of properties, and the number of graphd processes also have an effect, but I haven’t tried them yet.

3. Stress test

From the query patterns used in our business, I picked one for testing. The statement is as follows:

match (v:email)-[:emailid]->(mid:id)<-[:phoneid]-(phone:phone)-[:phoneid]->(ids:id) where id(v)=="replace" with v, count(distinct phone) as pnum,count(distinct mid) as midnum,count(distinct ids) as idsnum , sum(ids.isblack) as black  where pnum > 2 and midnum>5 and midnum < 100 and idsnum > 5 and idsnum < 300 and black > 0 return v.value1, true as result

The statement is a three-hop expansion plus conditional filtering; for the concentrated (“hot”) part of the data, the number of vertices involved is about 200–400.

In the JMeter JMX configuration file, change ThreadGroup.num_threads to the number of CPU cores, and set the other parameters, such as the loop count and the nGQL statement, as required. The variable position in the nGQL statement should be written as the placeholder replace (as in the query above).

Because this part of the test data is relatively concentrated, the result here is about 700 queries/s; when the test is expanded to all vertices, it reaches 6,000+/s. The concurrency looks fine, and the query latency is acceptable, at up to 300 ms.

Since I had only a single graphd node, I wanted to add another graphd to see whether concurrency improved. I simply started another graphd process and re-ran the test, but saw no improvement.

Then, since 2.0.1 had just been released, I rebuilt the cluster, imported the data again, and used 3 graphd nodes. Performance roughly tripled: the concentrated data reached 2,100+ queries/s, and all vertices reached nearly 20,000/s (2W/s). So it is odd that simply adding a graphd node to the old cluster gave nebula-bench 2.0 no improvement at all.

My guess is that it was because I did not run a balance or compact after adding the graphd; you can try that when you have time.
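For reference, rebalancing and compaction are triggered from the console with statements like these (a Nebula 2.x sketch; test is a placeholder space name):

    # redistribute partitions and leaders across the storaged nodes
    BALANCE DATA;
    BALANCE LEADER;
    SHOW HOSTS;

    # trigger a full compaction for the current space and watch the job
    USE test;
    SUBMIT JOB COMPACT;
    SHOW JOBS;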

In addition, because I did not deploy any monitoring components and only watched the machines with Linux commands, I don’t have very precise machine status information.

Summary

  • Before testing, make sure the cluster is balanced and compacted.
  • Adjust the storaged configuration to increase the number of available threads and the size of the cache memory.
  • Concurrency results depend heavily on the data, so the absolute numbers mean little on their own; interpret them against your own data distribution.

4. Configuration

The metad and graphd parameters are left at the defaults; there is nothing special to change. I will just paste the storaged configuration and explain it.

    rocksdb_block_cache = 100G        # RocksDB block cache
    num_io_threads = 48               # number of I/O threads
    min_vertices_per_bucket = 100     # minimum number of vertices in a bucket
    vertex_cache_bucket_exp = 8       # the total number of buckets is 2^8
    wal_buffer_size = 16777216        # 16 MB
    write_buffer_size = 268435456     # 256 MB

The parameter values here come from browsing various forum posts and searching the official code, so they may not be perfectly accurate and are still somewhat exploratory. Other parameters were not specially modified. Many parameters are not exposed in the config file and should not be changed casually; if you need to understand them, check the source code on GitHub.
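As a quick sanity check, the configuration items that storaged registers with the meta service can be listed from the console. This is a Nebula 2.x sketch; note that most flags are only read from the conf file at startup, so changes there need a storaged restart to take effect.

    # list the storaged configuration items known to the cluster
    SHOW CONFIGS storage;

    # some mutable items can also be updated at runtime, for example (illustrative value):
    UPDATE CONFIGS storage:wal_ttl = 14400;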

In closing

Overall, this test was not particularly professional; it only tested Nebula against our specific business scenario. I have not studied the tuning of individual parameters thoroughly yet and will keep exploring them in later use. If you have good ideas about tuning, please feel free to share them.
