Preface

WeChat official account: Xiao Lei

When you have worked hard enough, luck will meet you unexpectedly.

Background

The company uses Redis as its data store and loads 800 million records into it every day, so Redis write performance needed to be measured. Here is how I did it, along with some of the pitfalls I ran into.

Test requirements

The test uses multiple threads to insert data on the scale of hundreds of millions of records into Redis. The target is to insert 800 million records using 10 threads.

Server performance

The test ran on my own virtual machine:

Parameter            Linux command                          Value
System               cat /etc/redhat-release                CentOS Linux release 7.5.1804 (Core)
Memory               free -h                                total: 3.7G, available: 3.3G
CPU cores            cat /proc/cpuinfo                      cpu cores: 2
CPU frequency (MHz)  cat /proc/cpuinfo | grep MHz | uniq    1991.999

Part One: Inserting with plain Jedis

Insert 10,000 records

1. Single thread:

// Assumes JUnit 5 and the Jedis client are on the classpath.
import org.junit.jupiter.api.Test;
import redis.clients.jedis.Jedis;

@Test
void exec() throws InterruptedException {
    Jedis jedis = new Jedis("192.168.44.101", 6379);
    jedis.flushDB(); // start from an empty database
    new Thread() {
        @Override
        public void run() {
            String key = "32021420001:90000300009999:10001:1601198414621:";
            long startTime = System.currentTimeMillis();
            // One SET per iteration: every call waits for its own reply.
            for (int j = 1; j <= 10000; j++) {
                jedis.set(key + j, key + j);
            }
            long endTime = System.currentTimeMillis();
            System.out.println("exec time : " + currentThread().getName() + ":" + (endTime - startTime));
        }
    }.start();
    System.out.println(Thread.currentThread().getName());
    Thread.sleep(40000); // keep the test alive until the worker finishes
}

A single thread takes about 4.7 s to insert 10,000 records:

main
exec time : Thread-2:4784

2. Two threads

@Test
void exec() throws InterruptedException {
    // One connection per thread; a Jedis instance must not be shared between threads.
    Jedis jedis = new Jedis("192.168.44.101", 6379);
    Jedis jedis2 = new Jedis("192.168.44.101", 6379);
    jedis.flushDB();
    new Thread() {
        @Override
        public void run() {
            String key = "32021420001:90000300009999:10001:1601198414621:";
            long startTime = System.currentTimeMillis();
            // First half of the keys.
            for (int j = 1; j <= 5000; j++) {
                jedis.set(key + j, key + j);
            }
            long endTime = System.currentTimeMillis();
            System.out.println("exec time : " + currentThread().getName() + ":" + (endTime - startTime));
        }
    }.start();
    new Thread() {
        @Override
        public void run() {
            String key = "32021420001:90000300009999:10001:1601198414621:";
            long startTime = System.currentTimeMillis();
            // Second half of the keys.
            for (int j = 5001; j <= 10000; j++) {
                jedis2.set(key + j, key + j);
            }
            long endTime = System.currentTimeMillis();
            System.out.println("exec time : " + currentThread().getName() + ":" + (endTime - startTime));
        }
    }.start();
    System.out.println(Thread.currentThread().getName());
    Thread.sleep(40000);
}

Two threads take about 1.2 s to insert 10,000 records:

main
exec time : Thread-2:1266
exec time : Thread-3:1277

3. Four threads: inserting 10,000 records takes just over 800 ms

main
exec time : Thread-4:877
exec time : Thread-2:878
exec time : Thread-3:879
exec time : Thread-5:884

4. Ten threads: inserting 10,000 records takes just over 500 ms

main
exec time : Thread-2:573
exec time : Thread-4:573
exec time : Thread-3:575
exec time : Thread-8:574
exec time : Thread-11:572
exec time : Thread-5:576
exec time : Thread-7:575
exec time : Thread-10:573
exec time : Thread-6:576
exec time : Thread-9:588

[Conclusion analysis]: The timing curve behaves as expected: the more threads, the less time the insert takes. But Redis claims a throughput of 100,000 operations per second, and this plain way of inserting falls far short of that. The main reason is that every set() call is its own request over the connection and has to wait for its reply before the next one is sent. Each request also travels in its own packets, and the network delay of those packets is not guaranteed to be consistent. All of this hurts performance when inserting large amounts of data.
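
To put these timings in perspective, the sketch below converts them into operations per second. It takes the slowest thread's time from each run above as the wall-clock time, which is only an approximation because the threads do not start at exactly the same moment. The class and method names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PlainInsertThroughput {

    /** Operations per second when `records` inserts finish in `millis` ms. */
    static double opsPerSecond(long records, long millis) {
        return records * 1000.0 / millis;
    }

    public static void main(String[] args) {
        // Thread count -> slowest thread time in ms, copied from the runs above.
        Map<Integer, Long> slowestMillis = new LinkedHashMap<>();
        slowestMillis.put(1, 4784L);
        slowestMillis.put(2, 1277L);
        slowestMillis.put(4, 884L);
        slowestMillis.put(10, 588L);

        slowestMillis.forEach((threads, millis) ->
                System.out.printf("%2d thread(s): about %.0f ops/s%n",
                        threads, opsPerSecond(10_000, millis)));
    }
}
```

By this estimate even the ten-thread run reaches only about 17,000 operations per second.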

Part Two: Inserting with a Jedis pipeline

2.1 Pipeline background

A Redis client command goes through four steps: send command -> queue command -> execute command -> return result

The time this whole process takes is called the round-trip time (RTT). Batch commands such as mget and mset save round trips. I think of it as analogous to fetching data from MySQL: the I/O caused by many separate requests degrades performance. However, most commands do not support batch operation, so N commands cost N round trips. This is the problem that pipeline solves.

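2.2 Pipeline performance

1. Executing N commands without a pipeline costs N round trips.

2. Executing N commands with a pipeline sends them together and reads the replies together, so the whole batch costs roughly one round trip.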
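
As a rough model of the two cases, assume every command costs one network round trip plus a fixed server-side execution time, and that a pipelined batch pays the round trip only once. Both inputs below (a 1 ms round trip and 0.01 ms of execution per command) are made-up values for illustration, not measurements from this test, and a real pipeline also spends time on serialization and buffering that this model ignores.

```java
final class RttModel {

    /** N commands sent one at a time: each one waits for its own round trip. */
    static double withoutPipelineMillis(long n, double rttMillis, double execMillis) {
        return n * (rttMillis + execMillis);
    }

    /** N commands in one pipelined batch: the round trip is paid once. */
    static double withPipelineMillis(long n, double rttMillis, double execMillis) {
        return rttMillis + n * execMillis;
    }

    public static void main(String[] args) {
        long n = 10_000;
        double rtt = 1.0;    // hypothetical round-trip time in ms
        double exec = 0.01;  // hypothetical per-command execution time in ms
        System.out.printf("without pipeline: %.0f ms%n", withoutPipelineMillis(n, rtt, exec));
        System.out.printf("with pipeline:    %.0f ms%n", withPipelineMillis(n, rtt, exec));
    }
}
```

With these inputs the model gives about 10,100 ms without a pipeline and about 101 ms with one; the gap grows with the round-trip time.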

2.3 Test: insert 10,000 records

Single thread:

// Additional imports: redis.clients.jedis.Pipeline, java.util.stream.IntStream
@Test
void aaaa() throws InterruptedException {
    Jedis jedis = new Jedis("192.168.44.101", 6379);
    jedis.flushDB(); // flush before opening the pipeline
    Pipeline pipelined = jedis.pipelined();
    new Thread() {
        @Override
        public void run() {
            String key = "32021420001:90000300009999:10001:1601198414621:";
            long startTime = System.currentTimeMillis();
            // Queue 10,000 SET commands without waiting for individual replies.
            IntStream.range(0, 10000).forEach(i -> pipelined.set(key + i, String.valueOf(i)));
            // Send the queued commands and read all replies in one go.
            pipelined.syncAndReturnAll();
            long endTime = System.currentTimeMillis();
            System.out.println("exec time : " + currentThread().getName() + ":" + (endTime - startTime));
        }
    }.start();
    Thread.sleep(40000);
}

Surprisingly, the time dropped to a few tens of milliseconds:

exec time : Thread-2:55

Two thread tests:

Inserting 10,000 records now takes about 70 ms:

exec time : Thread-3:71
exec time : Thread-2:72

Four threads:

You can see the time is getting longer:

exec time : Thread-4:100
exec time : Thread-5:101
exec time : Thread-2:103
exec time : Thread-3:103

Guess: the data volume may be too small, so the overhead of handling I/O across multiple threads takes up too large a share of the time. Keep testing!

2.4 Increasing the data to 10 million records

/**
 * Single-threaded pipeline insert of 10 million records.
 */
@Test
void aaa() throws InterruptedException {
    Jedis jedis = new Jedis("192.168.44.101", 6379);
    jedis.flushDB();
    Pipeline pipeline = jedis.pipelined();
    new Thread() {
        @Override
        public void run() {
            String key = "32021420001:90000300009999:10001:1601198414621:";
            long startTime = System.currentTimeMillis();
            // Queue all 10 million SET commands, then sync once at the end.
            IntStream.range(0, 10000000).forEach(i -> pipeline.set(key + i, String.valueOf(i)));
            pipeline.syncAndReturnAll();
            long endTime = System.currentTimeMillis();
            System.out.println("exec time : " + currentThread().getName() + ":" + (endTime - startTime));
        }
    }.start();
    Thread.sleep(500000);
}

After three runs, the conclusion is that inserting 10 million records from a single thread takes about 70 s.
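
A back-of-the-envelope extrapolation from this result to the 800 million records per day in the requirements is sketched below. It assumes the insert rate stays constant as the data set grows, which this test does not verify, so the output is an estimate rather than a measurement. The class and method names are illustrative.

```java
final class DailyLoadEstimate {

    /** Records per second achieved when `records` were written in `seconds`. */
    static double recordsPerSecond(long records, double seconds) {
        return records / seconds;
    }

    /** Seconds needed to write `target` records at a constant rate. */
    static double secondsFor(long target, double recordsPerSecond) {
        return target / recordsPerSecond;
    }

    public static void main(String[] args) {
        double rate = recordsPerSecond(10_000_000L, 70.0);  // single-thread result above
        double seconds = secondsFor(800_000_000L, rate);     // linear extrapolation
        System.out.printf("rate: about %.0f records/s%n", rate);
        System.out.printf("800 million records: about %.0f s (%.1f min)%n",
                seconds, seconds / 60.0);
    }
}
```

Under that assumption the rate is roughly 143,000 records per second, and 800 million records would take about 5,600 s, a little over an hour and a half.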

2.5 Multithreaded test with 10 million records

Two threads (each Jedis connection must be closed when its thread finishes):

@Test
void aaa() throws InterruptedException {
    Jedis jedis = new Jedis("192.168.44.101", 6379);
    Jedis jedis2 = new Jedis("192.168.44.101", 6379);
    jedis.flushDB();  // flushes the currently selected database (DB 0)
    jedis.select(3);  // note: only this connection switches to DB 3; jedis2 stays on DB 0
    Pipeline pipeline = jedis.pipelined();
    Pipeline pipeline2 = jedis2.pipelined();
    new Thread() {
        @Override
        public void run() {
            String key = "32021420001:90000300009999:10001:1601198414621:";
            long startTime = System.currentTimeMillis();
            // First 5 million keys.
            IntStream.range(0, 5000000).forEach(i -> pipeline.set(key + i, String.valueOf(i)));
            pipeline.syncAndReturnAll();
            long endTime = System.currentTimeMillis();
            jedis.close(); // release the connection once this thread is done
            System.out.println("total exec time: " + currentThread().getName() + ":" + (endTime - startTime));
        }
    }.start();
    new Thread() {
        @Override
        public void run() {
            String key = "32021420001:90000300009999:10001:1601198414621:";
            long startTime = System.currentTimeMillis();
            // Remaining 5 million keys.
            IntStream.range(5000000, 10000000).forEach(i -> pipeline2.set(key + i, String.valueOf(i)));
            pipeline2.syncAndReturnAll();
            long endTime = System.currentTimeMillis();
            jedis2.close();
            System.out.println("total exec time: " + currentThread().getName() + ":" + (endTime - startTime));
        }
    }.start();
    Thread.sleep(500000);
}

total exec time: Thread-2:65198
total exec time: Thread-3:65197

Five threads:

total exec time: Thread-2:41193
total exec time: Thread-6:42956
total exec time: Thread-4:44909
total exec time: Thread-5:44900
total exec time: Thread-3:44944

[Conclusion analysis]: Multithreading does reduce the insert time to some extent: from about 70 s with one thread to about 65 s with two threads and 41 to 45 s with five.
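
The multithreaded tests above copy the thread body by hand for each thread count. One possible way to generalize them is to split the key range into contiguous slices and hand each slice to a pool thread. In this sketch `RangeWriter` is a hypothetical callback standing in for the per-thread pipeline work (a real implementation would open its own Jedis connection and pipeline per call); it is not part of Jedis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

final class PartitionedInsert {

    /** Hypothetical callback: write the keys in [from, to) on its own connection. */
    interface RangeWriter {
        void write(int from, int to) throws Exception;
    }

    /** Splits [0, total) into `parts` contiguous slices of near-equal size. */
    static List<int[]> split(int total, int parts) {
        List<int[]> slices = new ArrayList<>();
        int base = total / parts;
        int extra = total % parts; // the first `extra` slices get one more key
        int from = 0;
        for (int p = 0; p < parts; p++) {
            int to = from + base + (p < extra ? 1 : 0);
            slices.add(new int[] {from, to});
            from = to;
        }
        return slices;
    }

    /** Runs one writer call per slice in parallel; returns elapsed milliseconds. */
    static long run(int total, int threads, RangeWriter writer) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.currentTimeMillis();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int[] slice : split(total, threads)) {
                pending.add(pool.submit(() -> {
                    writer.write(slice[0], slice[1]);
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // waits for the slice and rethrows any worker failure
            }
        } finally {
            pool.shutdown();
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) throws Exception {
        AtomicLong written = new AtomicLong();
        // Stand-in writer that only counts keys; a real one would pipeline SET commands.
        long millis = run(10_000_000, 5, (from, to) -> written.addAndGet(to - from));
        System.out.println(written.get() + " keys in " + millis + " ms");
    }
}
```

Waiting on the futures also removes the need for a fixed Thread.sleep() at the end of the test, since the method returns only when every slice has finished.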

2.6 Further thoughts

There is no performance gap between a for loop and IntStream.forEach(). Adding parallel() to the IntStream can speed things up by a factor that depends on the number of CPU cores. However, the operations performed inside a parallel forEach() are not thread-safe here, presumably because all worker threads share one pipeline. An out-of-bounds exception was thrown as soon as execution started, so I did not pursue this approach.

IntStream.range(start, end).parallel().forEach(...)
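
The failure described above is typical of a parallel stream writing into shared state that is not thread-safe. The sketch below reproduces the pattern without Redis: a plain `ArrayList` stands in for the shared pipeline (an assumption made for illustration, not the original test code), and the safe variant writes into a sink designed for concurrent use instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.IntStream;

final class ParallelForEachDemo {

    /** Unsafe: several threads call add() on one ArrayList at the same time. */
    static int unsafeCount(int n) {
        List<String> shared = new ArrayList<>();
        try {
            IntStream.range(0, n).parallel().forEach(i -> shared.add("key" + i));
        } catch (RuntimeException e) {
            // Often an ArrayIndexOutOfBoundsException caused by a concurrent resize.
            return -1;
        }
        return shared.size(); // may be smaller than n even when nothing is thrown
    }

    /** Safe: the sink supports concurrent writers. */
    static int safeCount(int n) {
        Queue<String> sink = new ConcurrentLinkedQueue<>();
        IntStream.range(0, n).parallel().forEach(i -> sink.add("key" + i));
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("unsafe result: " + unsafeCount(1_000_000));
        System.out.println("safe result:   " + safeCount(1_000_000));
    }
}
```

For Redis the equivalent would be to give each worker thread its own connection and pipeline rather than sharing one, which is what the multithreaded tests in section 2.5 do.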