I. The NewLife.Redis encapsulation architecture
NewLife.Redis is a complete implementation of the Redis protocol, but the core Redis functionality does not live there; it lives in NewLife.Core.
NewLife.Core contains a NewLife.Caching namespace with a Redis class that implements the basic Redis features, and a RedisClient class that acts as the Redis client.
These two classes carry the core functionality. A RedisClient represents one connection between the client and the server; in actual use, a connection pool inside Redis holds a number of RedisClient objects.
So the encapsulation has two layers: one is Redis and RedisClient in NewLife.Core; the other is NewLife.Redis, whose FullRedis class implements all of Redis's advanced features.
You can therefore also think of NewLife.Redis as an extension of the core Redis class.
II. Basic use of Redis, explained through the test examples
1. Examples
Open Program.cs and take a look at the code.
Here XTrace.UseConsole() writes logs to the console, which makes the results easy to inspect while debugging.
Let’s take a look at the first example, Test1, which I’ve commented out in the code:
On Set, Redis stores string or character data directly (strings are binary internally); values of any other type are serialized to JSON by default before being saved.
On Get, strings and characters come back directly; any other type is deserialized from JSON.
The third parameter of Set is the expiration time, in seconds.
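The Set/Get behavior just described can be sketched with a tiny stand-in cache. This is pure Python for illustration only, not the NewLife.Redis API; TinyCache and its method names are invented for the sketch, and a dict stands in for the Redis server:

```python
import json
import time

class TinyCache:
    """A dict-backed stand-in for a Redis string store (illustration only)."""
    def __init__(self):
        self._data = {}   # key -> (raw string, expire-at timestamp or None)

    def set(self, key, value, expire_seconds=None):
        # Strings are stored as-is; everything else is JSON-serialized first.
        raw = value if isinstance(value, str) else json.dumps(value)
        expire_at = time.time() + expire_seconds if expire_seconds else None
        self._data[key] = (raw, expire_at)

    def get(self, key, as_type=str):
        raw, expire_at = self._data[key]
        if expire_at is not None and time.time() > expire_at:
            del self._data[key]
            raise KeyError(key)
        # Strings come back directly; other types are JSON-deserialized.
        return raw if as_type is str else json.loads(raw)

cache = TinyCache()
cache.set("name", "Stone", expire_seconds=3600)   # stored directly
cache.set("user", {"id": 1234, "name": "Stone"})  # stored as JSON
print(cache.get("name"))                 # Stone
print(cache.get("user", as_type=dict))   # {'id': 1234, 'name': 'Stone'}
```

The point to notice is that only non-string values pay the serialization cost, and the expiration is attached per key at Set time.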
A Visual Studio debugging tip: pressing F5 or clicking Start on the toolbar compiles the whole solution, which is slow (the VS default). Instead, right-click the project and choose Debug -> Start New Instance; only the projects actually needed are compiled, which is much faster for debugging.
After starting the debugger, you can see the output in the console; those log lines are produced by the redirection ic.Log = XTrace.Log.
Dictionaries versus objects: an object must be fetched as a whole JSON blob and then deserialized, while a dictionary lets you fetch a single field directly.
Queues are implemented on the List structure. They are useful when upstream produces more data than downstream can handle: upstream pushes into the queue, and downstream consumes at its own pace. Another application is cross-language collaboration: a program written in one language fills the queue, and a program in another language consumes it. This is similar in concept to MQ, if a little low-tech, but still very practical.
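The producer/consumer pattern above can be mimicked with a plain deque standing in for the Redis List (illustration only, not the NewLife.Redis queue API):

```python
from collections import deque

# A deque standing in for a Redis List used as a queue.
# Producers push on the left (like LPUSH); consumers pop from the
# right (like RPOP), so the upstream can burst while the downstream
# drains items in arrival order at its own pace.
queue = deque()

# Upstream: a fast producer enqueues work items.
for order_id in range(5):
    queue.appendleft({"order": order_id})   # LPUSH

# Downstream: a slower consumer drains them first-in, first-out.
consumed = []
while queue:
    consumed.append(queue.pop())            # RPOP

print([item["order"] for item in consumed])  # [0, 1, 2, 3, 4]
```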
Sets are most often used for deduplication where an exact answer is required. For example, we receive 30 million orders a day, with duplicates among them, and want the total number of distinct orders. A database GROUP BY is impractical here because the data spans more than ten tables. Here is my practical experience:
When a merchant ships goods, an outlet must collect them, but beforehand the outlet does not know how many there are. So we built a feature: as each order reaches our company, we add it to a Set keyed by time_site. The Set deduplicates automatically, and Set.Count gives the total at any moment. When an item is collected, a background job removes it from the Set. The Set therefore always holds the items the outlet has not yet collected, and Count tells you how many remain today. In practice these numbers are large, because there are tens of thousands of outlets.
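The time_site counting scheme can be sketched with Python sets standing in for Redis Sets (illustration only; the key format and function names are invented for the sketch):

```python
# A dict of Python sets stands in for Redis Sets keyed by "{day}_{site}".
pending = {}  # key "20240101_site42" -> set of order numbers

def order_arrived(day, site, order_no):
    # Like SADD: duplicates are absorbed automatically, so repeated
    # notifications for the same order do not inflate the count.
    pending.setdefault(f"{day}_{site}", set()).add(order_no)

def order_collected(day, site, order_no):
    # Like SREM: remove the order once the outlet has picked it up.
    pending.get(f"{day}_{site}", set()).discard(order_no)

def uncollected_count(day, site):
    # Like SCARD: how many orders the outlet has not collected today.
    return len(pending.get(f"{day}_{site}", set()))

order_arrived("20240101", "site42", "A001")
order_arrived("20240101", "site42", "A001")   # duplicate, ignored
order_arrived("20240101", "site42", "A002")
order_collected("20240101", "site42", "A001")
print(uncollected_count("20240101", "site42"))  # 1
```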
Redis also offers Bloom filters for deduplication; they come up often in interviews.
A few small lessons from experience:
Handling invalid times in the database: check whether the year is greater than 2000 and treat anything earlier as invalid. Get into the habit of using greater-than/less-than comparisons rather than equality; they absorb a lot of unexpected data.
When calling Set, it is best to specify an expiration time, in case we forget to delete data that should be removed.
Avoid async for plain Redis calls where possible: Redis latency is tiny, around 100-200us, and Redis itself is single-threaded, so the cost of an async task switch can exceed the network time.
A usage pattern: in IoT scenarios with heavy upload volume, first push the incoming data into a Redis List, say 10,000 items per second, then take them out in batches and bulk-insert them into the database. Give the List a key such as a prefix plus a timestamp, so processed Lists can be removed.
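The buffer-then-bulk-insert pattern can be sketched as follows (pure Python; a deque stands in for the Redis List, and the batch size is an illustrative choice):

```python
from collections import deque

# Simulated ingestion buffer: devices push readings into a Redis List;
# a worker drains the List in fixed-size batches so the database sees
# a few bulk INSERTs instead of one INSERT per reading.
buffer = deque(range(25))   # 25 pending readings
BATCH_SIZE = 10             # illustrative batch size

batches = []
while buffer:
    batch = [buffer.popleft() for _ in range(min(BATCH_SIZE, len(buffer)))]
    batches.append(batch)   # one bulk INSERT per batch

print([len(b) for b in batches])  # [10, 10, 5]
```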
2. Stress test
Let’s look at the fourth example, where we do the stress test directly, with the following code:
The result is as follows:
The test exercises operations such as Get, Set, Remove, and Add. As you can see, I easily reached 600,000 ops on my machine, and over a million with multiple threads.
Why are the ops so high? The test parameters explain it:
Bench runs the add, delete, and modify stress tests in groups according to the thread count;
The Rand parameter controls whether keys/values are randomly generated;
Batch specifies the batch size: reads and writes are performed in batches, optimized via GetAll/SetAll.
3. Powerful Redis features that boost performance
If you have mastered the operations above, you know basic Redis; now proceed to the next step. Master this part too and you will be ahead of almost everyone else.
GetAll() and SetAll()
GetAll: say I want to fetch ten keys. With ten separate Gets, that is ten network round trips and roughly ten times the cost of a single Get. One GetAll fetches all ten keys in a single round trip and costs only a little more than one Get. Do the math: GetAll is highly recommended.
SetAll is the batch counterpart of GetAll: it sets key-value pairs in bulk.
The performance of SetAll and GetAll is formidable. Official figures put Redis at roughly 100,000 ops, so why do our tests easily reach 500,000 or even millions? Because we used SetAll and GetAll. Whenever you would Get or Set more than twice in a row, GetAll/SetAll is recommended.
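The round-trip arithmetic behind this recommendation can be made explicit with a back-of-the-envelope cost model. The numbers are illustrative assumptions, not measurements: 200 microseconds per network round trip and 10 microseconds of per-key server work:

```python
# Assumed, illustrative costs (not measured figures).
ROUND_TRIP_US = 200   # one network round trip
PER_KEY_US = 10       # per-key server-side work

def cost_single_gets(n_keys):
    # n separate GET commands: one round trip per key.
    return n_keys * (ROUND_TRIP_US + PER_KEY_US)

def cost_getall(n_keys):
    # One batched fetch: a single round trip carries all keys.
    return ROUND_TRIP_US + n_keys * PER_KEY_US

print(cost_single_gets(10))  # 2100 us: ten separate round trips
print(cost_getall(10))       # 300 us: one round trip for all ten keys
```

Under these assumptions the batched fetch is about 7x cheaper at ten keys, and the gap widens as the batch grows, which matches the "a few times the cost of one Get, not ten times" observation above.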
The Redis Pipeline
With a pipeline, commands that would otherwise be sent, say, ten times are packed into one payload, sent together, and executed as a batch. The API here is StartPipeline() and StopPipeline(); the code between the two calls is executed through the pipeline.
An even more powerful weapon, recommended here, is the AutoPipeline property. Once the number of buffered pipeline operations reaches the threshold, they are committed automatically (the default is 0, i.e. disabled). With AutoPipeline there is no need to mark the start and end with StartPipeline/StopPipeline.
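The auto-pipeline idea can be modeled in a few lines. This is a toy sketch of the buffering behavior only, not the NewLife.Redis implementation; the class and threshold are invented for illustration:

```python
class AutoPipeline:
    """Toy model of pipelining: commands are buffered and flushed in one
    network write once the buffer reaches a threshold."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.buffer = []
        self.flush_count = 0   # how many network round trips happened

    def send(self, command):
        self.buffer.append(command)
        if len(self.buffer) >= self.threshold:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flush_count += 1   # one round trip carries the whole batch
            self.buffer.clear()

pipe = AutoPipeline(threshold=100)
for i in range(250):
    pipe.send(("SET", f"key:{i}", i))
pipe.flush()                 # commit the trailing partial batch
print(pipe.flush_count)      # 3 round trips instead of 250
```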
Add and Replace
Add: writes the value only if the key does not already exist; if the key exists, it does not write and returns false;
Replace: writes the new value and returns the original one; if the key does not exist, nothing is replaced.
Add and Replace are the key to Redis distributed locks.
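The lock pattern built on Add (set-only-if-absent semantics) can be sketched against an in-memory stand-in for Redis. This is an illustration of the idea, not the NewLife.Redis lock implementation; the function names and TTL handling are invented for the sketch:

```python
import time

# A dict stands in for Redis; each entry records the lock owner and expiry.
store = {}  # key -> (owner, expire_at)

def add(key, owner, ttl_seconds):
    """Set-if-absent: return True only if the key did not exist or had expired."""
    entry = store.get(key)
    if entry is not None and entry[1] > time.time():
        return False            # someone else holds the lock
    store[key] = (owner, time.time() + ttl_seconds)
    return True

def release(key, owner):
    # Only the holder may release, so a stale worker cannot unlock others.
    if key in store and store[key][0] == owner:
        del store[key]

print(add("lock:order:42", "worker-A", ttl_seconds=30))  # True: acquired
print(add("lock:order:42", "worker-B", ttl_seconds=30))  # False: held by A
release("lock:order:42", "worker-A")
print(add("lock:order:42", "worker-B", ttl_seconds=30))  # True: acquired
```

The TTL is what makes the lock safe: if the holder crashes, the key expires and other workers can acquire the lock again.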
III. Redis usage tips and experience
The project's Readme covers this; here is an excerpt:
1. Features
ZTO uses it widely in real-time big-data computing. More than 200 Redis instances have run stably for over a year, processing nearly 100 million parcel records and serving 8 billion calls every day.
Low latency: Get/Set operations average 200-600us, including the network round trip.
High throughput: a built-in connection pool supports up to 1,000 concurrent connections;
High performance: binary serialization is supported (JSON is the default, but JSON is inefficient; switching to binary improves performance considerably).
2. Redis experience sharing
On Linux, run as many instances as there are processor cores, and cap each instance's maximum memory relative to the local physical memory, to avoid a single instance overflowing memory (for example, deploy 8 instances on an 8-core machine).
Store massive data (1 billion+ records) across multiple instances by hashing the key (Crc16/Crc32); read/write performance grows multiplicatively.
Use binary serialization instead of the usual JSON serialization.
Design the Value size of each key-value pair sensibly, including (but not limited to) when using batch fetches. The guiding principle is to keep each network packet around 1.4KB and reduce the number of round trips (in practice, tens of KB works fine, and even hundreds of KB is no problem).
Use the 200-600us average Get/Set time (including the network round trip) as a yardstick for evaluating the network environment and the Redis client component; if you fall short, look at the network, serialization, and so on.
Merge a batch of commands using a Pipeline.
Redis's main performance bottlenecks are serialization, network bandwidth, and memory size; under abuse, the processor can become a bottleneck too.
Other general optimization techniques apply as well.
The experience above comes from more than 300 instances with over 4TB of space running stably for more than a year. The points are ordered by importance; apply them as your scenario requires.
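The key-hash sharding point above can be sketched concretely. The readme mentions Crc16/Crc32; here Python's stdlib zlib.crc32 stands in, and the instance addresses are invented for illustration:

```python
import zlib

# Illustrative shard list; in practice these are real Redis endpoints.
INSTANCES = ["redis-0:6379", "redis-1:6379", "redis-2:6379", "redis-3:6379"]

def pick_instance(key):
    # Hash the key and take it modulo the shard count. The same key
    # always lands on the same instance, so reads find what writes
    # stored, and load spreads roughly evenly across shards.
    return INSTANCES[zlib.crc32(key.encode()) % len(INSTANCES)]

assignments = {}
for i in range(1000):
    shard = pick_instance(f"order:{i}")
    assignments[shard] = assignments.get(shard, 0) + 1

print(pick_instance("order:1") == pick_instance("order:1"))  # True: stable
print(sorted(assignments))   # shards that received traffic
```

Note that this simple modulo scheme reshuffles most keys when the instance count changes; consistent hashing is the usual refinement if shards are added or removed often.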
3. Redis's sibling, MemoryCache
Redis implements the ICache interface; its twin, MemoryCache, is an in-memory cache with throughput at the tens-of-millions level.
It is strongly recommended that applications code against the ICache interface and use MemoryCache for small data; once the data grows (to around 100,000 items), switch the implementation to Redis without changing any business code.
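The swap-the-implementation idea behind ICache can be sketched in Python. The interface and names below are invented stand-ins, not the NewLife.Caching API; the point is only that business code depends on the abstraction, not the backend:

```python
from abc import ABC, abstractmethod

class Cache(ABC):
    """Minimal cache interface; business code sees only this."""
    @abstractmethod
    def set(self, key, value): ...
    @abstractmethod
    def get(self, key): ...

class MemoryCache(Cache):
    """Dict-backed in-memory implementation for small data."""
    def __init__(self):
        self._data = {}
    def set(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data.get(key)

def count_visit(cache: Cache, user):
    # Business logic written against the interface only.
    visits = (cache.get(f"visits:{user}") or 0) + 1
    cache.set(f"visits:{user}", visits)
    return visits

cache = MemoryCache()   # later: a Redis-backed Cache, same call sites
print(count_visit(cache, "stone"))  # 1
print(count_visit(cache, "stone"))  # 2
```

Because count_visit only touches the Cache interface, replacing MemoryCache with a Redis-backed implementation changes one constructor call and no business code.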
IV. Answers to reader questions
In this part we discuss experience with Redis on big data:
Q1: How do I set multiple keys for one piece of data?
A1: If the performance requirements are not extreme, just serialize the entity as JSON; there is no need to store it as a dictionary.
Q2: What is the difference between a queue and a List? For left in, right out, should I use a List or a queue?
A2: Queues are implemented on top of Lists; they are an encapsulation of the List structure. Left in, right out is simply a queue. Redis's List is interesting: it supports both left in, right out and right in, left out, so the one structure can implement a list, a queue, and a stack.
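The two access patterns in A2 can be shown side by side with a deque standing in for the Redis List (illustration only):

```python
from collections import deque

# A Redis List can be pushed/popped at either end, so the same structure
# serves as a queue (left in, right out) or a stack (left in, left out).
fifo, lifo = deque(), deque()
for i in (1, 2, 3):
    fifo.appendleft(i)   # LPUSH
    lifo.appendleft(i)   # LPUSH

queue_order = [fifo.pop() for _ in range(3)]      # RPOP: first in, first out
stack_order = [lifo.popleft() for _ in range(3)]  # LPOP: last in, first out

print(queue_order)  # [1, 2, 3]
print(stack_order)  # [3, 2, 1]
```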
Q3: Does a class holding many fields perform just as well?
A3: In most scenarios there is no measurable difference; with very large data volumes, as in large companies, some deviation may appear.
Q4: After big data is written to the database, say more than 100 million rows, could you share some experience on statistical analysis and queries?
A4: Shard by table and by database, splitting until each table holds fewer than 10 million rows.
Q5: Why does CPU usage spike?
A5: The programmer's ultimate philosophy: drive the CPU to 100% and you get maximum performance with nothing wasted. The worst case is the opposite: if the CPU is below 100% and performance still cannot improve, there is a problem in the code.
Although we know how to use Redis, we may not encounter such big-data scenarios in everyday work. I hope this article gives you some experience worth drawing on.