Introduction

Hello everyone, I am South Orange. It has been almost two years since I first came into contact with Java. In those two years I have gone from a complete beginner who didn't even understand basic data structures to a slightly more advanced beginner, and I have learned a lot along the way. Knowledge becomes more valuable the more it is shared, so I have summarized (including things learned and quoted from other experts) some of the points I consider important in everyday study and interviews, and I hope it brings some help to everyone.

Before this article begins, I would like to thank one person: the Third Prince, Aobin, whose articles made me realize how colorful Redis knowledge can be. I read his articles every week.

Here is the mind map for this article. Because it was made with the free version of the software there are many watermarks; you can ask me for the original version.

If you need it, you can follow my public account, where new articles always appear first, and you can also ask me for the mind map. Because of time and length constraints, I couldn't finish the Redis series in one go, so it is split into two parts. Friends who haven't read the first half can go read it first:

  • Some overlooked points in the index
  • Redis basic knowledge of two articles to meet (1)
  • Redis basic knowledge to meet two (2)
  • Everything — Locks in JAVA
  • Conquering the JVM — JVM Objects and object access location (Part 1)
  • Conquering the JVM — Garbage collection for the JVM (Part 2)
  • Conquering the JVM — The JVM’s Garbage collector (Part 3)

1. Redis basics

To learn Redis, you first need to know its five basic data structures. Let's start by introducing the usage scenarios of these five data structures and some points that often come up at work (and in interviews).

1.1 String: the value is a string

String is the most basic data type in Redis: one key corresponds to one value.

Application:

1. Cache: the classic usage scenario. Common information such as strings, pictures, or videos is put into Redis; Redis serves as the cache layer and MySQL as the persistence layer, reducing the read and write pressure on MySQL.
2. Counter: Redis is a single-threaded model, so one command finishes before the next is executed, and the counts can then be flushed to other data sources asynchronously.
3. Session: session sharing can be implemented through Redis.
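To make the cache scenario concrete, here is a minimal cache-aside sketch. It is a stand-in only: plain HashMaps play the roles of the Redis client and of MySQL so that it runs without any server, and the class and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Cache-aside sketch: one HashMap stands in for Redis, another for MySQL.
// Reads hit the cache first; on a miss we fall back to the database
// and populate the cache for next time.
public class CacheAside {
    private final Map<String, String> cache = new HashMap<>();    // "Redis"
    private final Map<String, String> database = new HashMap<>(); // "MySQL"
    public int dbReads = 0; // counts how often the persistence layer is hit

    public CacheAside(Map<String, String> seed) { database.putAll(seed); }

    public Optional<String> get(String key) {
        String v = cache.get(key);
        if (v != null) return Optional.of(v);  // cache hit: MySQL untouched
        dbReads++;
        v = database.get(key);                 // cache miss: read through
        if (v != null) cache.put(key, v);      // populate the cache layer
        return Optional.ofNullable(v);
    }
}
```

Reading the same key twice hits the persistence layer only once; the saved read is exactly the pressure Redis takes off MySQL.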

1.2 Hash

It is just like Java's HashMap: itself a key-value pair structure, e.g. value = {{field1, value1}, ..., {fieldN, valueN}}. Very easy to understand.

Application:

As a cache, a Hash saves more space than a String when maintaining cached information, and is suitable for storing user information, video information, and so on.

The underlying implementation is a dict dictionary

1.3 List

Redis's List is the equivalent of Java's LinkedList.

Application:

1. In Redis, a List can be used as a queue or a stack. Its insertion and deletion operations are very fast, with time complexity O(1), but index positioning is slow, with time complexity O(n).
2. It can serve as a Weibo-style timeline: when someone posts, lpush adds the post to the timeline, showing the newest entries first.
3. It can implement a blocking queue: a left-in, right-out queue completes the design.

The List's low-level implementation is a quicklist (fast linked list).

In earlier designs, a ziplist was used when the list object had few and small elements, and a linkedlist was used when it had many elements or large ones.

Both methods of storage have their advantages and disadvantages

  • A doubly linked list is convenient for push and pop operations at both ends, and the complexity of inserting nodes is low, but its memory overhead is large. First, each node holds two pointers in addition to the data; second, each node is a separate memory block at a non-contiguous address, so many nodes easily produce memory fragmentation.
  • A ziplist is stored in one contiguous piece of memory, so it is very space-efficient. However, it is not good for modification: insertion and deletion require frequent memory allocation and release, and when the ziplist is long, a single realloc may cause a large amount of data to be copied.

After the 3.2 update, the underlying implementation of List became quickList

A quicklist is a hybrid of ziplist and linkedlist: it splits the linked list into segments, stores each segment compactly as a ziplist, and links the multiple ziplists together with bidirectional pointers.

1.4 Set

The Redis Set is the equivalent of Java's HashSet; its elements are unordered and unique.

Its internal implementation is equivalent to a special dictionary in which every value is NULL. When the last element of the set is removed, the data structure is deleted automatically and its memory reclaimed.

Application:

1. Tags: add tags to users (or let users tag content), so that people with the same or similar tags can be recommended things or people to follow.
2. Storing the user IDs of prize winners in an event: thanks to deduplication, the same user cannot win twice.

1.5 Zset (sorted set)

It is similar to a combination of Java's SortedSet and HashMap: on one hand it is a set, guaranteeing the uniqueness of its values; on the other, it assigns each value a score, which represents that value's sorting weight. Its internal implementation uses a data structure called a skip list.

As last time, I have borrowed a picture from one of the experts to show you what a skip list is.

From the picture we can see that the bottom layer of a skip list is an ordered linked list. At each level above it, every other node has a pointer skipping ahead, and the layers build up recursively. The design resembles a tree structure, allowing searches over the linked list to approach binary-search time complexity.

A skiplist does not require a strict correspondence between the node counts of adjacent levels; instead, it randomly picks a level for each node. For example, if the third node in the figure randomly draws level 4, it is inserted into the layers up to the fourth; if the fourth node draws level 1, it exists only in the first layer.
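The idea can be sketched in a few dozen lines. This is an illustration only, not Redis's actual zskiplist: each inserted node draws a random level by coin flips, and search descends from the top level, which is what gives the expected O(log n) behaviour.

```java
import java.util.Random;

// Minimal skip-list sketch (illustration only, not Redis's zskiplist).
// Higher levels skip over more nodes, so search descends level by level.
public class SkipList {
    static final int MAX_LEVEL = 8;

    static class Node {
        final int value;
        final Node[] next;
        Node(int value, int level) { this.value = value; this.next = new Node[level]; }
    }

    private final Node head = new Node(Integer.MIN_VALUE, MAX_LEVEL);
    private final Random rnd = new Random();

    // Coin-flip level choice: keep promoting with probability 1/2.
    private int randomLevel() {
        int level = 1;
        while (level < MAX_LEVEL && rnd.nextBoolean()) level++;
        return level;
    }

    public void insert(int value) {
        Node[] update = new Node[MAX_LEVEL];
        Node cur = head;
        for (int i = MAX_LEVEL - 1; i >= 0; i--) {   // descend, recording the
            while (cur.next[i] != null && cur.next[i].value < value) cur = cur.next[i];
            update[i] = cur;                          // last node before the insert point
        }
        Node node = new Node(value, randomLevel());
        for (int i = 0; i < node.next.length; i++) {  // splice into each chosen level
            node.next[i] = update[i].next[i];
            update[i].next[i] = node;
        }
    }

    public boolean contains(int value) {
        Node cur = head;
        for (int i = MAX_LEVEL - 1; i >= 0; i--) {
            while (cur.next[i] != null && cur.next[i].value < value) cur = cur.next[i];
        }
        return cur.next[0] != null && cur.next[0].value == value;
    }
}
```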

  • When there is little data, the Zset is implemented with a ziplist, just as List used to be

A ziplist is a sequential storage structure composed of a series of specially encoded, contiguous memory blocks. Like an array, a ziplist is stored contiguously in memory; unlike an array, each element may occupy a different amount of memory, in order to save space.

The ziplist records the necessary offset information in each node to enable it to jump to the previous or next node.

  • When there is a lot of data, the Zset is implemented with a dict plus a skiplist: the dict answers lookups from member to score, and the skiplist supports queries by score

Besides these five basic data structures, Redis has more specialized ones such as HyperLogLog, Geo, Pub/Sub, Pipeline, and BloomFilter, each useful in a different place. Some of them I introduce below; others I don't know in enough depth, and you can go to Aobin's articles to learn more.

Pipeline

If the commands executed through a pipeline have no causal relationship with one another, multiple IO round trips can be reduced to one.

A pipeline sends multiple commands at once and returns the results after they have all executed. By reducing the number of round trips between the client and Redis, a pipeline cuts the round-trip latency. Moreover, the pipeline is implemented as a queue, and a queue is first-in first-out, which keeps the replies in order.

Note: the pipeline mechanism optimizes throughput, but it provides no atomicity or transaction guarantees.
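A toy model of why this helps (no real Redis client involved; the class is invented for illustration): count network round trips explicitly. Three commands sent one by one pay three round trips; pipelined, they pay one, and the replies come back in FIFO order.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of pipelining: every "network" round trip is counted,
// and a pipeline pays that cost once for a whole batch of commands.
public class PipelineDemo {
    int roundTrips = 0;

    // one command per round trip
    String execute(String command) {
        roundTrips++;
        return "OK:" + command;
    }

    // many commands, still one round trip; replies stay in FIFO order
    List<String> executePipelined(List<String> commands) {
        roundTrips++;
        List<String> replies = new ArrayList<>();
        for (String c : commands) replies.add("OK:" + c);
        return replies;
    }
}
```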

2. Advanced knowledge

2.1 Redis implements distributed locking

With the rapid development of Internet technology, more and more monolithic architectures have been transformed into distributed architectures. Distributed architectures do bring gains in performance and efficiency, but they also bring data-consistency problems.

A distributed lock is the weapon of choice for solving data consistency in a distributed architecture. A distributed lock needs to satisfy the following three requirements before it can be used safely:

  • Exclusivity: only one client can acquire the lock at any moment; no other client can hold it at the same time

  • Deadlock avoidance: the lock must be released within a bounded time (whether the holder exits normally or abnormally)

  • High availability: The mechanism for acquiring or releasing locks must be highly available and perform well

As far as I know, there are roughly three mainstream ways to implement a distributed lock: ZooKeeper, Redis, and the database. Today I will introduce how to implement a distributed lock with Redis.

A Redis-based locking mechanism mainly depends on the atomic operations Redis itself provides.

setnx competes for the lock, and expire adds an expiration time.

If you are worried the client could crash between setnx and expire (leaving a lock that never expires), you can combine them into a single atomic instruction using the set command's NX and EX options, which both Jedis and RedisTemplate wrap.
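As a runnable illustration of the protocol (not real Jedis code), here is a stand-in where a ConcurrentHashMap plays the part of the Redis server: acquire() mimics an atomic set-if-absent with a TTL, and release() mimics the value-checked delete that, against real Redis, must be done in a Lua script to stay atomic. All names here are invented.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Lock-protocol sketch: a ConcurrentHashMap stands in for Redis.
// acquire() mimics "SET key <random> NX PX <ttl>"; release() deletes
// the key only if the caller still owns it (the random-value check).
public class ToyRedisLock {
    private static class Entry {
        final String owner;
        final long expiresAt;
        Entry(String owner, long expiresAt) { this.owner = owner; this.expiresAt = expiresAt; }
    }

    private final Map<String, Entry> store = new ConcurrentHashMap<>();

    /** Returns a token (the random value) on success, or null if the lock is held. */
    public String acquire(String key, long ttlMillis) {
        String token = UUID.randomUUID().toString();
        long now = System.currentTimeMillis();
        // atomically take the lock if it is absent or already expired
        Entry cur = store.compute(key, (k, e) ->
            (e == null || e.expiresAt <= now) ? new Entry(token, now + ttlMillis) : e);
        return cur.owner.equals(token) ? token : null;
    }

    /** Only the holder of the token may delete the key. */
    public boolean release(String key, String token) {
        Entry e = store.get(key);
        if (e == null || !e.owner.equals(token)) return false; // not ours: leave it
        return store.remove(key, e);
    }
}
```

The value check in release() is the important part: without it, a client whose lock already expired could delete a lock now held by someone else.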

However, such a distributed lock is still not completely safe.

First, the single point of failure is unavoidable. Second, the client using the lock is not on the same machine as the Redis server, so network delay skews time: we can only query the lock's remaining time with Redis's TTL command and judge from that whether the lock has timed out. In ordinary computer systems, however, reliable time is hard to come by:

  • the system may resynchronize its clock with a time server,
  • a VM may adjust its clock,
  • a JVM GC pause can make time appear to stop

So what can we do?

In fact, for scenarios that don't demand such strong safety guarantees, this is already good enough. It is at least what my current company (a top third-party payment company) uses. But striving for excellence is in a programmer's nature, and RedLock came along to solve this problem to some extent.

I don't know RedLock deeply, so I can only give you a brief introduction. The process is as follows:

  1. The client gets the current time and generates a random value to use as the lock's value (the random value is what makes releasing the lock precise).
  2. It tries to acquire the same lock on all 5 Redis instances in sequence (using the same key and the same random value, much like the single-Redis lock).

The per-instance acquisition should use a relatively small timeout (e.g. 5-50 ms) to avoid wasting too much time on a dead Redis; if one instance is unavailable, move on to the next as soon as possible.

  3. The client calculates how long acquisition took by subtracting the time obtained in step 1 from the current time.

If the client acquired the lock on a majority of the instances (at least 3 of the 5) without exceeding the lock's timeout, the lock is considered successfully acquired.

  4. If the lock was acquired successfully, its validity time is the lock timeout minus the time spent acquiring it.
  5. If acquisition failed, the client tries to unlock on all the Redis instances.

(The unlock operation is a small piece of Lua script that deletes a key only if its value equals the random value generated in step 1.)
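The arithmetic in those steps can be sketched as a small helper. This illustrates only the bookkeeping (clock-drift compensation is omitted), and `RedLockMath` is an invented name, not a real API:

```java
// RedLock bookkeeping sketch: given how many of the N instances granted
// the lock and how long that took, decide whether the lock counts as
// acquired and how long it remains valid. Returns -1 on failure, in
// which case the client should unlock everywhere.
public class RedLockMath {
    public static long validityMillis(int instances, int granted,
                                      long ttlMillis, long elapsedMillis) {
        int quorum = instances / 2 + 1;             // majority, e.g. 3 of 5
        long remaining = ttlMillis - elapsedMillis; // time left on the lock
        if (granted < quorum || remaining <= 0) return -1;
        return remaining;
    }
}
```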

Of course, even RedLock doesn't solve the problem completely; but a Redis lock fails only in extreme cases, so if you don't need perfect precision and just want the best reliability you can reasonably get, a Redis cluster or RedLock will do.

2.2 Redis cluster

That settles distributed locking, but a single instance still can't survive natural and man-made disasters, so a Redis cluster is needed.

Cluster synchronization mechanism

A primary node corresponds to one or more secondary nodes. The primary provides data access; the secondaries pull data from the primary as backups. When the primary fails, one of the secondaries is chosen as the new primary so that the cluster keeps working. First, the Redis master-slave synchronization process:

  • 1. During the first synchronization, the secondary server sends the SYNC command to the primary. After receiving it, the primary runs bgsave to produce an RDB file and records subsequent modifications in an in-memory buffer
  • 2. After receiving the RDB snapshot, the secondary loads it into memory and notifies the primary once loading is complete
  • 3. Subsequent incremental data is synchronized through the AOF log (a stream of write commands), similar to the binlog of a database

Also, since version 2.8, Redis can automatically decide whether full or incremental synchronization is needed, which is more efficient. For incremental synchronization: when replication restarts after a full sync, the secondary sends the primary a PSYNC &lt;runid&gt; &lt;offset&gt; command (runid being the ID of the primary it last replicated, and offset its replication offset). The primary uses these two parameters to decide which kind of synchronization to perform: is the runid the same as its own, and is the replication offset still within its buffer?

High availability of the cluster

  • Redis Sentinel clusters focus on high availability: if the master goes down, a slave is automatically promoted to master and service continues
  • Redis Cluster focuses on scalability: when a single Redis runs short of memory, a Cluster is used for sharded storage

There is more to these clusters than fits in one section; I'll cover them in a future article.

2.3 Asynchronous queues

Redis's day job is caching, but because of its versatility it can also act as a queue. It has blocking APIs that let it queue messages, and other message-queue properties such as FIFO (first in, first out) are easy to implement: you only need a List object, popping data from the head and pushing data in at the tail.

  • In Redis, when the List structure is used as a queue, rpush produces messages and lpop consumes them; when lpop finds no message, the consumer can sleep and retry later. This is the producer-consumer pattern. The List also has a command, blpop, which blocks when there is no message until one arrives.
  • The pub/sub (topic-subscriber) pattern can implement a 1-to-N message queue: produce once, consume many times. But it has a drawback: with pub/sub, if a consumer is offline, the message is lost; in that case use a proper MQ instead.
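The List-as-queue idea can be mirrored locally with a Deque (a pure analogy; real code would issue RPUSH/LPOP, or BLPOP for blocking, against a Redis server, and the class name here is invented):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Local analogy for the List-as-queue pattern: rpush appends at the
// tail, lpop removes from the head, which yields FIFO order.
public class ListQueue {
    private final Deque<String> list = new ArrayDeque<>();

    public void rpush(String msg) { list.addLast(msg); }  // producer side
    public String lpop() { return list.pollFirst(); }     // consumer side; null when empty
}
```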

2.4 Delay queue

In Redis, a sorted set (zset) can be used as a delay queue:

zadd key score1 value1 score2 value2

  • score indicates the execution time, key is the queue name, and value is the data
  • The consumer loops, fetching from the sorted set (zrangebyscore) the entries whose score is less than or equal to the current time
  • If no data is due yet, it sleeps for a moment

However, Redis's delay queue does not provide an ACK, so you need to implement acknowledgment yourself.
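The consumer loop above can be simulated without a server by using a TreeMap keyed by score (the planned execution time); poll() plays the role of zrangebyscore plus removal. The names here are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sorted-set delay queue, simulated with a TreeMap keyed by score.
// poll(now) returns and removes every task whose score is <= now;
// a real consumer would loop, sleeping briefly when nothing is due.
public class DelayQueueSketch {
    private final TreeMap<Long, List<String>> zset = new TreeMap<>();

    // like ZADD: store the value under its execution-time score
    public void zadd(long score, String value) {
        zset.computeIfAbsent(score, k -> new ArrayList<>()).add(value);
    }

    /** Remove and return every task whose score is <= now. */
    public List<String> poll(long now) {
        List<String> due = new ArrayList<>();
        Map<Long, List<String>> ready = zset.headMap(now, true); // scores <= now
        for (List<String> tasks : ready.values()) due.addAll(tasks);
        ready.clear(); // consuming the view removes the due tasks from the zset
        return due;
    }
}
```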

2.5 Persistence

Redis has two types of persistence: RDB and AOF

RDB does full snapshot persistence; AOF does incremental persistence. Because an RDB snapshot takes a long time and is not real-time enough, a lot of recent data would be lost if the Redis instance went down, so AOF needs to be used as well. When a Redis instance restarts, it uses the RDB file to rebuild memory, then replays the recent operation commands from the AOF to fully restore the pre-restart state.

RDB mechanism

RDB persistence writes a snapshot of the in-memory data set to disk at specified intervals; the data in memory is written as a snapshot to a binary file, named dump.rdb by default. RDB provides three mechanisms to trigger persistence:

  • 1. save: the client initiates a save request

    While the save command runs, Redis cannot process any other commands until the RDB process is complete

  • 2. bgsave: the client initiates a bgsave request

    When this command is executed, Redis performs the snapshot asynchronously in the background and can still respond to client requests while snapshotting

  • 3. Automatic trigger

    Automatic triggering is driven by the configuration file, redis.conf; you can look at the options there yourself, so I won't write them all out here
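For reference, the automatic-trigger rules in redis.conf look like the following (these are the long-standing default thresholds; check your own redis.conf for the values actually in effect):

```conf
# Trigger a BGSAVE automatically when, within the given window,
# at least this many keys have changed:
save 900 1      # after 900 s (15 min) if >= 1 key changed
save 300 10     # after 300 s if >= 10 keys changed
save 60 10000   # after 60 s if >= 10000 keys changed
```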

AOF mechanism

A full backup is always time-consuming, so a more efficient method is available: AOF. With AOF, Redis appends every write command it receives to a file via a write call, i.e. logging.

Like RDB, AOF has three synchronization mechanisms:

  • 1. always: synchronous persistence; every data change is immediately written to disk. Poor performance but the best data integrity

  • 2. everysec: sync once per second, asynchronously; if the machine goes down within that second, up to one second of data is lost

  • 3. no: never fsync; when to write is left to the operating system
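In redis.conf these three policies correspond to the appendfsync directive (shown here with AOF switched on; adjust to taste):

```conf
appendonly yes        # turn AOF on (it is off by default)
appendfsync everysec  # one of: always | everysec | no
```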

Redis itself prefers AOF for recovery: if an AOF file exists, it is loaded first; if not, the RDB file is loaded. Once the AOF/RDB file is loaded successfully, Redis starts; if the AOF/RDB file is corrupted, Redis fails to start and prints an error message.

Don't ask whether to use AOF or RDB; in my experience, use both. RDB synchronization is fast, but up to five minutes of content can be lost. AOF synchronization is slower, but at most one second of content is lost, and lost content can also be recovered from logs.

The impact of machine power failure on data loss

If performance did not matter, you could sync to disk on every write command and never lose data; but since performance does matter, syncing on every write is impractical. In general, periodic sync is used, for example once every second.

Conclusion

Emmmm, the second article is finally, slowly, finished. While writing it I really found that the more I know, the more I realize I don't understand. And because the company required weekend overtime, the Redis series had to be split into two parts to complete. I hope you like it! If you need the mind map, contact me; after all, the more knowledge is shared, the more valuable it gets!