What data types does Redis support? What are the application scenarios?

Redis supports five data types for its values; Redis keys are always strings.

  • String: a Redis string value can hold up to 512 MB. Commonly used for caching and counters (it is the most frequently used type in practice).
  • List: a simple list of strings, sorted by insertion order; elements can be pushed to the head (left) or tail (right). The underlying implementation is a linked list. It can implement a simple message queue, or Redis-based pagination.
  • Set: an unordered collection of strings. Can be used for global deduplication, set operations, and so on.
  • Sorted set: an ordered collection of strings in which each element is assigned a score that keeps the set ordered. Useful for leaderboards and range queries.
  • Hash: a collection of field-value pairs; that is, the value stored under a key is itself a mapping of string fields to string values. Useful for storing objects with a known structure.

In fact, Redis also supports three special data types: BitMap, Geo, and HyperLogLog.

Generally speaking, Redis supports the five data types above; its underlying data structures include simple dynamic strings (SDS), linked lists, dictionaries (hash tables), skip lists, integer sets, and compressed lists (ziplists).
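As an illustration of the Sorted set ranking use case, here is a minimal Python sketch in which a plain dict stands in for Redis; the `zadd`/`zrevrange` names merely mimic the real ZADD/ZREVRANGE commands, not any client library API:

```python
# Toy simulation of a Sorted set leaderboard; a dict stands in for Redis.
def zadd(zset, score, member):
    zset[member] = score  # like ZADD, overwrites the member's score

def zrevrange(zset, start, stop):
    # Members ordered by score, highest first (like ZREVRANGE), stop inclusive
    ranked = sorted(zset, key=lambda m: zset[m], reverse=True)
    return ranked[start:stop + 1]

board = {}
zadd(board, 100, "alice")
zadd(board, 250, "bob")
zadd(board, 180, "carol")
print(zrevrange(board, 0, 1))  # top two players -> ['bob', 'carol']
```

Real Redis keeps this ordering incrementally with a skip list plus a dictionary, so range queries do not re-sort the whole set.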

Is Redis single threaded? Why is it so fast?

Redis is single-threaded in the sense that its network request handling and command execution run on a single thread, so individual commands need no concurrency control. But compound operations built from multiple commands still require locks, possibly distributed ones.

So why does single-threaded Redis execute so fast?

  • Memory-based implementation: all computation happens in memory
  • Single-threaded command execution avoids the cost of thread context switches
  • I/O multiplexing: one thread monitors multiple I/O streams and responds to requests promptly
  • Redis is a lightweight in-memory database with few external dependencies

Redis's threading model and its I/O multiplexing mechanism are important and frequently examined topics. System calls that support I/O multiplexing include select, pselect, poll, epoll, and others. I/O multiplexing is a mechanism by which one process can monitor multiple file descriptors; when a descriptor becomes ready for reading or writing, the kernel notifies the application so it can perform the corresponding read or write.

Compared with multi-process and multi-thread approaches, I/O multiplexing does not require the system to create or maintain extra processes or threads, which greatly reduces overhead.

The features of common functions are as follows:

  • The select function:
    • Modifies the array of file descriptors passed in, which is unfriendly for a function that must be called repeatedly.
    • Can monitor at most 1024 descriptors.
    • When data arrives on any socket (I/O stream), select only reports that something is ready, not which descriptors are ready, so the caller must scan them all.
    • Thread unsafe: "If a file descriptor being monitored by select() is closed in another thread, the result is unspecified."
  • The poll function:
    • Removes the 1024 limit.
    • No longer modifies the array of descriptors passed in.
    • Still thread unsafe.
  • The epoll function:
    • Reports not only that descriptors in the set are ready, but exactly which ones, so no full scan is needed.
    • Thread safe.
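Python's standard `selectors` module wraps exactly this family of system calls (it picks epoll, kqueue, or select depending on the platform), so the one-thread-many-streams idea can be demonstrated without any external dependency:

```python
import selectors
import socket

# One thread monitors several sockets; DefaultSelector uses epoll on Linux.
sel = selectors.DefaultSelector()
pairs = [socket.socketpair() for _ in range(3)]
for i, (r, w) in enumerate(pairs):
    r.setblocking(False)
    sel.register(r, selectors.EVENT_READ, data=i)  # data tags the socket

pairs[1][1].send(b"ping")      # only socket 1 has data
events = sel.select(timeout=1)  # returns exactly the ready descriptors
for key, _ in events:
    msg = key.fileobj.recv(16)
    print("socket", key.data, "is ready:", msg)
```

Note how the selector reports precisely which registered socket is readable, the epoll-style behavior described above, rather than forcing a scan of all three.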

Caching

Cache avalanche

Take system A, with a daily peak of 5,000 requests per second. The cache could absorb 4,000 of those requests per second at peak, but the cache machines unexpectedly go down completely. Now all 5,000 requests per second hit the database. The database cannot handle that load, raises alarms, and goes down too. If no special measures are in place, the DBA frantically restarts the database, only to have it immediately killed again by the incoming traffic.

This is cache avalanche.

The solution to cache avalanche covers three phases: before, during, and after the failure.

Before: make Redis highly available (master-slave + Sentinel, or Redis Cluster) to avoid a total crash. During: use a local EhCache cache plus Hystrix rate limiting and degradation to keep MySQL from being killed. After: rely on Redis persistence so that, once restarted, Redis automatically reloads data from disk and the cache recovers quickly.

The user sends a request. On receiving it, system A checks the local EhCache first; on a miss it checks Redis. If the value is in neither EhCache nor Redis, it queries the database and writes the result back to both EhCache and Redis.
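That read path can be sketched as follows (a simulation only: plain dicts stand in for EhCache, Redis, and the database, and `read` is a hypothetical helper, not a client API):

```python
# Sketch of the read path: local cache -> Redis -> database.
def read(key, ehcache, redis, database):
    if key in ehcache:
        return ehcache[key]
    if key in redis:
        ehcache[key] = redis[key]   # backfill the local cache
        return redis[key]
    value = database.get(key)       # last resort: hit the database
    if value is not None:
        redis[key] = value          # write the result back to both caches
        ehcache[key] = value
    return value

eh, rd, db = {}, {}, {"user:1": "alice"}
print(read("user:1", eh, rd, db))  # miss -> db; both caches get populated
print(read("user:1", eh, rd, db))  # now served from the local EhCache layer
```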

A rate-limiting component lets you set how many requests per second are allowed through. What about the requests that do not pass? Degrade: return a default value, a friendly reminder, or a blank value.

Benefits:

  • The database never dies, because the rate limiter ensures that only as many requests as it can handle pass through each second.
  • As long as the database stays alive, at least some of the user's requests can be processed.
  • As long as some requests are processed, the system is still up. A user may need a few extra clicks or refreshes, but eventually gets a page.
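A minimal token-bucket limiter illustrates the idea; in production you would use a component such as Hystrix, but the core logic is just this (a sketch, with `rate` in permits per second):

```python
import time

# Minimal token bucket: at most `rate` requests pass per second;
# the rest are rejected and can be served a default value instead.
class TokenBucket:
    def __init__(self, rate):
        self.rate = rate
        self.tokens = rate
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the bucket size.
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=2)
results = [bucket.allow() for _ in range(5)]  # a burst of 5 requests
print(results)  # only the first 2 pass; the rest fall through to a default
```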

Cache penetration

Take system A with 5,000 requests per second, of which 4,000 are misses: the keys are not in the cache, so every one of those requests goes to the database, and the database does not have them either.

Here’s an example. Database IDs start at 1, and a hacker sends requests whose IDs are all negative. These are never in the cache, so every request queries the database directly. Cache penetration under this kind of malicious attack can kill the database.

Each time system A fails to find a value in the database, it writes a null placeholder to the cache, e.g. `SET -999 UNKNOWN`, with an expiration time. Until the placeholder expires, subsequent accesses to the same key are served directly from the cache.
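A sketch of this null-value caching (the `UNKNOWN` sentinel and `CountingDB` wrapper are illustrative inventions; a real implementation would also attach a TTL to the placeholder):

```python
UNKNOWN = object()  # sentinel marking "known to be absent"

def get_user(key, cache, database):
    if key in cache:
        value = cache[key]
        return None if value is UNKNOWN else value
    value = database.get(key)
    # Cache even a miss, so repeated hostile queries stop hitting the DB.
    # (A real system would give this placeholder an expiration time.)
    cache[key] = value if value is not None else UNKNOWN
    return value

db_calls = []
class CountingDB(dict):           # records how often the "database" is hit
    def get(self, key):
        db_calls.append(key)
        return dict.get(self, key)

cache, db = {}, CountingDB({"1": "alice"})
get_user("-999", cache, db)   # hostile id: the DB is queried once...
get_user("-999", cache, db)   # ...then served from the null placeholder
print(len(db_calls))          # -> 1
```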

Cache breakdown

Cache breakdown refers to the situation where a single very hot key is accessed frequently under concentrated, high concurrency. When that key expires, a large number of requests break through the cache and hit the database directly, like punching a hole in a barrier.

The solution is simple: either mark the hotspot data as never expiring, or implement a mutex based on Redis or ZooKeeper so that concurrent requests wait for the first one to rebuild the cache and release the lock, after which they can all read the key from the cache.
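The mutex approach can be sketched in a single process, with `threading.Lock` standing in for a Redis- or ZooKeeper-based distributed lock (the names `load_from_db`/`get` are illustrative):

```python
import threading

lock = threading.Lock()  # stands in for a distributed lock
cache, rebuilds = {}, []

def load_from_db(key):
    rebuilds.append(key)          # expensive query: count how often it runs
    return "value-for-" + key

def get(key):
    if key in cache:
        return cache[key]
    with lock:                    # only one request rebuilds the cache
        if key not in cache:      # double-check after acquiring the lock
            cache[key] = load_from_db(key)
    return cache[key]

# 10 concurrent requests for the same hot key that just expired
threads = [threading.Thread(target=get, args=("hot",)) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(len(rebuilds))  # the database was queried only once
```

The double-check inside the lock is what prevents the nine waiting requests from each rebuilding the cache after the first one finishes.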

Double write consistency between database and cache:

Under high concurrency, double writes easily lead to data inconsistency. If your business requires strong consistency, it is advisable not to use a cache at all. A failed delete or a failed write to either the database or the cache leaves the two out of sync.

Solutions:

Delayed double delete: delete the cache, update the database, and then delete the cache again after a short delay. Alternatively, subscribe to the binlog generated by database updates (e.g. with Canal), keep a record of the changed keys, and retry deleting the cache until the delete succeeds.
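The delayed double delete reduces to four steps; a toy sketch (dicts stand in for the cache and database, and the delay is shortened for illustration):

```python
import time

cache, database = {"k": "old"}, {"k": "old"}

def update(key, value, delay=0.05):
    cache.pop(key, None)      # 1) delete the cache
    database[key] = value     # 2) update the database
    time.sleep(delay)         # 3) wait for in-flight reads (which may have
                              #    re-cached the old value) to finish
    cache.pop(key, None)      # 4) delete the cache again
    # A real system would also retry via a binlog subscriber such as Canal
    # if the second delete fails.

update("k", "new")
print(database["k"], "k" in cache)  # -> new False
```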

What are the persistence methods of Redis?

RDB (snapshotting, full persistence):

The snapshot of the data set in the current memory is written to the disk for data persistence. The snapshot can be reloaded into the memory during recovery.

Trigger mode:

  • Automatic triggering: the configuration file defines save rules (`save <seconds> <changes>`) that trigger automatic persistence.
  • Manual triggering: the BGSAVE command asynchronously generates a snapshot in the background while Redis keeps responding to client requests. Redis forks a child process, and that child is responsible for generating the snapshot; client requests are blocked only during the fork itself.
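For reference, the save rules shipped in a stock redis.conf look like this; BGSAVE fires when any one of them matches:

```conf
# BGSAVE if at least 1 key changed within 900 seconds
save 900 1
# BGSAVE if at least 10 keys changed within 300 seconds
save 300 10
# BGSAVE if at least 10000 keys changed within 60 seconds
save 60 10000
# Name of the snapshot file
dbfilename dump.rdb
```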

Snapshot recovery:

Move the backup file (dump.rdb) into the Redis installation directory and start the service; Redis automatically loads the snapshot data into memory. Note, however, that the server blocks while loading the RDB file, until loading completes.

Advantages and disadvantages analysis:

  • RDB persistence can lose data because it cannot persist in real time. BGSAVE forks on every run, a heavyweight operation, so frequent execution is costly and hurts performance; automatically triggered snapshots can also lose the most recent writes.
  • RDB is faster than AOF in recovering large data sets.

AOF (append-only file, incremental persistence):

AOF persistence is configured in the APPEND ONLY MODE section of the Redis configuration file. It records database state by logging the write commands the Redis server executes. On recovery, the AOF file is replayed into memory; a corrupted AOF file can be repaired with `redis-check-aof --fix`.

AOF log rewrite:

  • AOF files grow as the server runs; rewriting can be used to control their size.
  • An AOF rewrite first reads the current key-value state of the database, then replaces the previous sequence of operations on each key with a single command appropriate to its type.
  • Use the BGREWRITEAOF command to trigger an AOF rewrite.

AOF rewrite cache:

Redis runs in a single thread, and rewriting a large AOF file takes a long time; during the rewrite Redis would be unable to serve client requests. To solve this, the AOF rewrite is executed in a child process, with the following benefits:

  • The server process (parent) can continue to process other client requests while the child process does the AOF rewrite.
  • The child process has a copy of the parent process’s data, and using the child process instead of the thread ensures data security without using locks.

Problems with AOF rewriting in a child process:

  • During the AOF rewrite, the server process continues to handle client requests, which can change the database state, so the current state of the database can diverge from the data in the rewritten AOF file.
  • The result is an inconsistency between the AOF file and the data in the database.

Solution to the inconsistent data state:

  • The Redis server sets up an AOF rewrite buffer. It comes into use once the child process is created: whenever the server executes a client write command, it appends that command to the AOF rewrite buffer as well as to the normal AOF buffer.
  • When the child process completes the AOF log rewrite, it sends a signal to the parent process. After receiving the signal, the parent process writes the contents of the AOF rewrite buffer to the new AOF file to maintain data consistency.

Analysis of the advantages and disadvantages of both:

  • AOF files can be persisted with per-second granularity, are written by appending, are human-readable, and can be repaired with the redis-check-aof tool.
  • For the same data set, an AOF file is larger than an RDB file, and per-second synchronization of the AOF file can hurt performance when Redis is under heavy load.

Persistence policy selection:

  • AOF is more durable: it can sync data to the file promptly, at the cost of more disk I/O. Because AOF files are larger, recovery from them is slower, but more complete.
  • RDB is less durable, but it is the best means of routine data backup and master-slave data synchronization: the files are small and recovery is fast.

Redis data expiration recycling strategy and memory elimination mechanism

The data expiration recycling strategy in Redis uses a combination of periodic deletion and lazy deletion.

Periodic deletion:

Every so often, Redis samples a batch of keys that have expiration times set, checks whether they have expired, and deletes the expired ones.

Lazy deletion:

When obtaining a key, Redis checks whether the key has expired and deletes it if it has.

Memory eviction mechanism:

The eviction policy can be set in the configuration file. When memory usage reaches the configured maximum, Redis can apply one of the following cleanup strategies:

  • volatile-lru: uses the LRU algorithm to evict keys among those with an expiration set
  • allkeys-lru: uses the LRU algorithm to evict keys among all keys
  • volatile-random: evicts random keys among those with an expiration set
  • allkeys-random: evicts random keys among all keys
  • volatile-ttl: evicts the keys closest to expiration (smallest TTL)
  • noeviction: evicts nothing and simply returns an error on writes; the default option
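The allkeys-lru policy can be sketched with an `OrderedDict` (a toy model of the idea only; real Redis uses an approximated, sampling-based LRU rather than exact ordering, and evicts by memory use rather than key count):

```python
from collections import OrderedDict

# Toy allkeys-lru: when the capacity (standing in for maxmemory) is
# exceeded, the least recently used key is evicted.
class LRUCache:
    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.data = OrderedDict()

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)   # mark as recently used
            return self.data[key]
        return None

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.max_keys:
            self.data.popitem(last=False)  # evict the LRU key

c = LRUCache(2)
c.set("a", 1); c.set("b", 2)
c.get("a")          # touch "a", so "b" becomes least recently used
c.set("c", 3)       # exceeds capacity: "b" is evicted
print(list(c.data))  # -> ['a', 'c']
```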

Master-slave replication

When the system grows large, we can adopt a Master/Slave architecture: the Master mainly handles writes and the Slaves mainly handle reads, and after an update on the Master node the change is automatically propagated to the Slave nodes according to the configuration.

The principle of master-slave replication consists of an initial synchronization followed by command propagation. Its cost is that replication under heavy load causes master-slave lag, and, per the CAP theorem, service availability and data consistency cannot both be guaranteed at the same time.

What is CAP theory?

CAP theory states that when a network partition occurs, consistency and availability cannot both be guaranteed.

  • C: Consistency
  • A: Availability
  • P: Partition tolerance
  • Network partition: the nodes of a distributed system are spread across different machines connected by a network, so there is always a risk of the network failing; when it does, a network partition has occurred.
  • Eventual consistency: Redis guarantees eventual consistency; slave nodes keep trying to catch up with the master, and eventually the state of a slave matches that of the master.

Does Redis support transactions?

Redis transaction support can be summarized as follows:

  • Isolation: Redis is a single-process program that guarantees a transaction will not be interrupted while it executes; the transaction runs until every command in its queue has been executed. So Redis transactions provide isolation.
  • Redis serializes all the commands in a transaction and executes them sequentially; it will not interject another client's request in the middle of a transaction. The commands are guaranteed to execute as a single isolated operation.

Redis commands for handling transactions are as follows:

  • MULTI: Marks the start of a transaction block.
  • EXEC: Executes all commands within the transaction block.
  • DISCARD: Cancels a transaction, or abandons all commands in a transaction block.
  • UNWATCH: Cancels the monitoring of all keys set up with the WATCH command.
  • WATCH key [key …] : Monitors a key (or keys) and interrupts a transaction if the key (or keys) is changed by another command before the transaction executes.

Note that Redis transactions do not support rollback. Redis starts a transaction with MULTI, queues multiple commands into it, and finally triggers it with EXEC, which executes all the queued commands. A Redis command can fail only because of a syntax error, or because an operation is applied to a key whose data type does not support it; both problems can and should be detected when the command is queued, which is why Redis transactions do not support rollback.
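The queue-then-execute behavior can be modeled in a few lines of Python (a toy model only: `Transaction`, `set_cmd`, and `incr` are illustrative stand-ins for MULTI/EXEC, SET, and INCR):

```python
# Toy model of MULTI/EXEC: commands are queued first, then run back to
# back with no other client's command interleaved.
class Transaction:
    def __init__(self, store):
        self.store, self.queue = store, []

    def multi(self):
        self.queue = []           # MULTI: start queueing commands

    def queue_cmd(self, fn, *args):
        self.queue.append((fn, args))

    def exec(self):               # EXEC: run all queued commands in order
        results = [fn(self.store, *args) for fn, args in self.queue]
        self.queue = []
        return results

def set_cmd(store, k, v): store[k] = v; return "OK"
def incr(store, k): store[k] = store.get(k, 0) + 1; return store[k]

store = {}
tx = Transaction(store)
tx.multi()
tx.queue_cmd(set_cmd, "counter", 0)
tx.queue_cmd(incr, "counter")
tx.queue_cmd(incr, "counter")
print(tx.exec())  # -> ['OK', 1, 2]
```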

Subsequent updates

  • Sentinel mode
  • A distributed lock
    • For a distributed lock, in order to make the locking operation atomic we cannot use multiple commands; instead, we can use the multi-parameter form of the SET command, as shown below: jedis.set(String key, String value, String nxxx, String expx, int time)
    • The first parameter is key; we use the key as the lock because keys are unique.
    • The second parameter is value; we pass a unique requestId as the value, so that the lock holder can be identified when the lock is released.
    • The third parameter is nxxx; we pass NX, meaning SET if not exists: the SET is performed only when the key does not exist, and no operation is performed if the key already exists.
    • The fourth parameter is expx; we pass PX, meaning the key's expiration time is given in milliseconds by the next parameter.
    • The fifth parameter is time, which matches the fourth parameter and specifies the key's expiration time.
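Putting the five parameters together, here is a single-process Python simulation of the acquire/release protocol (a dict stands in for Redis; `set_nx_px` and `release` are illustrative names, and note that real Redis must release the lock atomically, typically with a Lua script, to avoid a race between checking the value and deleting the key):

```python
import time
import uuid

store = {}  # dict stands in for Redis: key -> (value, expires_at)

def set_nx_px(key, value, ttl_ms):
    # Models SET key value NX PX ttl: succeed only if absent or expired
    entry = store.get(key)
    if entry is not None and entry[1] > time.monotonic():
        return False
    store[key] = (value, time.monotonic() + ttl_ms / 1000)
    return True

def release(key, request_id):
    # Only the lock holder (matching requestId) may delete the key
    entry = store.get(key)
    if entry is not None and entry[0] == request_id:
        del store[key]
        return True
    return False

me, rival = str(uuid.uuid4()), str(uuid.uuid4())
print(set_nx_px("lock:order", me, 10_000))    # True: lock acquired
print(set_nx_px("lock:order", rival, 10_000)) # False: key already exists
print(release("lock:order", rival))           # False: rival is not holder
print(release("lock:order", me))              # True: holder releases it
```

The requestId check in `release` is why the second SET parameter must be unique per client: without it, a client whose lock had expired could delete a lock now held by someone else.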