Basic concepts
-
Large-memory Redis instances can cause a series of problems during instance recovery and master-slave synchronization
- Recovery takes longer
- Master-slave switchover is expensive
- Buffers are prone to overflow
-
Pika, developed by 360’s DBA and infrastructure teams
-
Pika's goals (a smooth, SSD-based replacement for Redis)
-
A single instance can hold a large amount of data while avoiding the potential problems of instance recovery and master-slave synchronization
-
Compatible with Redis data types, allowing smooth migration of applications using Redis to Pika
-
Potential problems with large memory Redis instances
-
Potential problems with large memory
-
Generating and restoring RDB snapshots is inefficient
- A long fork blocks the main thread
- Memory may be swapped out to disk
-
Full synchronization takes longer, which can cause buffer overflow
- Full synchronization transfers a large RDB file, so it takes longer
- Master-slave switchover takes longer, which also affects service availability
-
Pika overall architecture
-
The overall architecture
- Network framework
- Pika thread module
- Nemo Storage module
- RocksDB
- binlog
-
Network framework
-
Function: receives and sends data for requests over the underlying network
-
Implementation:
-
Wraps the operating system's low-level network functions (sockets)
-
The Pika thread module uses a multithreaded model dedicated to handling client requests
-
One request dispatch thread (DispatchThread)
-
A set of worker threads (WorkerThread), which wrap requests into tasks
-
A thread pool (ThreadPool), which processes the tasks
-
-
-
Tuning: Increase the number of worker threads and the number of threads in the thread pool
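
The dispatch / worker / thread pool split described above can be sketched roughly as follows. This is only a toy illustration in Python, not Pika's actual code: the class name `MiniServer`, the `SET`/`GET` handling, the round-robin dispatch, and the in-memory dict standing in for the storage layer are all assumptions made for the example. Dispatch is a plain method here rather than a separate thread, to keep the sketch short.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class MiniServer:
    """Toy dispatch -> worker -> thread pool pipeline (hypothetical, not Pika's code)."""

    def __init__(self, num_workers=2, pool_size=4):
        self.store = {}                      # stands in for the storage layer
        self.lock = threading.Lock()
        self.next_worker = 0
        self.pool = ThreadPoolExecutor(max_workers=pool_size)
        self.queues = [queue.Queue() for _ in range(num_workers)]
        self.workers = [
            threading.Thread(target=self._worker_loop, args=(q,), daemon=True)
            for q in self.queues
        ]
        for worker in self.workers:
            worker.start()

    def dispatch(self, request, reply_queue):
        """Dispatch step: hand the request to one worker, round-robin."""
        q = self.queues[self.next_worker]
        self.next_worker = (self.next_worker + 1) % len(self.queues)
        q.put((request, reply_queue))

    def _worker_loop(self, q):
        """Worker step: wrap each request into a task and submit it to the pool."""
        while True:
            item = q.get()
            if item is None:                 # shutdown signal
                break
            request, reply_queue = item
            future = self.pool.submit(self._execute, request)
            future.add_done_callback(lambda f, r=reply_queue: r.put(f.result()))

    def _execute(self, request):
        """Pool step: perform the actual data access."""
        cmd, *args = request
        with self.lock:
            if cmd == "SET":
                self.store[args[0]] = args[1]
                return "OK"
            if cmd == "GET":
                return self.store.get(args[0])
        return "ERR unknown command"

    def close(self):
        for q in self.queues:
            q.put(None)
        for worker in self.workers:
            worker.join()
        self.pool.shutdown(wait=True)
```

In this sketch, `num_workers` and `pool_size` play the role of the two tuning knobs mentioned above: raising them lets more requests be wrapped and processed concurrently.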
-
-
Nemo
- Implements Pika's compatibility with Redis data types, which lowers the cost of learning Pika
-
binlog
- Records write commands for command synchronization between master and slave nodes (this avoids copying a large memory image, since a command is much smaller than the data)
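
As a rough illustration of command-based synchronization, the sketch below has a master append each write command to an in-memory list that stands in for the binlog, and a slave replay the entries it has not yet seen. The names `Node` and `sync`, the `SET`/`DEL` commands, and the offset handling are assumptions for the example only; Pika's real binlog is a file on disk with its own format.

```python
class Node:
    """Toy node with a key-value store and an append-only command log."""

    def __init__(self):
        self.data = {}
        self.binlog = []          # stands in for the on-disk binlog

    def apply(self, cmd):
        """Execute one write command against the local data."""
        op, key, *rest = cmd
        if op == "SET":
            self.data[key] = rest[0]
        elif op == "DEL":
            self.data.pop(key, None)

    def write(self, cmd):
        """Master-side write: apply the command, then record it."""
        self.apply(cmd)
        self.binlog.append(cmd)


def sync(master, slave, offset):
    """Replay the master's log entries from `offset` on the slave.

    Returns the new offset, so the next call only sends newer commands.
    """
    for cmd in master.binlog[offset:]:
        slave.apply(cmd)
    return len(master.binlog)
```

Only the commands travel from master to slave, so the amount transferred depends on the number of writes since the last offset, not on the total data size.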
How does Pika store more data on SSDs?
-
Basic concept: Pika uses RocksDB, a persistent key-value database widely used in industry
-
RocksDB’s read and write mechanism (which does not take up much memory)
-
RocksDB uses two small memory regions, Memtable1 and Memtable2, alternately to cache written data
- Each is usually a few MB to tens of MB
-
Data is written to Memtable1 first; when Memtable1 is full, it is written to SSD
-
Meanwhile, Memtable2 takes over from Memtable1 and caches new writes
-
Once Memtable1's data has been written to SSD and Memtable2 is full, writes switch back to Memtable1
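
The alternation between the two Memtables can be sketched as below. This is a simplified model, not RocksDB's implementation: the class name `AlternatingMemtables`, the entry-count `capacity`, and the list of dicts standing in for SSD files are assumptions, and the flush happens synchronously here whereas a real engine would do it in the background.

```python
class AlternatingMemtables:
    """Toy model of two small in-memory tables that take turns buffering writes."""

    def __init__(self, capacity):
        self.capacity = capacity      # max entries per memtable (illustrative unit)
        self.tables = [{}, {}]        # Memtable1 and Memtable2
        self.active = 0               # index of the table receiving writes
        self.ssd_files = []           # stands in for files written to SSD

    def put(self, key, value):
        table = self.tables[self.active]
        table[key] = value
        if len(table) >= self.capacity:
            self._flush_and_switch()

    def _flush_and_switch(self):
        full = self.active
        self.active = 1 - full        # the other memtable takes over new writes
        self.ssd_files.append(dict(self.tables[full]))  # write full table to "SSD"
        self.tables[full].clear()     # the flushed table is ready for reuse

    def get(self, key):
        # Check memory first, then the newest "SSD file" backwards.
        for table in self.tables:
            if key in table:
                return table[key]
        for ssd_file in reversed(self.ssd_files):
            if key in ssd_file:
                return ssd_file[key]
        return None
```

Memory use stays bounded by two small tables no matter how much data has been written, which is the point of the mechanism.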
-
-
Why doesn’t Pika have problems with large-file synchronization efficiency or buffer overflow?
-
Data is saved in RocksDB data files, so recovery no longer relies on memory snapshots
-
Incremental command synchronization saves memory and avoids buffer overflow
-
-
Advantages of Pika
-
Pika uses RocksDB to save large amounts of data to SSD while avoiding the generation and recovery problems of memory snapshots
-
Pika uses the binlog mechanism for master-slave synchronization to avoid the impact of large memory
-
How does Pika implement Redis data type compatibility?
-
Basic concept: RocksDB provides only single-value key-value pairs, which directly match only Redis’s String type
-
The Nemo module converts Redis collection types into single-value key-value pairs
-
Redis collection types
-
List and Set: each element in the collection is a single value
-
Hash (field-value) and Sorted Set (member-score): each element in the collection is a pair
-
-
List conversion
- Key: built from several components so that elements are stored in list order
- Value: the preceding and succeeding elements, the element value, the version, and the lifetime
-
Set conversion
- Key: stores the set's key and the element value
- Value: stores the version and lifetime
-
Hash conversion
- Key: size, hash key, field (for example, field1)
- Value: the field's value (value1); field2, value2, and so on are stored the same way
-
zset
- Similar to Hash, but sorted by score
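
To make the Hash conversion more concrete, here is a minimal sketch of flattening one Hash into single-value key-value pairs. The exact byte layout is an assumption for illustration (a 4-byte big-endian size of the hash key, then the hash key, then the field), the function names are hypothetical, and a plain dict stands in for RocksDB. The version and lifetime that the notes mention are left out to keep it short.

```python
import struct


def encode_hash_key(hash_key: bytes, field: bytes) -> bytes:
    """Build one flat key: size of hash key + hash key + field (assumed layout)."""
    return struct.pack(">I", len(hash_key)) + hash_key + field


def hset(db: dict, hash_key: bytes, field: bytes, value: bytes) -> None:
    """Store one field of a Hash as its own key-value pair."""
    db[encode_hash_key(hash_key, field)] = value


def hget(db: dict, hash_key: bytes, field: bytes):
    """Look up a single field with one point read."""
    return db.get(encode_hash_key(hash_key, field))


def hgetall(db: dict, hash_key: bytes) -> dict:
    """Collect every field of one Hash by scanning keys that share its prefix."""
    prefix = struct.pack(">I", len(hash_key)) + hash_key
    return {k[len(prefix):]: v for k, v in sorted(db.items()) if k.startswith(prefix)}
```

Including the size in front of the hash key keeps the prefixes of different Hashes from colliding: in this layout `user:1` and `user:10` start with different length bytes, so a prefix scan for one never returns fields of the other.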
-
Other advantages and disadvantages of Pika
-
Advantages of Pika
-
Fast instance restart: data is read directly from SSD, with no need to replay data
-
Low risk of full resynchronization: incremental synchronization uses the binlog (on disk), which has no buffer size limit
-
The multithreaded model reduces the performance impact of SSD reads and writes on Pika
-
-
Disadvantages of Pika
-
Access performance is lower than Redis
- Data is stored on SSD rather than in memory
- Recording the binlog is inefficient
- The data structures also feel inefficient to me
-
-
Application scenario: when storing large volumes of data is the primary need, Pika is a good solution
Conclusion
-
Pika advantages
-
Supports the Redis command interface and can also store large amounts of data
-
Supports migration from Redis
-
-
Tuning
- Increase the number of threads to improve concurrent request handling
- Use high-spec SSDs to improve SSD access performance
-
Migrating from Redis to Pika
-
Migrate existing Redis data to Pika
- aof_to_pika -i [Redis AOF file] -h [Pika IP] -p [Pika port] -a [authentication information]
-
Forward Redis requests to Pika
-
-
Github.com/Qihoo360/pi…