The basic concept

Redis instances with large memory can cause a series of problems during instance recovery and master-slave synchronization
- Recovery takes longer
- Primary/secondary switchover is costly
- Buffers are prone to overflow
Pika, developed by 360's DBA and infrastructure group
Pika's goals (smoothly replacing in-memory Redis with SSD-based storage)
- A single instance can hold a large amount of data while avoiding the potential problems of instance recovery and master-slave synchronization
- Compatible with Redis data types, allowing smooth migration of applications using Redis to Pika

Potential problems with large memory Redis instances

The overall architecture
- Network framework
- Pika thread module
- Nemo Storage module
- RocksDB
- binlog
Network framework
- Function: receives requests from and sends responses over the underlying network
- Implementation:
  - Encapsulates the operating system's low-level socket functions
  - The Pika thread module uses a multithreaded model dedicated to handling client requests
    - A request dispatch thread (DispatchThread)
    - A set of worker threads, which wrap requests into tasks
    - A thread pool (ThreadPool)
- Tuning: increase the number of worker threads and the number of threads in the thread pool
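
The dispatch / worker / thread pool flow above can be sketched roughly as follows. This is only a toy model in Python, not Pika's actual C++ implementation: the names `NUM_WORKERS`, `POOL_SIZE`, `handle` and `run`, the round-robin dispatch, and the constants are all assumptions made for illustration.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 2  # assumed value; the notes suggest raising this when tuning
POOL_SIZE = 4    # assumed value; the notes suggest raising this when tuning


def handle(request):
    """Stand-in for executing one command against storage."""
    return "done:" + request


def run(requests):
    """Dispatch requests to workers, which submit them as tasks to a pool."""
    worker_queues = [queue.Queue() for _ in range(NUM_WORKERS)]
    futures = []
    lock = threading.Lock()

    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:

        def worker(q):
            # Each worker wraps incoming requests into tasks for the pool.
            while True:
                request = q.get()
                if request is None:  # sentinel: no more requests
                    break
                future = pool.submit(handle, request)
                with lock:
                    futures.append(future)

        def dispatch():
            # The dispatch thread hands requests to workers round-robin.
            for i, request in enumerate(requests):
                worker_queues[i % NUM_WORKERS].put(request)
            for q in worker_queues:
                q.put(None)

        threads = [threading.Thread(target=worker, args=(q,)) for q in worker_queues]
        threads.append(threading.Thread(target=dispatch))
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        # Completion order is not deterministic, so results may be unordered.
        return [f.result() for f in futures]
```

Raising `NUM_WORKERS` or `POOL_SIZE` in this sketch corresponds to the tuning advice in the notes: more threads are available to absorb concurrent requests.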
Nemo
- Makes Pika compatible with Redis data types, which lowers the cost of learning Pika
binlog
- Records write commands for command synchronization between master and slave nodes (this avoids replicating large amounts of memory, since a command is much smaller than the data)
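
A minimal sketch of the idea behind binlog-based synchronization: the master appends each write command to a log, and the slave replays only the commands past its own offset. The `Master` and `Slave` classes, the tuple command format, and the in-memory list standing in for the on-disk binlog are all hypothetical; Pika's real binlog format is not described in these notes.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.binlog = []  # append-only record of write commands (on disk in Pika)

    def set(self, key, value):
        self.data[key] = value
        self.binlog.append(("SET", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.binlog.append(("DEL", key))


class Slave:
    def __init__(self):
        self.data = {}
        self.offset = 0  # number of binlog entries already replayed

    def sync(self, master):
        """Replay only the commands recorded since the last sync."""
        for command in master.binlog[self.offset:]:
            if command[0] == "SET":
                self.data[command[1]] = command[2]
            elif command[0] == "DEL":
                self.data.pop(command[1], None)
        self.offset = len(master.binlog)
```

Only the commands are transferred, not a snapshot of the whole dataset, which is why the size of the instance matters much less for this kind of synchronization.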

Basic concept: Pika uses RocksDB, a persistent key-value database widely used in the industry
RocksDB's read and write mechanism (which does not take up much memory)
- RocksDB alternates between two small memory spaces (Memtable1, Memtable2) to cache written data
  - Usually a few MB to tens of MB each
- Data is written to Memtable1 first; when Memtable1 is full, its data is written to SSD
- Meanwhile, Memtable2 takes over from Memtable1 to cache new writes
- Once Memtable1's data has been written to SSD and Memtable2 is full, writes switch back to Memtable1
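
The alternation between the two memtables can be illustrated with a toy model. This is a sketch under simplifying assumptions, not RocksDB's implementation: the class `TwoMemtableStore` is hypothetical, capacity is counted in entries rather than bytes, the flush is synchronous, and a list of dicts stands in for the files on SSD.

```python
class TwoMemtableStore:
    def __init__(self, capacity=2):
        self.capacity = capacity    # max entries per memtable (toy unit)
        self.memtables = [{}, {}]   # Memtable1 and Memtable2
        self.active = 0             # index of the memtable receiving writes
        self.ssd_files = []         # stand-in for files written to SSD

    def put(self, key, value):
        table = self.memtables[self.active]
        table[key] = value
        if len(table) >= self.capacity:
            self._flush_and_switch()

    def _flush_and_switch(self):
        # Write the full memtable out as one "file", then let the other
        # memtable take over new writes.
        full = self.memtables[self.active]
        self.ssd_files.append(dict(full))
        full.clear()
        self.active = 1 - self.active

    def get(self, key):
        # Check the active memtable first, then the files, newest first.
        table = self.memtables[self.active]
        if key in table:
            return table[key]
        for ssd_file in reversed(self.ssd_files):
            if key in ssd_file:
                return ssd_file[key]
        return None
```

Memory use in this sketch is bounded by the two small memtables no matter how much data has been written, which is the property the notes highlight.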
Why Pika avoids the problems of inefficient large-file synchronization and memory overflow
- Data is saved in files by RocksDB, so recovery no longer relies on memory snapshots
- Incremental command synchronization saves memory and avoids buffer overflow
Advantages of Pika
- Pika uses RocksDB to save large amounts of data to SSD, avoiding the problems of generating and restoring memory snapshots
- Pika uses the binlog mechanism for master/slave synchronization, avoiding the impact of large memory

Basic concept: RocksDB provides only single-value key-value pairs, which directly satisfy only Redis's String data type
The Nemo module converts Redis collection types into single-value key-value pairs
- Redis collection types
  - List and Set: each element in the collection is a single value
  - Hash (field-value) and Sorted Set (member-score): each element in the collection is a pair
- List conversion
  - Key: built so that the elements of the list are stored in a meaningful order
  - Value: the preceding and succeeding elements, the element value, version, and lifetime
- Set conversion
  - Key: holds the Set's key and the member value
  - Value: holds the version and lifetime
- Hash conversion
  - Key: size, hash key, field1
  - Value: value1, field2, value2…
- Sorted Set (zset) conversion
  - Similar to the hash conversion, but sorted by score
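
To make the List conversion more concrete, here is a rough sketch of flattening one list into single-value key-value pairs. It does not reproduce Nemo's actual byte layout: the functions `list_to_kv` and `kv_to_list` are hypothetical, and using a sequence number in the key and as the preceding/succeeding references is an assumption made for illustration.

```python
def list_to_kv(list_key, elements, version=0, ttl=0):
    """Flatten one list into single-value key-value pairs.

    Key:   (list key, sequence number of the element)
    Value: (previous sequence, next sequence, element value, version, ttl)
    """
    kv = {}
    last = len(elements) - 1
    for seq, value in enumerate(elements):
        prev_seq = seq - 1 if seq > 0 else None
        next_seq = seq + 1 if seq < last else None
        kv[(list_key, seq)] = (prev_seq, next_seq, value, version, ttl)
    return kv


def kv_to_list(kv, list_key):
    """Rebuild the list by following the 'next' references from the head."""
    heads = [k for k, v in kv.items() if k[0] == list_key and v[0] is None]
    result = []
    seq = heads[0][1] if heads else None
    while seq is not None:
        _prev, next_seq, value, _version, _ttl = kv[(list_key, seq)]
        result.append(value)
        seq = next_seq
    return result
```

Each list element becomes one independent key-value pair, so a store that only understands single values can still hold an ordered collection.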


Advantages of Pika
- Fast instance restart: data is read directly from SSD, with no need to replay data
- Low risk of full resynchronization: incremental synchronization uses the binlog (on disk), so there is no buffer size limit
- The multithreaded model reduces the performance impact of reading and writing SSDs
Disadvantages of Pika
- Access performance is lower than Redis
  - Storage moves from memory to SSD
  - Recording the binlog is inefficient
  - The data structures feel inefficient to me
Application scenario: when storing large volumes of data is the primary need, Pika is a good solution

Pika advantages
- Supports the Redis operation interface and can also store large amounts of data
- Supports migration from Redis
Tuning
- Increase the number of threads to improve the handling of concurrent requests
- Use high-spec SSDs to improve SSD access performance
Migrating from Redis to Pika
- Migrate Redis data to Pika
  - aof_to_pika -i [Redis AOF file] -h [Pika IP] -p [Pika port] -a [authentication information]
- Forward Redis requests to Pika
Github.com/Qihoo360/pi…