This article analyzes the Redis RESP2 protocol.

The overall analysis

From an overall perspective, the interaction between a Redis client and server has two sides:

  • The client sends:
    • The command
    • The cache key and cache value
  • The server sends:
    • The result returned to the client
    • A success flag, the string "OK"
    • The number of elements successfully written
    • An error message, "ERR"

Overall, the content exchanged breaks down into the following items:

  1. Command: SET, GET
  2. Cache key: the key in SET key value can only be a string;
  3. Single cached value: the value in SET key value, such as a string or a number;
  4. Cached collection: HSET key value1 value2. When multiple values are cached, values of different types may be transferred;
  5. OK: the reply sent when a command is executed successfully;
  6. Int: when N elements are written successfully, the corresponding integer N is returned;
  7. ERR: an error message.

That covers the form of the commands. Next, the data exchanged on the wire:

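First of all, RESP runs over TCP. One of the most important issues to consider is packet sticking and splitting: TCP is a byte stream, so several messages may arrive merged together, or a single message may arrive in pieces. So how does the Redis server tell where each message ends? There are usually three ways: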

  1. Fixed-length packets: shorter messages are padded and longer ones are truncated
  2. Delimiters: HTTP, for example, uses line breaks
  3. A length field in the packet header
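As a rough illustration of the delimiter approach, the sketch below splits a receive buffer into complete `\r\n`-terminated lines and keeps the unfinished tail for the next read. The function name `split_frames` is hypothetical and the buffer contents are made up for the example; it is not how Redis itself is implemented.

```python
def split_frames(buffer: bytes, delimiter: bytes = b"\r\n"):
    """Split a receive buffer into complete frames plus the unfinished tail."""
    # Everything before the last delimiter is complete; the last piece is not.
    *frames, rest = buffer.split(delimiter)
    return frames, rest


# Two messages arrive "stuck together", followed by part of a third one.
frames, rest = split_frames(b"+OK\r\n:1\r\n+PO")
print(frames)  # [b'+OK', b':1']
print(rest)    # b'+PO'

# When more bytes arrive, prepend the leftover tail and split again.
frames, rest = split_frames(rest + b"NG\r\n")
print(frames)  # [b'+PONG']
print(rest)    # b''
```

A delimiter alone only works while the payload never contains the delimiter, which is why a length field is needed for arbitrary data.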

Custom protocol

Redis uses a custom protocol format to distinguish between different command requests. For example:

SET key value

=> 

*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n

At first glance it is not obvious what this means. Breaking it down:

  1. \r\n: the separator
  2. *3: the command consists of three parts (the command name plus two parameters)
  3. $3...: the byte length of the parameter that follows

In short, the protocol marks boundaries with a separator, then states how many strings are being sent, and then gives the length of each string.
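The encoding rules above can be sketched in a few lines of Python. The helper name `encode_command` is an assumption made for this example, not part of any Redis client library; it simply writes the element count, then each parameter with its byte length.

```python
def encode_command(*args) -> bytes:
    """Encode a command as a RESP2 array of bulk strings."""
    # "*<count>\r\n" announces how many parameters follow.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # "$<length>\r\n<data>\r\n": the length is counted in bytes.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


print(encode_command("SET", "key", "value"))
# b'*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n'
```

Note that the length is the number of bytes after encoding, so a multi-byte UTF-8 character counts more than once.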

Finally, here are the reply types, identified by the first byte of the reply:

  • +: Single-line (status) reply
  • -: Error reply
  • :: Integer reply
  • $: Bulk reply
  • *: Multi-bulk reply

Then let’s look at the various return cases:

Single-line reply

> SET foo tests
OK

----------
+OK\r\n

Error reply

> PUT foo tests
(error) ERR unknown command `PUT`

----------
-ERR unknown command `PUT`\r\n

Integer reply

> EXISTS foo
(integer) 1

----------
:1\r\n

Bulk reply

> GET foo
"0123abcdefghigkl"

----------
$16\r\n0123abcdefghigkl\r\n

> GET foo2
(nil)

----------
$-1\r\n

Multi-bulk reply

> LRANGE foo2 0 -1
1) "wxyz"
2) "qrstuv"
3) "lmnop"

------------
*3\r\n$4\r\nwxyz\r\n$6\r\nqrstuv\r\n$5\r\nlmnop\r\n
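The five reply types can be tied together in a small recursive parser. This is a minimal sketch that assumes the whole reply is already in memory; the names `parse_reply` and `RespError` are hypothetical, and a real client would also have to handle partial reads from the socket.

```python
class RespError(Exception):
    """Represents a '-' error reply from the server."""


def parse_reply(data: bytes, pos: int = 0):
    """Parse one RESP2 reply starting at pos; return (value, next_pos)."""
    end = data.index(b"\r\n", pos)
    kind, line = data[pos:pos + 1], data[pos + 1:end]
    pos = end + 2                      # skip past the header line's \r\n
    if kind == b"+":                   # single-line (status) reply
        return line.decode(), pos
    if kind == b"-":                   # error reply
        return RespError(line.decode()), pos
    if kind == b":":                   # integer reply
        return int(line), pos
    if kind == b"$":                   # bulk reply; -1 means nil
        length = int(line)
        if length == -1:
            return None, pos
        return data[pos:pos + length], pos + length + 2
    if kind == b"*":                   # multi-bulk reply; -1 means nil
        count = int(line)
        if count == -1:
            return None, pos
        items = []
        for _ in range(count):
            item, pos = parse_reply(data, pos)
            items.append(item)
        return items, pos
    raise ValueError("unknown reply type: %r" % kind)


print(parse_reply(b"+OK\r\n")[0])                      # OK
print(parse_reply(b":1\r\n")[0])                       # 1
print(parse_reply(b"$16\r\n0123abcdefghigkl\r\n")[0])  # b'0123abcdefghigkl'
print(parse_reply(b"$-1\r\n")[0])                      # None
print(parse_reply(b"*3\r\n$4\r\nwxyz\r\n$6\r\nqrstuv\r\n$5\r\nlmnop\r\n")[0])
# [b'wxyz', b'qrstuv', b'lmnop']
```

Returning the next position alongside the value is what lets the multi-bulk case call the same function for each element.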

Disadvantages

The five encoding types above cover only integers and strings among the basic data types. Other basic types, such as floating-point numbers and Booleans, have no encoding of their own and are currently represented as strings or integers. The disadvantages include the following:

  • If LRANGE and ZRANGE return the same contents, the replies are identical even though the underlying storage structures differ, so the reply alone does not tell the developer what kind of value came back;
  • Important basic data types are missing;
  • Not every reply format is binary safe. Single-line and error replies are terminated by \r\n, so their content must not contain \r\n.
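The binary-safety point can be seen with a payload that itself contains `\r\n`. The snippet below is an illustrative sketch with a made-up payload: a length-prefixed bulk string carries it intact, while a reply that is terminated only by the separator is cut off at the first `\r\n`.

```python
payload = b"line one\r\nline two"

# Bulk string: the length prefix says exactly how many bytes to read,
# so a \r\n inside the payload is harmless.
bulk = b"$%d\r\n%s\r\n" % (len(payload), payload)
header_end = bulk.index(b"\r\n")
length = int(bulk[1:header_end])
body = bulk[header_end + 2:header_end + 2 + length]
print(body == payload)  # True

# Single-line reply: the reader stops at the first \r\n it sees,
# so the rest of the payload is lost or misread as the next reply.
simple = b"+" + payload + b"\r\n"
first_line = simple[1:simple.index(b"\r\n")]
print(first_line)       # b'line one'
```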

To be continued…