File programming through the lens of the message storage format

Learn file programming from the commitlog file design

We know that RocketMQ stores all messages in commitlog files, and that messages vary in size. How, then, are the messages organized within a file? Once a message has been written, how do we tell where it starts and where it ends?

In the file programming model, we need to define a message storage format that delimits one complete message. RocketMQ's message storage format, for example, is shown in the following figure:

From this we can derive a general practice for defining data storage formats: a storage protocol usually follows a Header + Body layout, where the Header is fixed-length and holds some basic information, and the Body holds the data. In RocketMQ's message storage protocol, we can treat the four bytes that record the message size as the Header; all the fields that follow are message-related business attributes, assembled in the specified format.
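The Header + Body layout can be illustrated with a minimal sketch. This is not RocketMQ's own code: `RecordEncoder` and `encode` are hypothetical names, the Body is an opaque byte array rather than the real message fields, and the sketch assumes the four-byte length counts the whole record, Header included.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not part of RocketMQ.
final class RecordEncoder {
    // The fixed-length Header is a single 4-byte int holding the record size.
    static final int HEADER_SIZE = 4;

    // Returns a buffer laid out as [totalSize:int][body bytes], flipped for reading.
    static ByteBuffer encode(byte[] body) {
        // Assumption: the size counts the Header bytes as well as the Body.
        int totalSize = HEADER_SIZE + body.length;
        ByteBuffer buf = ByteBuffer.allocate(totalSize);
        buf.putInt(totalSize); // Header: fixed length, written first
        buf.put(body);         // Body: variable length
        buf.flip();            // switch the buffer from writing to reading
        return buf;
    }
}
```

Because the Header always has the same size, a reader can find the length of any record without knowing anything about the Body that follows.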

For the Header + Body protocol, we typically extract a message in two steps. First, read the Header into a ByteBuffer; in RocketMQ, this gives the length of the message. Then, starting from the beginning of the message, read that many bytes to obtain the complete message.
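The two-step read might look roughly like the following sketch. It is not RocketMQ's implementation: `RecordReader`, `readRecord`, and `readFully` are hypothetical names, and the sketch assumes a four-byte Header whose value is the total record length, Header included.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper for illustration only; not part of RocketMQ.
final class RecordReader {
    static final int HEADER_SIZE = 4; // assumption: Header is one 4-byte length field

    // Reads the record that starts at `offset`, or returns null if no complete record is there.
    static ByteBuffer readRecord(FileChannel channel, long offset) throws IOException {
        // Step 1: read the fixed-length Header to learn the record length.
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
        if (!readFully(channel, header, offset)) {
            return null;
        }
        header.flip();
        int totalSize = header.getInt();
        // Reject implausible lengths, e.g. zero-filled space or a truncated file.
        if (totalSize < HEADER_SIZE || totalSize > channel.size() - offset) {
            return null;
        }

        // Step 2: read totalSize bytes, starting again from the beginning of the record.
        ByteBuffer record = ByteBuffer.allocate(totalSize);
        if (!readFully(channel, record, offset)) {
            return null;
        }
        record.flip();
        return record;
    }

    // Fills `dst` from absolute position `pos`; returns false if the file ends first.
    private static boolean readFully(FileChannel channel, ByteBuffer dst, long pos) throws IOException {
        while (dst.hasRemaining()) {
            int n = channel.read(dst, pos);
            if (n < 0) {
                return false;
            }
            pos += n;
        }
        return true;
    }
}
```

Under these assumptions the next record begins at `offset + totalSize`, so a whole file can be scanned by repeating the same two steps until `readRecord` returns null.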