Principle of unpacking

The previous post, Netty source code analysis: the mystery of the unpacker, explained the unpacking principle in detail. In brief: Netty unpacks the same way you would by hand. It accumulates incoming bytes in a container, checks whether the accumulated data has reached the size of one packet, splits the packet off once it has, and passes it to the upper-layer business decoding handler

Netty's unpacking is so flexible because it abstracts the question of how to unpack into a single decode method. Each unpacker implements decode differently, so each one can handle a different protocol
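To make the idea concrete, here is a minimal sketch in plain Java, with no Netty on the classpath. The names SketchUnpacker and LengthPrefixedUnpacker are hypothetical and exist only for illustration, and the protocol (a 4-byte big-endian length followed by the body) is an assumption. The base class accumulates bytes, and the subclass supplies only the decode step.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the accumulate-then-decode loop; not a Netty class. */
abstract class SketchUnpacker {
    private byte[] cumulation = new byte[0]; // bytes received but not yet framed

    /** Feeds newly received bytes and returns every frame that is now complete. */
    final List<byte[]> feed(byte[] chunk) {
        byte[] merged = Arrays.copyOf(cumulation, cumulation.length + chunk.length);
        System.arraycopy(chunk, 0, merged, cumulation.length, chunk.length);
        ByteBuffer in = ByteBuffer.wrap(merged);

        List<byte[]> out = new ArrayList<>();
        byte[] frame;
        while ((frame = decode(in)) != null) {
            out.add(frame);
        }

        cumulation = new byte[in.remaining()]; // keep the leftover for the next call
        in.get(cumulation);
        return out;
    }

    /** Returns one frame and advances the buffer, or null if more bytes are needed. */
    protected abstract byte[] decode(ByteBuffer in);
}

/** Assumed protocol: a 4-byte big-endian length followed by the body. */
final class LengthPrefixedUnpacker extends SketchUnpacker {
    @Override
    protected byte[] decode(ByteBuffer in) {
        if (in.remaining() < 4) {
            return null; // length field not complete yet
        }
        int bodyLength = in.getInt(in.position()); // absolute read, position unchanged
        if (bodyLength < 0) {
            throw new IllegalStateException("negative length: " + bodyLength);
        }
        if (in.remaining() - 4 < bodyLength) {
            return null; // body not complete yet
        }
        in.getInt(); // consume the length field
        byte[] body = new byte[bodyLength];
        in.get(body);
        return body;
    }

    public static void main(String[] args) {
        LengthPrefixedUnpacker unpacker = new LengthPrefixedUnpacker();
        System.out.println(unpacker.feed(new byte[] {0, 0, 0, 3, 'a'}).size());
        System.out.println(unpacker.feed(new byte[] {'b', 'c'}).size());
    }
}
```

Supporting another protocol in this sketch means writing another subclass with a different decode, which mirrors the role the decode method plays in Netty.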

This article covers the general-purpose unpacker LengthFieldBasedFrameDecoder. If you are still unpacking by hand, it is worth getting to know, because almost any length-based binary protocol can be handled with it. Let's first look at how it is used

Using LengthFieldBasedFrameDecoder

1. Unpack based on length

The first few bytes hold the length of the packet body (not counting the length field itself), followed by the actual data. The unpacked frame is a complete packet that still carries its length field, ready for the application-layer decoder behind it. A LengthFieldBasedFrameDecoder for such a protocol is created as follows

new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4);

1. The first parameter is maxFrameLength, the maximum length of a packet. Netty handles packets that exceed it specially.
2. The second parameter is lengthFieldOffset, the offset of the length field. It is 0 here, meaning no offset.
3. The third parameter is lengthFieldLength, the size of the length field. It is 4 here, meaning the length field occupies 4 bytes.

2. Truncation and unpacking based on length

If our application layer decoder does not need to use the length field, then we want netty to look like this after unpacking

The length field is cut off. To get this we only need to specify one more parameter, initialBytesToStrip, which tells Netty how many bytes to skip before handing a complete packet to the business decoder

new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4, 0, 4);

The first three parameters mean the same as above. The fourth parameter is explained later. The fifth is initialBytesToStrip, which is 4 here: once a complete packet has been obtained, its first four bytes are dropped, so the application decoder receives a packet without the length field
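The difference between the first two cases can be sketched in plain Java without Netty. StripSketch and its frame method are hypothetical names, and the packet layout (a 4-byte big-endian length followed by the body) is the one assumed in the cases above. The only thing that changes between the two calls is how many leading bytes are dropped.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical helper showing what initialBytesToStrip changes; not a Netty class. */
final class StripSketch {
    /** Frames one [4-byte length][body] packet, dropping the first initialBytesToStrip bytes. */
    static byte[] frame(byte[] packet, int initialBytesToStrip) {
        int bodyLength = ByteBuffer.wrap(packet).getInt(0); // big-endian by default
        int frameLength = 4 + bodyLength;                   // length field plus body
        return Arrays.copyOfRange(packet, initialBytesToStrip, frameLength);
    }

    public static void main(String[] args) {
        byte[] packet = {0, 0, 0, 5, 'h', 'e', 'l', 'l', 'o'};
        System.out.println(frame(packet, 0).length); // case 1: length field kept
        System.out.println(frame(packet, 4).length); // case 2: length field stripped
    }
}
```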

3. Unpack based on the offset length

This layout is the more common one in binary protocols. A fixed number of leading bytes form the protocol header, usually holding meta information such as a magic number or the protocol version, followed by a length field saying how many bytes the packet body contains

Just adjust the second parameter based on the first case

new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 4, 4);

lengthFieldOffset is 4, meaning the length field starts after the first 4 bytes have been skipped

4. Unpacking based on adjustable length

In some cases, binary protocols may be designed as follows

Here the length field comes first and the header second. How do we adjust the parameters to get the unpacking result we want?

1. The length field is at the very start of the packet, so there is no offset and lengthFieldOffset is 0.
2. The length field is 3 bytes long, so lengthFieldLength is 3.
3. The length field gives only the length of the packet body and leaves out the header. A further parameter, lengthAdjustment, corrects for this: the length field value plus lengthAdjustment is the number of bytes after the length field, header included, which here is 12 + 2. So lengthAdjustment is 2.

The resulting code is

new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 3, 2, 0);
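The arithmetic behind lengthAdjustment can be checked with a small sketch in plain Java. AdjustedLengthSketch and its methods are hypothetical names, and the sample bytes (a 3-byte length of 12, a 2-byte header, a 12-byte body) are an assumption that matches the 12 + 2 mentioned above.

```java
/** Hypothetical helper for the frame-length arithmetic; not a Netty class. */
final class AdjustedLengthSketch {
    /** Reads an unsigned big-endian integer of the given width starting at offset. */
    static long readUnsigned(byte[] data, int offset, int width) {
        long value = 0;
        for (int i = 0; i < width; i++) {
            value = (value << 8) | (data[offset + i] & 0xFF);
        }
        return value;
    }

    /** Whole frame = bytes up to the end of the length field + length value + adjustment. */
    static long frameLength(byte[] data, int lengthFieldOffset,
                            int lengthFieldLength, int lengthAdjustment) {
        long value = readUnsigned(data, lengthFieldOffset, lengthFieldLength);
        int lengthFieldEndOffset = lengthFieldOffset + lengthFieldLength;
        return lengthFieldEndOffset + value + lengthAdjustment;
    }

    public static void main(String[] args) {
        byte[] packet = new byte[17];  // 3-byte length + 2-byte header + 12-byte body
        packet[2] = 0x0C;              // length field holds 12
        packet[3] = (byte) 0xCA;       // header bytes, values chosen arbitrarily
        packet[4] = (byte) 0xFE;
        System.out.println(frameLength(packet, 0, 3, 2));
    }
}
```

With lengthAdjustment left at 0 the same call would come up two bytes short, and the tail of the body would be treated as the start of the next packet.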

5. Truncated unpacking based on offset adjustable length

A more awkward binary protocol has two headers, such as the one below

After unpacking, HDR1 and the length field are discarded, leaving only the second header and the packet body. In such a protocol HDR1 usually holds a magic number, meaning the application accepts only binary data that begins with that magic number. This is common in RPC

We can still do this by setting the netty parameters

1. The length field is preceded by the 1-byte HDR1, so lengthFieldOffset is 1.
2. The length field is 2 bytes long, so lengthFieldLength is 2.
3. The length field gives the length of the packet body and leaves out HDR2, yet Netty must also split off HDR2 as part of the frame. HDR2 is 1 byte long, so lengthAdjustment is 1.
4. initialBytesToStrip is 3, because the first three bytes (HDR1 and the length field) are cut off.

The resulting code is

new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 1, 2, 1, 3);
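As a rough check of these four values, the sketch below applies them by hand to one packet in plain Java. TwoHeaderSketch is a hypothetical name, the constants simply restate the parameters listed above, and the sample layout (1-byte HDR1, 2-byte length, 1-byte HDR2, body) is the one assumed in this case.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical walk-through of the two-header case; not a Netty class. */
final class TwoHeaderSketch {
    static final int LENGTH_FIELD_OFFSET = 1;    // skip HDR1
    static final int LENGTH_FIELD_LENGTH = 2;
    static final int LENGTH_ADJUSTMENT = 1;      // HDR2 sits between length field and body
    static final int INITIAL_BYTES_TO_STRIP = 3; // drop HDR1 and the length field

    /** Returns HDR2 plus body, or null while the packet is still incomplete. */
    static byte[] decode(byte[] received) {
        int lengthFieldEndOffset = LENGTH_FIELD_OFFSET + LENGTH_FIELD_LENGTH;
        if (received.length < lengthFieldEndOffset) {
            return null;
        }
        // Unsigned 16-bit big-endian length of the body only.
        int bodyLength = ByteBuffer.wrap(received).getShort(LENGTH_FIELD_OFFSET) & 0xFFFF;
        int frameLength = lengthFieldEndOffset + bodyLength + LENGTH_ADJUSTMENT;
        if (received.length < frameLength) {
            return null;
        }
        return Arrays.copyOfRange(received, INITIAL_BYTES_TO_STRIP, frameLength);
    }

    public static void main(String[] args) {
        byte[] packet = {(byte) 0xCA, 0x00, 0x03, (byte) 0xFE, 'a', 'b', 'c'};
        System.out.println(Arrays.toString(decode(packet)));
    }
}
```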

6. Truncated unpacking based on offset adjustable variable length

In all the previous cases the length field gives the length of the packet body without the headers. Suppose instead that the length field gives the length of the entire packet, as in the following case

The value of the length field is 16. The length field itself is 2 bytes, HDR1 is 1 byte, HDR2 is 1 byte and the packet body is 12 bytes: 1 + 2 + 1 + 12 = 16.

This is the same as the previous case except for the meaning of the length field. Netty knows nothing about the business protocol, so you have to tell it how many bytes after the length field make up a complete packet. Here that is clearly 13 bytes (HDR2 plus the body), while the length field says 16, so 3 must be subtracted to get the length needed for unpacking, and lengthAdjustment is -3
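The rule used here can be written down as a one-line formula. The sketch below is plain Java with hypothetical names (WholeLengthSketch, lengthAdjustment, frameLength) and assumes the 1 + 2 + 1 + 12 layout just described.

```java
/** Hypothetical helper for deriving lengthAdjustment; not a Netty class. */
final class WholeLengthSketch {
    /** lengthAdjustment = bytes that follow the length field - value stored in the length field. */
    static int lengthAdjustment(int bytesAfterLengthField, int lengthFieldValue) {
        return bytesAfterLengthField - lengthFieldValue;
    }

    /** Whole frame = bytes up to the end of the length field + length value + adjustment. */
    static int frameLength(int lengthFieldEndOffset, int lengthFieldValue, int adjustment) {
        return lengthFieldEndOffset + lengthFieldValue + adjustment;
    }

    public static void main(String[] args) {
        int hdr1 = 1, lengthField = 2, hdr2 = 1, body = 12;
        int lengthFieldValue = hdr1 + lengthField + hdr2 + body;      // whole packet
        int adjustment = lengthAdjustment(hdr2 + body, lengthFieldValue);
        System.out.println(adjustment);
        System.out.println(frameLength(hdr1 + lengthField, lengthFieldValue, adjustment));
    }
}
```

The second value printed should equal the size of the whole packet, which is a quick sanity check for any combination of offset, length and adjustment.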

These six cases are the six typical binary protocols given in the Netty source code, and they cover more than 90% of scenarios. If your protocol is length-based, consider not handling the bytes yourself: use LengthFieldBasedFrameDecoder directly, or inherit from it and make a few simple changes

The implementation of such a powerful unpacker is also very elegant, so let’s see how netty implements it

LengthFieldBasedFrameDecoder source analysis

The constructor

LengthFieldBasedFrameDecoder has several constructors; we only need to look at one

public LengthFieldBasedFrameDecoder(
        ByteOrder byteOrder, int maxFrameLength, int lengthFieldOffset, int lengthFieldLength,
        int lengthAdjustment, int initialBytesToStrip, boolean failFast) {
    // Omit parameter verification
    this.byteOrder = byteOrder;
    this.maxFrameLength = maxFrameLength;
    this.lengthFieldOffset = lengthFieldOffset;
    this.lengthFieldLength = lengthFieldLength;
    this.lengthAdjustment = lengthAdjustment;
    lengthFieldEndOffset = lengthFieldOffset + lengthFieldLength;
    this.initialBytesToStrip = initialBytesToStrip;
    this.failFast = failFast;
}

All the constructor does is store its arguments in fields. Most of them were explained earlier; the remaining ones are described below.

1. byteOrder is the byte order (big-endian or little-endian) used when reading the length field.
2. lengthFieldEndOffset is the offset, counted from the start of the packet, of the first byte after the length field.
3. failFast: if true, a TooLongFrameException is thrown as soon as the length field has been read and its value is found to exceed maxFrameLength. If false, the exception is thrown only after all the bytes announced by the length field have been read. The default is true. It is best left unchanged, otherwise memory overflow may result

Implement the unpacking abstraction

In Netty source code analysis: the mystery of the unpacker, we saw that a concrete unpacking protocol only needs to implement

void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) 

Here in holds the data that has not yet been unpacked, and each unpacked packet is added to the out list so that it is passed on down the pipeline

The first layer of the implementation is fairly simple

@Override
protected final void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception {
    Object decoded = decode(ctx, in);
    if (decoded != null) {
        out.add(decoded);
    }
}

The overloaded protected decode method does the actual unpacking. We look at this heavyweight method in three parts

Get frame length

1. Obtain the size of the package to be unpacked

// If the readable bytes do not reach the end of the length field, the length field cannot be read yet, so return without reading
if (in.readableBytes() < lengthFieldEndOffset) {
    return null;
}

// Get the actual byte offset of the length field
int actualLengthFieldOffset = in.readerIndex() + lengthFieldOffset;
// Get the actual unadjusted packet length
long frameLength = getUnadjustedFrameLength(in, actualLengthFieldOffset, lengthFieldLength, byteOrder);


// If the length is negative, skip the length field and throw an exception
if (frameLength < 0) {
    in.skipBytes(lengthFieldEndOffset);
    throw new CorruptedFrameException(
            "negative pre-adjustment length field: " + frameLength);
}

// Adjust the length of the package
frameLength += lengthAdjustment + lengthFieldEndOffset;


This code has an extension point, getUnadjustedFrameLength. You can override it if the value in your length field is not an ordinary integer type such as int or short

protected long getUnadjustedFrameLength(ByteBuf buf, int offset, int length, ByteOrder order) {
        buf = buf.order(order);
        long frameLength;
        switch (length) {
        case 1:
            frameLength = buf.getUnsignedByte(offset);
            break;
        case 2:
            frameLength = buf.getUnsignedShort(offset);
            break;
        case 3:
            frameLength = buf.getUnsignedMedium(offset);
            break;
        case 4:
            frameLength = buf.getUnsignedInt(offset);
            break;
        case 8:
            frameLength = buf.getLong(offset);
            break;
        default:
            throw new DecoderException(
                    "unsupported lengthFieldLength: " + lengthFieldLength + " (expected: 1, 2, 3, 4, or 8)");
        }
        return frameLength;
    }

For example, some quirky protocol may have a 4-byte length field whose content reads 0x1234 but is meant as decimal digits, so the length is decimal 1234. Overriding this method makes it possible to unpack with such an unusual length field
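The decoding such an override would need can be sketched in plain Java. BcdLengthSketch and decodeBcd are hypothetical names, and the encoding is assumed to be packed decimal (two decimal digits per byte). In a real subclass the same loop would presumably sit inside an overridden getUnadjustedFrameLength and read each byte with buf.getUnsignedByte(offset + i).

```java
/** Hypothetical decoder for a length field stored as packed decimal digits. */
final class BcdLengthSketch {
    /** Interprets each byte as two decimal digits, so 0x12 0x34 becomes 1234. */
    static long decodeBcd(byte[] data, int offset, int length) {
        long value = 0;
        for (int i = 0; i < length; i++) {
            int b = data[offset + i] & 0xFF;
            int high = b >>> 4;
            int low = b & 0x0F;
            if (high > 9 || low > 9) {
                throw new IllegalArgumentException("not a decimal digit in byte " + i);
            }
            value = value * 100 + high * 10 + low;
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] lengthField = {0x00, 0x00, 0x12, 0x34};
        System.out.println(decodeBcd(lengthField, 0, 4));
    }
}
```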

2. Length check

// If the adjusted frame length is smaller than the end offset of the length field, throw an exception
if (frameLength < lengthFieldEndOffset) {
    in.skipBytes(lengthFieldEndOffset);
    throw new CorruptedFrameException(
            "Adjusted frame length (" + frameLength + ") is less " +
            "than lengthFieldEndOffset: " + lengthFieldEndOffset);
}

// The packet length exceeds the maximum length, and the packet is discarded
if (frameLength > maxFrameLength) {
    long discard = frameLength - in.readableBytes();
    tooLongFrameLength = frameLength;

    if (discard < 0) {
        // The current number of readable bytes has reached frameLength, so skip frameLength directly. After discarding frameLength, there may be a valid packet behind it
        in.skipBytes((int) frameLength);
    } else {
        // The readable bytes do not yet add up to frameLength, so bytes that arrive later must be discarded as well
        discardingTooLongFrame = true;
        // bytesToDiscard records how many more bytes still have to be discarded
        bytesToDiscard = discard;
        in.skipBytes(in.readableBytes());
    }
    failIfNecessary(true);
    return null;
}


Finally, call failIfNecessary to determine if an exception needs to be thrown

private void failIfNecessary(boolean firstDetectionOfTooLongFrame) {
    // Reset the discard state when no more unread bytes need to be discarded
    if (bytesToDiscard == 0) {
        long tooLongFrameLength = this.tooLongFrameLength;
        this.tooLongFrameLength = 0;
        discardingTooLongFrame = false;
        // If fast fail is not set, or if fast fail is set and it is the first time a large package error is detected, throw an exception and let handler handle it
        if (!failFast ||
            failFast && firstDetectionOfTooLongFrame) {
            fail(tooLongFrameLength);
        }
    } else {
        // If fast fail is set and this is the first time the oversized frame is detected, throw an exception and let the handler deal with it
        if (failFast && firstDetectionOfTooLongFrame) {
            fail(tooLongFrameLength);
        }
    }
}

failFast defaults to true and firstDetectionOfTooLongFrame is true on this path, so an exception is thrown the first time an oversized packet is detected
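The branches of failIfNecessary reduce to a small truth table. The sketch below restates that condition as a pure function in plain Java. FailDecisionSketch and throwsNow are hypothetical names, and the method only mirrors the logic quoted above rather than calling Netty.

```java
/** Hypothetical restatement of when failIfNecessary ends up calling fail. */
final class FailDecisionSketch {
    static boolean throwsNow(boolean failFast, boolean firstDetection, long bytesToDiscard) {
        if (bytesToDiscard == 0) {
            // Everything has been discarded: slow-fail throws now, fast-fail only on first detection.
            return !failFast || firstDetection;
        }
        // Bytes are still pending: only fast-fail throws, and only on first detection.
        return failFast && firstDetection;
    }

    public static void main(String[] args) {
        boolean[] flags = {true, false};
        for (boolean failFast : flags) {
            for (boolean first : flags) {
                for (long pending : new long[] {0, 100}) {
                    System.out.println("failFast=" + failFast + " first=" + first
                            + " bytesToDiscard=" + pending + " -> "
                            + throwsNow(failFast, first, pending));
                }
            }
        }
    }
}
```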

Here is the code that throws the exception

private void fail(long frameLength) {
    if (frameLength > 0) {
        throw new TooLongFrameException(
                        "Adjusted frame length exceeds " + maxFrameLength +
                        ":" + frameLength + " - discarded");
    } else {
        throw new TooLongFrameException(
                        "Adjusted frame length exceeds " + maxFrameLength +
                        " - discarding");
    }
}

Discard mode processing

If you read this article alongside the source, you will notice a piece of code at the entry of LengthFieldBasedFrameDecoder's decode method that I left out of the analysis above. I placed it in this section because it follows on from the previous one, which makes the discard-mode handling easier to read

if (discardingTooLongFrame) {
    long bytesToDiscard = this.bytesToDiscard;
    int localBytesToDiscard = (int) Math.min(bytesToDiscard, in.readableBytes());
    in.skipBytes(localBytesToDiscard);
    bytesToDiscard -= localBytesToDiscard;
    this.bytesToDiscard = bytesToDiscard;

    failIfNecessary(false);
}

As shown above, in discard mode the decoder works out how many bytes to drop, which is the smaller of the bytes still to be discarded and the bytes currently readable. After dropping them it calls failIfNecessary, which by default throws nothing further because the exception was already thrown when the oversized packet was first detected. If failFast is set to false, the exception is thrown at this point instead, once everything has been discarded. Readers can trace the details themselves
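How the counter drains over several reads can be followed with a small sketch in plain Java. DiscardSketch and its methods are hypothetical names. The class keeps only the bytesToDiscard bookkeeping from the snippet above and takes the number of readable bytes as a plain int instead of a ByteBuf.

```java
/** Hypothetical model of the discard-mode counter; not a Netty class. */
final class DiscardSketch {
    private long bytesToDiscard;

    DiscardSketch(long bytesToDiscard) {
        this.bytesToDiscard = bytesToDiscard;
    }

    /** Skips as much as possible of one chunk and returns how many bytes were skipped. */
    int onChunk(int readableBytes) {
        int localBytesToDiscard = (int) Math.min(bytesToDiscard, readableBytes);
        bytesToDiscard -= localBytesToDiscard;
        return localBytesToDiscard;
    }

    /** True while part of the oversized packet has still not arrived. */
    boolean discarding() {
        return bytesToDiscard > 0;
    }

    public static void main(String[] args) {
        DiscardSketch state = new DiscardSketch(10);
        for (int chunk : new int[] {4, 4, 4}) {
            System.out.println("skipped " + state.onChunk(chunk)
                    + ", still discarding: " + state.discarding());
        }
    }
}
```

Taking the minimum matters on the last chunk: only the tail of the oversized packet is skipped, and whatever follows in the same chunk is left for normal decoding.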

Skip the specified length of bytes

Once discard-mode handling and the length check have passed, the next step is to skip the specified number of bytes

int frameLengthInt = (int) frameLength;
if (in.readableBytes() < frameLengthInt) {
    return null;
}

if (initialBytesToStrip > frameLengthInt) {
    in.skipBytes(frameLengthInt);
    throw new CorruptedFrameException(
            "Adjusted frame length (" + frameLength + ") is less " +
            "than initialBytesToStrip: " + initialBytesToStrip);
}
in.skipBytes(initialBytesToStrip);

First check whether enough bytes have arrived for a whole frame; if not, return and wait for more. Then, if initialBytesToStrip (see the start of this article) is larger than the frame length, a CorruptedFrameException is thrown; otherwise that many bytes are skipped

Extracting the frame

int readerIndex = in.readerIndex();
int actualFrameLength = frameLengthInt - initialBytesToStrip;
ByteBuf frame = extractFrame(ctx, in, readerIndex, actualFrameLength);
in.readerIndex(readerIndex + actualFrameLength);

return frame;

Extracting the packet at the end is very simple: take the read pointer of the accumulated data, compute the actual length of the packet to extract, extract it, and then move the read pointer forward

protected ByteBuf extractFrame(ChannelHandlerContext ctx, ByteBuf buffer, int index, int length) {
    return buffer.retainedSlice(index, length);
}

Extraction is a simple call to ByteBuf's retainedSlice API, which involves no memory copy
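What "no memory copy" means can be illustrated with the JDK's own ByteBuffer, used here only as an analogy. SliceSketch and view are hypothetical names. Unlike retainedSlice, a ByteBuffer slice has no reference count, so this shows only the shared-memory aspect.

```java
import java.nio.ByteBuffer;

/** Hypothetical analogy using java.nio.ByteBuffer; Netty's ByteBuf is not involved. */
final class SliceSketch {
    /** Returns a view of length bytes starting at index that shares memory with buffer. */
    static ByteBuffer view(ByteBuffer buffer, int index, int length) {
        ByteBuffer duplicate = buffer.duplicate(); // own position and limit, same content
        duplicate.limit(index + length);
        duplicate.position(index);
        return duplicate.slice();
    }

    public static void main(String[] args) {
        ByteBuffer accumulated = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5});
        ByteBuffer frame = view(accumulated, 1, 3);
        accumulated.put(1, (byte) 9);     // write through the original buffer
        System.out.println(frame.get(0)); // the view sees the change, so nothing was copied
    }
}
```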

The length passed to the actual extraction is of type int, so in a custom protocol with an 8-byte length field the upper four bytes are essentially useless.

Conclusion

1. If you use Netty and your binary protocol is length-based, consider LengthFieldBasedFrameDecoder. Adjusting its parameters will usually meet your needs.
2. Unpacking in LengthFieldBasedFrameDecoder consists of validating the parameters, handling abnormal packets, and finally calling ByteBuf's retainedSlice to split off the frame without copying memory.
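The effect of the int conversion on its own can be seen in a few lines of plain Java. IntCastSketch and toInt are hypothetical names. In the decoder itself, maxFrameLength is an int and is checked before the cast, so a value with non-zero upper bytes would already have been rejected as too long.

```java
/** Hypothetical demonstration of narrowing a long frame length to int. */
final class IntCastSketch {
    /** Keeps only the lower four bytes, as the cast to int does. */
    static int toInt(long frameLength) {
        return (int) frameLength;
    }

    public static void main(String[] args) {
        long eightByteLength = 0x0000000100000010L; // upper four bytes are not zero
        System.out.println(eightByteLength);
        System.out.println(toInt(eightByteLength));
    }
}
```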

If you want to learn Netty systematically, my booklet Netty In Action: Creating a Wechat IM Instant Messaging System can help

If you want to study the principles behind Netty systematically, do not miss my Netty source code analysis video series: Java read source code in-depth Analysis of Netty