Preface

Because of changing requirements, we need to migrate the logs we currently send to Aliyun to our internal log platform, which uses Protocol Buffers (PB) as its wire format.

Previously all logs were reported to Aliyun, where Logtail recorded the requesting user's IP address automatically.

The RESTful interface provided by the internal log platform only forwards data. That is, the front end encodes the log with PB -> sends it to the RESTful relay server -> the relay forwards it to the data platform over UDP.

In other words, the RESTful relay server has to record the requesting user's IP address automatically. The problem is that the relay is generic: it is not responsible for the JSON -> PB conversion, which the front end must do. And because the log is already encoded by the time it reaches the relay, the relay cannot simply edit it as structured data.

This makes it very difficult for the relay server to add the IP address for us.

After some thought, I came up with three options:

  1. The front end requests a third-party resource to obtain its IP address
  2. The relay server does the PB encoding
  3. The relay server modifies the encoded PB message

The first option is ruled out right away: CDNs and similar intermediaries have to be taken into account, and too many factors are outside our control.

The second option is not feasible either. As mentioned above, the relay is a generic service; if it did the encoding, every PB message would have to be encoded on the relay server from then on, which would be relatively inefficient.

That leaves the third option.

Reference material

Before solving this properly, note that the IP address is a string field, so we first need to know how Protocol Buffers encodes a string.

Here I rely on a third-party description: "Protocol Buffer encoding principle – string".

To make it easier to read, I have posted a screenshot of that description:

[Screenshot of the string encoding description]

Start processing

Now that we know how the encoding works, we can write the code.

Now let’s look at our original JSON format:

{
  "level": "info",
  "message": "test",
  "lts": 1622078077630,
  "clientIP": "__inject-ip__"
}

Protocol Buffers format:

syntax = "proto3";

message Log {
  int64 lts = 1;       // Timestamp
  string level = 2;    // Log level
  string message = 3;  // Log body
  string clientIP = 4; // IP
}

__inject-ip__ is used as a placeholder that tells the relay server where to make the change.

After encoding with Protocol Buffers, the JSON object above becomes:

08 be f5 8c db 9a 2f 22 04 69 6e 66 6f 2a 04 74 65 73 74 3a 0d 5f 5f 69 6e 6a 65 63 74 2d 69 70 5f 5f

In this output, 5f 5f 69 6e 6a 65 63 74 2d 69 70 5f 5f is the hexadecimal encoding of __inject-ip__.

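According to the description referenced earlier, the two bytes 3a 0d in front of it are bookkeeping information required by Protocol Buffers:

  1. 3a is the tag: it encodes the field number and the wire type of the current field
  2. 0d is the length of the current value (13 bytes, the length of __inject-ip__)

Note that 3a decodes to field number 7 with wire type 2, so this sample appears to have been produced from a schema whose field numbers differ from the simplified one shown earlier. The technique is the same either way.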
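The two bytes can be decoded by hand. The sketch below splits a single-byte tag into its parts; splitTag is a hypothetical helper name, and it only handles tags that fit in one byte (field numbers 1 through 15).

```go
package main

import "fmt"

// splitTag splits a single-byte protobuf tag into field number and wire type.
// The low three bits are the wire type; the remaining bits are the field number.
func splitTag(tag byte) (fieldNumber int, wireType int) {
	return int(tag >> 3), int(tag & 0x07)
}

func main() {
	field, wire := splitTag(0x3a)
	fmt.Println(field, wire)          // 7 2 (wire type 2 = length-delimited)
	fmt.Println(0x0d)                 // 13
	fmt.Println(len("__inject-ip__")) // 13
}
```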

So we only care about the length byte 0d and the placeholder bytes 5f 5f 69 6e 6a 65 63 74 2d 69 70 5f 5f.

Now suppose the relay server sees the user's IP as 127.0.0.1.

Then we change 0d to the length of 127.0.0.1 in hexadecimal, which is 09, and replace 5f 5f 69 6e 6a 65 63 74 2d 69 70 5f 5f with the bytes of 127.0.0.1, which are 31 32 37 2e 30 2e 30 2e 31.
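Both replacement values can be produced with fmt alone. A small sketch, with replacementHex as a hypothetical helper name:

```go
package main

import "fmt"

// replacementHex returns the new length byte and the new value bytes as
// lowercase hex, for an IP string shorter than 128 bytes.
func replacementHex(ip string) (length string, value string) {
	return fmt.Sprintf("%02x", len(ip)), fmt.Sprintf("% x", ip)
}

func main() {
	length, value := replacementHex("127.0.0.1")
	fmt.Println(length) // 09
	fmt.Println(value)  // 31 32 37 2e 30 2e 30 2e 31
}
```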

Now that the principles and details are clear, we can write code to do this:

package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"strings"
)

func main() {
	// Base64 representation of the PB buffer (bytes => base64).
	// Note: in this sample the lts varint bytes differ from the hex dump shown
	// earlier (C2 BE C3 B5 ... instead of BE F5 ...); the clientIP field at the
	// end is identical.
	payloadBase64 := "CMK+w7XCjMObwpovIgRpbmZvKgR0ZXN0Og1fX2luamVjdC1pcF9f"

	// Decode the base64 string into bytes
	payload, err := base64.StdEncoding.DecodeString(payloadBase64)
	if err != nil {
		panic(err)
	}

	// Bytes to hexadecimal string
	// 08C2BEC3B5C28CC39BC29A2F2204696E666F2A04746573743A0D5F5F696E6A6563742D69705F5F
	payloadBinaryStr := fmt.Sprintf("%X", payload)

	// The __inject-ip__ placeholder in hexadecimal
	ipPlaceholder := "5F5F696E6A6563742D69705F5F"

	// Index of the placeholder within the hex string
	placeholderIndex := strings.Index(payloadBinaryStr, ipPlaceholder)
	if placeholderIndex < 2 {
		panic("placeholder not found")
	}

	// The user's IP address
	clientIP := "127.0.0.1"

	// clientIP in hexadecimal
	clientIPBinaryStr := fmt.Sprintf("%X", clientIP)

	// IP length in hexadecimal, used to overwrite the clientIP length byte in the PB message.
	// The length prefix is a varint, which fits in one byte for values below 128.
	// An IP address string is far shorter than that (at most 15 characters for IPv4,
	// e.g. 255.255.255.255), so two hex digits are always enough.
	// %02X pads with a leading zero when the value has only one hex digit.
	clientIPLen := fmt.Sprintf("%02X", len(clientIP))

	// Everything before the length byte (2 hex digits) and everything after the placeholder
	payloadBinaryStrPrefix := payloadBinaryStr[:placeholderIndex-2]
	payloadBinaryStrSuffix := payloadBinaryStr[placeholderIndex+len(ipPlaceholder):]
	payloadBinaryStrNewContent := clientIPLen + clientIPBinaryStr

	// 08C2BEC3B5C28CC39BC29A2F2204696E666F2A04746573743A093132372E302E302E31
	// You can see that the previous 0D has been replaced with 09
	payloadBinaryStr = payloadBinaryStrPrefix + payloadBinaryStrNewContent + payloadBinaryStrSuffix

	// Hexadecimal string back to bytes
	payloadBinaryByte, err := hex.DecodeString(payloadBinaryStr)
	if err != nil {
		panic(err)
	}

	newPayloadBase64 := base64.StdEncoding.EncodeToString(payloadBinaryByte)

	// CMK+w7XCjMObwpovIgRpbmZvKgR0ZXN0OgkxMjcuMC4wLjE=
	fmt.Println(newPayloadBase64)
}

As a check, change __inject-ip__ to 127.0.0.1 in the original JSON and encode it with Protocol Buffers again: the result matches the patched payload, which confirms that the approach works.