The “Ziefeng Chen” WeChat technology account is now open. Search for it and follow it to get technical articles as soon as they are published.
Preface
Compared with the traditional Internet of the PC era, the mobile Internet benefits from the portability of mobile devices and has reached every aspect of people's life and work in just a few years. Although communication technology and hardware are constantly upgraded, battery power and data traffic are still scarce resources on mobile devices.
Developers who have adapted apps to new Android versions will have noticed that every recent release moves further towards saving power, traffic and memory, for example:
- Android 6.0 introduced Doze (low power mode) and App Standby
- Android 7.0 introduced Doze on the Go
- Android 8.0 introduced background execution limits
- Android 9.0 introduced App Standby Buckets
Network requests are one of the main sources of power consumption in mobile applications. In addition to sending and receiving the packets themselves, turning on the radio and keeping it awake also consumes extra power. For scenarios with frequent network interaction, such as instant messaging, the size of the transmitted data must therefore be treated as an optimization target: we need to remove redundant data as far as possible and improve transmission efficiency in order to reduce the power and traffic consumed.
Compared with more readable text data, binary data has less redundancy and a more compact layout, so it is smaller and faster to transmit. However, using a custom binary protocol means defining your own data structures and writing your own serialization and deserialization, and version compatibility also becomes an issue. Considering the time and engineering cost, we decided to use Protobuf to do this part of the work for us.
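To make the size difference concrete, here is a minimal Java sketch that writes the same message once as JSON text and once with a hand-rolled binary layout. The class name, the sample values and the binary layout are assumptions made for illustration only; they are not the format used in this project.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class TextVsBinary {
    // Hypothetical sample values, chosen only for illustration.
    static final long MESSAGE_ID = 1001L;
    static final String SENDER_ID = "alice";
    static final String CONTENT = "hello";

    /** Encodes the sample message as JSON text and returns its UTF-8 bytes. */
    public static byte[] asJson() {
        String json = "{\"message_id\":" + MESSAGE_ID
                + ",\"sender_id\":\"" + SENDER_ID
                + "\",\"content\":\"" + CONTENT + "\"}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    /** Encodes the same values with an ad-hoc layout: 8-byte id, then two length-prefixed strings. */
    public static byte[] asBinary() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(MESSAGE_ID); // fixed 8 bytes
            out.writeUTF(SENDER_ID);   // 2-byte length prefix + string bytes
            out.writeUTF(CONTENT);     // 2-byte length prefix + string bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("JSON bytes:   " + asJson().length);
        System.out.println("Binary bytes: " + asBinary().length);
    }
}
```

The binary form is smaller because the field names never travel over the wire, but both sides now have to agree on the field order by hand, which is exactly the maintenance cost described above.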
What is Protobuf?
Protobuf (Protocol Buffers) is a cross-language, cross-platform, extensible mechanism for serializing structured data, developed by Google. It is lighter, faster and simpler than data transfer formats such as XML and JSON. We only need to define the data structure in a .proto file, and the Protobuf compiler can then generate data access classes for various platforms and languages. These classes make it easy to write and read structured data in all kinds of data streams, which suits data storage and network communication scenarios particularly well.
To sum up:
Advantages:
- Data size: Varint and ZigZag encoding plus T-L-V (Tag-Length-Value) storage keep the data compact
- Parsing efficiency: data is encoded and parsed automatically in an efficient binary format
- Versatility: cross-language and cross-platform
- Ease of use: data access classes are generated automatically by the Protobuf compiler
- Extensibility: the format can be extended as versions iterate
- Compatibility: backward compatible with data encoded in older formats
- Maintainability: all platforms only need to maintain one shared .proto file
Disadvantages:
- Poor readability: the data is hard to understand without the .proto file
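The Varint and ZigZag encodings mentioned in the list can be sketched in a few lines of plain Java. This is a hand-written illustration of the two techniques, not the Protobuf library's own code; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

public class VarintSketch {
    /** Base-128 varint: 7 payload bits per byte, high bit set while more bytes follow. */
    public static byte[] encodeVarint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // low 7 bits plus continuation bit
            value >>>= 7;                             // unsigned shift to the next group
        }
        out.write((int) value);                       // last group, continuation bit clear
        return out.toByteArray();
    }

    /** ZigZag maps signed values to unsigned ones so that small negatives stay small. */
    public static long zigZag(long n) {
        return (n << 1) ^ (n >> 63);
    }

    public static void main(String[] args) {
        System.out.println(encodeVarint(1L).length);          // small values need one byte
        System.out.println(encodeVarint(300L).length);        // 300 needs two bytes
        System.out.println(encodeVarint(-1L).length);         // a raw negative long needs ten bytes
        System.out.println(encodeVarint(zigZag(-1L)).length); // ZigZag first brings it back to one
    }
}
```

Small numbers are far more common than large ones in most messages, which is why spending one byte on them instead of a fixed eight pays off.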
Since we are choosing a data transmission format, a comparison with the alternatives is unavoidable. The formats we most commonly use to talk to the server are XML and JSON.
- XML
XML (Extensible Markup Language) is a text-based data format whose main syntax rule is tags that begin with “<” and end with “>”. XML was designed to describe documents, but it is also widely used to represent arbitrary data structures.
Advantages:
- Good readability
- Good extensibility
Disadvantages:
- Parsing is expensive, and encoding/decoding can impose a significant performance penalty on the application
- Large footprint and a low proportion of useful data in what is transferred (lots of tags)
If you are an Android developer, you are probably familiar with Android's lightweight persistence solution, SharedPreferences, which is implemented mainly on top of XML. However, Android now officially recommends DataStore as a replacement for SharedPreferences, and DataStore is implemented mainly on top of ProtoBuf.
- JSON
JSON (JavaScript Object Notation) is an open standard file format and data interchange format that stores and transfers data objects consisting of attribute-value pairs and arrays in text form. It is commonly used for communicating with servers.
Advantages:
In addition to sharing the advantages of XML, JSON does not require XML's strict closing tags, so a higher proportion of the transferred data is useful and less bandwidth is consumed.
ProtoBuf implementation
Add the ProtoBuf dependencies via Gradle
- Project-level build.gradle file:
dependencies {
    ...
    // Protobuf
    classpath 'com.google.protobuf:protobuf-gradle-plugin:0.8.8'
}
- Module-level build.gradle file:
apply plugin: 'com.google.protobuf'

android {
    sourceSets {
        main {
            // Define the proto directory
            proto {
                srcDir 'src/main/proto'
            }
        }
    }
}

dependencies {
    def PROTOBUF_VERSION = "3.0.0"
    api "com.google.protobuf:protobuf-java:${PROTOBUF_VERSION}"
    api "com.google.protobuf:protoc:${PROTOBUF_VERSION}"
}

protobuf {
    protoc {
        artifact = 'com.google.protobuf:protoc:3.2.0'
    }
    plugins {
        javalite {
            artifact = 'com.google.protobuf:protoc-gen-javalite:3.0.0'
        }
    }
    generateProtoTasks {
        all().each { task ->
            task.plugins {
                javalite { }
            }
        }
    }
}
Define the data structure of the message to be stored in the .proto file
First, we need to create a new message_to.proto file under {module}/src/main/proto to define the data structure of the object we want to store.
Before we define the data structure, let's think about which fields a basic instant messaging message should contain, using the letters we send and receive in daily life as an analogy:
- The content of the letter is what we care about most: content
- Who sent the letter, and was it addressed to me or to someone else? sender_id, target_id
- To find a letter quickly, we need a unique value: message_id
- What kind of letter is it, a credit card statement or a love letter? message_type
- If there are several letters, we need to know their order in time so we can read them smoothly: timestamp
The following is the final definition of message_to.proto. Let's go through the file step by step:
syntax = "proto3";

option java_package = "com.madchan.imsdk.lib.objects.bean.dto";
option java_outer_classname = "MessageDTO";

message Message {
    enum MessageType {
        MESSAGE_TYPE_UNSPECIFIED = 0; // Unknown type
        MESSAGE_TYPE_TEXT = 1;        // Text message
    }
    // Unique message value
    uint64 message_id = 1;
    // Message type
    MessageType message_type = 2;
    // Message sender
    string sender_id = 3;
    // Message target user
    string target_id = 4;
    // Message timestamp
    uint64 timestamp = 5;
    // Message content
    bytes content = 6;
}
Declare the syntax version
syntax = "proto3";
The first line of the file states that we are using proto3 syntax. Without it, the ProtoBuf compiler assumes that we are using proto2. The declaration must be the first non-empty, non-comment line of the file.
Specify file options
option java_package = "com.madchan.imsdk.lib.objects.bean.dto";
java_package specifies the package of the Java classes we want to generate.
option java_outer_classname = "MessageDTO";
java_outer_classname specifies the class name of the Java wrapper class we want to generate. If it is not specified, the class name is derived by converting the .proto file name to camel case.
There is also a java_multiple_files option. When it is true, each data structure declared in a .proto file is generated as its own .java file. With the default value of false, only one .java file is generated and the data structures become nested classes of the wrapper class.
Specify the field type
// Unique message value
uint64 message_id = 1;
As you may have noticed, message_id and timestamp are declared as uint64. This is an unsigned 64-bit integer, that is, a non-negative integer, and it is represented as a long in the generated Java class.
Numbers in computers are represented in binary, and the leftmost bit is usually used to indicate whether the number is positive or negative; such a number is a signed integer. If the leftmost bit is not used as a sign but counts as part of the value, the number cannot be negative and can only represent zero or a positive value; that is an unsigned integer.
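Java has no unsigned 64-bit primitive, so a uint64 field ends up in a signed long and the leftmost bit has to be reinterpreted when the value is displayed or compared. A minimal sketch using the JDK's unsigned helpers; the class and method names are hypothetical:

```java
public class UnsignedSketch {
    /** Formats a 64-bit pattern as an unsigned decimal, the way a uint64 value is meant to be read. */
    public static String asUnsigned(long bits) {
        return Long.toUnsignedString(bits);
    }

    /** Returns true when a is larger than b if both are read as unsigned values. */
    public static boolean unsignedGreater(long a, long b) {
        return Long.compareUnsigned(a, b) > 0;
    }

    public static void main(String[] args) {
        long allOnes = -1L; // every bit set, including the leftmost one
        System.out.println(allOnes);                      // signed view of the bits
        System.out.println(asUnsigned(allOnes));          // unsigned view of the same bits
        System.out.println(unsignedGreater(allOnes, 1L)); // unsigned comparison
    }
}
```

For values such as message IDs and millisecond timestamps the leftmost bit is practically never set, so treating the long as an ordinary positive number is usually safe.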
enum MessageType {
    MESSAGE_TYPE_UNSPECIFIED = 0; // Unknown type
    MESSAGE_TYPE_TEXT = 1;        // Text message
}
// Message type
MessageType message_type = 2;
The value of a message type normally falls within a predefined set, which is exactly what an enumeration is for, so we describe the message type with an enum. Here we simply define an unknown type and a text message type.
Note that every enum definition must have a constant mapped to zero as its first element, which serves as the default value.
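One practical use of the zero constant is as a safe fallback when a number cannot be mapped to a known type. The sketch below uses a hand-written enum as a hypothetical stand-in for the generated MessageType; the fallback policy in fromNumberOrDefault is an assumption for illustration, not the behavior of the generated code.

```java
public class EnumDefaultSketch {
    /** Hypothetical stand-in for the generated MessageType enum. */
    public enum MessageType {
        MESSAGE_TYPE_UNSPECIFIED(0),
        MESSAGE_TYPE_TEXT(1);

        final int number;

        MessageType(int number) {
            this.number = number;
        }

        /** Returns the constant for a number, or the zero constant when the number is unknown. */
        public static MessageType fromNumberOrDefault(int number) {
            for (MessageType type : values()) {
                if (type.number == number) {
                    return type;
                }
            }
            return MESSAGE_TYPE_UNSPECIFIED;
        }
    }

    public static void main(String[] args) {
        System.out.println(MessageType.fromNumberOrDefault(1));  // a known number
        System.out.println(MessageType.fromNumberOrDefault(42)); // an unknown number falls back
    }
}
```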
For the other data types, please refer to the table at the link below, which lists the data types supported in .proto files together with the corresponding types in the classes generated for each language.
developers.google.com/protocol-bu…
Assign field numbers
You may wonder what the number at the end of each field means. It is the field's unique number, which identifies the field in the binary format of the message and must not be changed once it is in use.
If we want to remove a field during a release iteration, we must make sure that nobody reuses its field number later, because that can cause serious problems such as data corruption. To prevent this, we reserve the field number or name of the deleted field with the reserved keyword, as in the example below. If anyone later tries to use these field numbers or names, the ProtoBuf compiler will report an error.
message Message {
    reserved 3, 4 to 6;
    reserved "sender_id", "target_id";
}
Another thing to know is that field numbers in the range 1 to 15 take one byte to encode (field number and field type together), whereas field numbers in the range 16 to 2047 take two bytes. For this reason, fields that occur frequently (that is, required fields) should be given numbers from 1 to 15, and optional fields numbers from 16 to 2047.
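The one-byte and two-byte boundaries follow from how the field key is built: the field number is shifted left by three bits, combined with a three-bit wire type, and the result is written as a varint with seven payload bits per byte. A minimal sketch of that arithmetic in Java; the class and method names are hypothetical:

```java
public class FieldTagSketch {
    /** Wire type 0 is the varint wire type, used here only as an example. */
    static final int WIRE_TYPE_VARINT = 0;

    /** A field key is (fieldNumber << 3) | wireType. */
    public static int tagValue(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    /** Number of bytes the varint encoding of a non-negative value occupies. */
    public static int varintSize(int value) {
        int size = 1;
        while ((value >>>= 7) != 0) {
            size++;
        }
        return size;
    }

    /** Size in bytes of the encoded key for the given field number. */
    public static int tagSize(int fieldNumber) {
        return varintSize(tagValue(fieldNumber, WIRE_TYPE_VARINT));
    }

    public static void main(String[] args) {
        for (int fieldNumber : new int[] {1, 15, 16, 2047, 2048}) {
            System.out.println("field " + fieldNumber + " -> " + tagSize(fieldNumber) + " byte(s)");
        }
    }
}
```

Field 15 gives a key of 120, which still fits in seven bits, while field 16 gives 128 and spills into a second byte.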
Add comments
We can also add comments to .proto files; both the // and /* ... */ syntaxes are supported. The comments are also carried over into the classes generated for the corresponding language.
Automatically generate a Java class using the ProtoBuf compiler
Once everything is in place, we can simply rebuild the project. The ProtoBuf compiler reads the message defined in the .proto file and generates the corresponding Java class files under {module}/build/generated/source/proto/debug/javalite, in the path that matches the package name. After that, the class files only need to be copied to the src/main/java directory, and a Gradle task can do this for us:
// Whether to generate DTO classes from the proto files
def enableGenerateProto = true
// def enableGenerateProto = false

project.tasks.whenTaskAdded { Task task ->
    if (task.name == 'generateDebugProto') {
        task.enabled = enableGenerateProto
        if (task.enabled) {
            task.doLast {
                // Copy the DTO classes from the build directory to the src directory
                copy {
                    from 'build/generated/source/proto/debug/javalite'
                    into 'src/main/java'
                }
                // Delete the DTO classes from the build directory
                FileTree tree = fileTree("build/generated/source/proto/debug/javalite")
                tree.each { file ->
                    delete file
                }
            }
        }
    }
}
Reading the generated MessageDTO.java file, you can see that the Protobuf compiler generates a Java class for each defined data structure, provides setter and getter methods for accessing each field in the class, and provides a Builder class for creating instances of it.
Write and read messages using the Java-based ProtoBuf API
We start by mirroring the message data structure defined above in MessageVo.kt, keeping the fields of the two entity classes consistent. Why we do this instead of sharing MessageDTO.java directly will be explained in the next article.
import android.os.Parcel
import android.os.Parcelable

data class MessageVo(
    var messageId: Long,
    var messageType: Int,
    var sendId: String,
    var targetId: String,
    var timestamp: Long,
    var content: String
) : Parcelable {

    // Read the fields in the same order in which writeToParcel writes them
    constructor(parcel: Parcel) : this(
        parcel.readLong(),
        parcel.readInt(),
        parcel.readString() ?: "",
        parcel.readString() ?: "",
        parcel.readLong(),
        parcel.readString() ?: ""
    )

    override fun writeToParcel(parcel: Parcel, flags: Int) {
        parcel.writeLong(messageId)
        parcel.writeInt(messageType)
        parcel.writeString(sendId)
        parcel.writeString(targetId)
        parcel.writeLong(timestamp)
        parcel.writeString(content)
    }

    override fun describeContents(): Int {
        return 0
    }

    companion object CREATOR : Parcelable.Creator<MessageVo> {
        override fun createFromParcel(parcel: Parcel): MessageVo {
            return MessageVo(parcel)
        }

        override fun newArray(size: Int): Array<MessageVo?> {
            return arrayOfNulls(size)
        }
    }
}
Now there are two things we need to do:
- Convert the MessageVo object of the view layer into the MessageDTO object of the data transport layer, and serialize it into binary data for sending.
- Receive a message as binary data, deserialize it into a MessageDTO object, and convert that MessageDTO object of the data transport layer into a MessageVo object of the view layer.
We encapsulate this work in the EnvelopeHelper class:
import com.google.protobuf.ByteString

class EnvelopeHelper {
    companion object {
        /**
         * Stuff operation (VO -> DTO)
         * @param envelope the envelope, which contains the message view object
         */
        fun stuff(envelope: Envelope): MessageDTO.Message? {
            envelope?.messageVo?.apply {
                return MessageDTO.Message.newBuilder()
                    .setMessageId(messageId)
                    .setMessageType(MessageDTO.Message.MessageType.forNumber(messageType))
                    .setSenderId(sendId)
                    .setTargetId(targetId)
                    .setTimestamp(timestamp)
                    .setContent(ByteString.copyFromUtf8(content))
                    .build()
            }
            return null
        }

        /**
         * Extract operation (DTO -> VO)
         * @param messageDTO the message data transfer object
         */
        fun extract(messageDTO: MessageDTO.Message): Envelope? {
            messageDTO?.apply {
                val envelope = Envelope()
                val messageVo = MessageVo(
                    messageId = messageId,
                    messageType = messageType.number,
                    sendId = senderId,
                    targetId = targetId,
                    timestamp = timestamp,
                    // Decode as UTF-8 to mirror copyFromUtf8 in stuff()
                    content = content.toStringUtf8()
                )
                envelope.messageVo = messageVo
                return envelope
            }
            return null
        }
    }
}
Serialization and deserialization of the transmitted messages are then completed by calling these helpers at the two key points where messages are sent and received:
MessageAccessService.kt:
private val messageCarrier: IBinder = object : MessageCarrier.Stub() {
    override fun sendMessage(envelope: Envelope) {
        Log.d(TAG, "Send a message: " + envelope.messageVo?.content)
        // VO -> DTO, then serialize the DTO to bytes and send it
        val messageDTO = EnvelopeHelper.stuff(envelope)
        messageDTO?.let { WebSocketConnection.send(ByteString.of(*it.toByteArray())) }
        ...
    }
    ...
}
WebSocketConnection.kt:
/**
 * @param webSocket
 * @param bytes
 */
override fun onMessage(webSocket: WebSocket, bytes: ByteString) {
    super.onMessage(webSocket, bytes)
    ...
    // Deserialize the bytes to a DTO, then DTO -> VO
    val messageDTO = MessageDTO.Message.parseFrom(bytes.toByteArray())
    val envelope = EnvelopeHelper.extract(messageDTO)
    Log.d(MessageAccessService.TAG, "Received a message: " + envelope?.messageVo?.content)
    ...
}
Preview of the next article
In this article we left one question open: why split MessageVo and MessageDTO into two entity objects? This relates to domain-driven design (DDD): to keep the layers decoupled from each other, each layer should use its own data model.
However, mapping the fields one by one with get/set calls is tedious and error-prone, so in the next article we will introduce the MapStruct library to automate this part of the work.