This is the second in a series on game development. If you’re interested, stay tuned for more updates. Other articles are listed below:
Game development — Protocol design
Game development – Protocol – Protobuf
Game development – Protocol – Protobuf in detail
WHAT
Introduction
The official documentation introduces it like this:
Protocol buffers are a language-neutral, platform-neutral extensible mechanism for serializing structured data.
Protocol Buffers is a cross-language, cross-platform, and extensible format for serializing structured data.
In a nutshell, Protocol Buffers is a structured data format defined by Google for serializing and deserializing data. Because it uses a compact binary encoding, it is smaller, faster, and simpler than XML. It is also language- and platform-independent, which gives it good compatibility. It is well suited to data storage and to exchanging data over the network.
The official implementation currently supports 10 languages: C++, Java, Python, Objective-C, C#, JavaNano, JavaScript, Ruby, Go, and PHP, which covers most mainstream languages. There are also unofficial implementations for other languages (such as Lua), which work by adding a parsing library; if you have special needs, refer to the documentation of the respective project. The currently supported languages are as follows:
Language | Source |
---|---|
C++ (include C++ runtime and protoc) | src |
Java | java |
Python | python |
Objective-C | objectivec |
C# | csharp |
JavaNano | javanano |
JavaScript | js |
Ruby | ruby |
Go | golang/protobuf |
PHP |
Performance:
The official claims are strong. How good is it in practice? Let's look at a performance comparison.
The results above are based on the Full Object Graph Serializers test, which covers creating objects, serializing them into an in-memory sequence of bytes, and then deserializing them. Figure 1 shows the total time (serialization + deserialization) and Figure 2 shows the compressed size. Protocol Buffers has a clear advantage in both serialization speed and data size. The detailed test data can be found here.
HOW
The official guide gives a detailed introduction to using the package. Here we take the package apart based on the official demo to understand its serialization process and source structure, so as to get a general understanding of the whole mechanism (the examples below use Java).
demo
The demo assumes that you already have the compiler for your platform (it generates the target language's code from .proto files). If you don't, refer to the C++ runtime and protoc instructions on the official website to build it. For Windows, you can also download one here.
Step 1: Add the Maven dependency
<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java</artifactId>
  <version>3.2.0</version>
</dependency>
Step 2: Define the .proto file
syntax = "proto3";
package msg;
option java_package = "com.example.msg";
option java_outer_classname = "LoginMsg";
message Login {
  string useranme = 1;
  int32 pwd = 2;
}
Supported data types:
See the official documentation for the full list.
Step 3: Generate the code with the compiler
# --java_out is the output directory for the generated Java code; after a space comes the path of the .proto file. To generate from multiple directories, -I is available.
protoc --java_out=java resources/protoc/login.proto
Final generated files and directories:
Reader&Writer
The generated LoginMsg class (LoginMsg.java) already integrates serialization and deserialization of the Login message, so we only need to read and write the message through this class. To write a Login message to a stream and send it, assign values to its fields and call the write method; the object is serialized to binary data and written out. On the receiving side, calling the parse method deserializes a Login object from the binary stream.
Write:
public void write() throws Exception {
    // Build the Login message object
    LoginMsg.Login.Builder builder = LoginMsg.Login.newBuilder();
    builder.setUseranme("wier");
    builder.setPwd(111);
    FileOutputStream output = new FileOutputStream("/Users/wier/login_msg");
    builder.build().writeTo(output);
    output.close();
}
Read:
public void read() throws Exception {
    FileInputStream inputStream = new FileInputStream("/Users/wier/login_msg");
    // Deserialize the Login message from the binary stream
    LoginMsg.Login login = LoginMsg.Login.parseFrom(inputStream);
    inputStream.close();
    System.out.println("login.username:" + login.getUseranme());
    System.out.println("login.pwd:" + login.getPwd());
}
As you can see, reading and writing messages takes very little code. To transfer messages over TCP, replace the file streams above with the input and output streams of a socket.
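One practical detail when moving from a file to a socket: TCP is a byte stream, so the receiver needs a way to tell where one message ends and the next begins. The sketch below shows one common approach, a 4-byte length prefix, using only the JDK. LengthPrefixedFraming, writeFrame, and readFrame are hypothetical names for illustration and are not part of protobuf; the payloads are plain byte arrays standing in for what builder.build().toByteArray() would return, and an in-memory stream stands in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a protobuf class): frames each message with its length.
class LengthPrefixedFraming {

    // Writes a 4-byte big-endian length followed by the payload bytes.
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush();
    }

    // Reads one frame; readFully blocks until the whole payload has arrived.
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        byte[] payload = new byte[length];
        data.readFully(payload);
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for two serialized messages.
        byte[] first = "first".getBytes(StandardCharsets.UTF_8);
        byte[] second = "second".getBytes(StandardCharsets.UTF_8);

        // An in-memory stream stands in for socket.getOutputStream().
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFrame(wire, first);
        writeFrame(wire, second);

        // The receiver recovers the two messages separately.
        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8));
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8));
    }
}
```

The protobuf Java runtime also offers writeDelimitedTo and parseDelimitedFrom, which prefix each message with its size as a varint; the fixed 4-byte prefix here is only one possible design.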
The Message class structure
Let’s look at the main information of the entire message object based on LoginMsg.
A Message class mainly contains the following information:
Login: the body of the message structure. It mainly stores data, inherits GeneratedMessageV3, and internally encapsulates serialization (writeTo) and deserialization (parser) of the object.
LoginOrBuilder: the interface shared by Login and Builder. It provides type information and exposes the field get methods.
Builder: the builder of the message object. It encapsulates the field set methods.
Descriptor: the metadata of the message object. It is usually not needed, but it can be used when dynamic parsing is required.
Parser: provides deserialization for the message.
Let's look at the class hierarchy.
MessageLite/Message are the abstract interfaces of all messages. A message object can be built from byte-stream data by a Parser, or an object created by a Builder can be serialized and written as byte-stream data to an I/O pipe. MessageLite and Message each define their own Builder interface internally. They extend MessageLiteOrBuilder and MessageOrBuilder, which define the interface common to MessageLite/Message and their respective Builders.
Calling sequence
write
In the write process above, the data is assembled mainly by the Builder, GeneratedMessageV3 encapsulates some basic field-access operations, and the final field writing is done mainly by CodedOutputStream. CodedOutputStream encodes every field you set (int, String, and so on) into binary according to its defined type. OutputStreamEncoder is a subclass of CodedOutputStream.
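To make the write path concrete, here is a minimal JDK-only sketch of the kind of encoding that CodedOutputStream performs for the Login message. It is not the library's code: LoginWireEncoder and its methods are hypothetical names, the varint writer assumes non-negative values, and default values (empty string, 0) are skipped as proto3 does.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical hand-written encoder for the Login message (illustration only).
class LoginWireEncoder {
    static final int WIRETYPE_VARINT = 0;
    static final int WIRETYPE_LENGTH_DELIMITED = 2;

    // Base-128 varint: 7 payload bits per byte, high bit set while more bytes follow.
    // Assumes value >= 0 in this sketch.
    static void writeVarint32(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // A tag is the field number and the wire type packed into one varint.
    static void writeTag(ByteArrayOutputStream out, int fieldNumber, int wireType) {
        writeVarint32(out, (fieldNumber << 3) | wireType);
    }

    static byte[] encodeLogin(String useranme, int pwd) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!useranme.isEmpty()) {
            // string useranme = 1: tag, byte length, UTF-8 bytes
            byte[] name = useranme.getBytes(StandardCharsets.UTF_8);
            writeTag(out, 1, WIRETYPE_LENGTH_DELIMITED);
            writeVarint32(out, name.length);
            out.write(name, 0, name.length);
        }
        if (pwd != 0) {
            // int32 field number 2: tag, varint value
            writeTag(out, 2, WIRETYPE_VARINT);
            writeVarint32(out, pwd);
        }
        return out.toByteArray();
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(encodeLogin("wier", 111)));
    }
}
```

For the values used in the demo ("wier" and 111) this prints 0a 04 77 69 65 72 10 6f: tag 0x0a (field 1, length-delimited), length 4, the four bytes of "wier", then tag 0x10 (field 2, varint) and the value 0x6f.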
read
Reading is the reverse, an unpacking process. Parser mainly manages parsing, for example from binary data or from an I/O stream; extension fields are parsed using definitions registered in advance in an ExtensionRegistry. The final field reads are done by CodedInputStream, which, like CodedOutputStream above, works from the defined fields and converts binary data to the specified field types. Message constructors read from a CodedInputStream, and what is read next is driven by the tag. So what does the tag of each field do?
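The tag-driven read loop can be sketched in plain Java as follows. This is a simplified illustration, not the generated code: LoginWireDecoder is a hypothetical class, it only understands the two Login fields, and it rejects anything else instead of handling unknown fields.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical hand-written decoder for the Login message (illustration only).
class LoginWireDecoder {
    static final int USERANME_TAG = (1 << 3) | 2; // field 1, length-delimited = 10
    static final int PWD_TAG = (2 << 3) | 0;      // field 2, varint = 16

    private final byte[] buf;
    private int pos;
    String useranme = "";
    int pwd;

    private LoginWireDecoder(byte[] buf) {
        this.buf = buf;
    }

    // Reads a base-128 varint of at most 5 bytes.
    private int readVarint32() {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            byte b = buf[pos++];
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("malformed varint");
    }

    // Returns 0 at end of input; 0 is never a valid tag.
    private int readTag() {
        return pos >= buf.length ? 0 : readVarint32();
    }

    static LoginWireDecoder parse(byte[] data) {
        LoginWireDecoder d = new LoginWireDecoder(data);
        int tag;
        while ((tag = d.readTag()) != 0) {
            switch (tag) {
                case USERANME_TAG: {
                    int len = d.readVarint32();
                    d.useranme = new String(data, d.pos, len, StandardCharsets.UTF_8);
                    d.pos += len;
                    break;
                }
                case PWD_TAG:
                    d.pwd = d.readVarint32();
                    break;
                default:
                    throw new IllegalArgumentException("unexpected tag " + tag);
            }
        }
        return d;
    }

    public static void main(String[] args) {
        byte[] data = {0x0A, 0x04, 'w', 'i', 'e', 'r', 0x10, 0x6F};
        LoginWireDecoder login = parse(data);
        System.out.println(login.useranme + " " + login.pwd);
    }
}
```

Because each iteration dispatches on the tag it has just read, the same loop decodes the message correctly even if the pwd field arrives before useranme.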
Message binary structure
Whenever a message field is read or written, readTag or writeTag is called first.
A binary stream consists of an ordered sequence of bytes. Each field in the diagram above consists of a tag and a value. The tag describes the value: it tells the parser which field this is (its field number) and how the value is encoded (its wire type). With that, the parser knows where each field starts and ends in the stream, so a field decodes correctly regardless of the order in which the fields appear.
Tag structure:
(fieldNumber << 3) | wireType;
The fieldNumber tells the parser which field a value in the byte stream belongs to. It is also what makes the protocol extensible. For example, to add a field to a protocol message that is already in use, give the new field the next unused field number; then, whether or not the client has updated the protocol (for example, it still sends the old message), server-side parsing is unaffected. This mechanism ensures that adding new fields to a message does not break the old encoders and decoders.
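As a small worked example of the tag formula, the sketch below packs and unpacks tags for the two Login fields. WireTag and its method names are hypothetical and chosen for illustration; the wire type numbers (0 for varint, 2 for length-delimited) are the ones defined by the protobuf encoding.

```java
// Hypothetical helper showing how a tag packs a field number and a wire type.
class WireTag {
    static final int TAG_TYPE_BITS = 3;
    static final int TAG_TYPE_MASK = (1 << TAG_TYPE_BITS) - 1; // 0b111

    static int makeTag(int fieldNumber, int wireType) {
        return (fieldNumber << TAG_TYPE_BITS) | wireType;
    }

    static int getFieldNumber(int tag) {
        return tag >>> TAG_TYPE_BITS;
    }

    static int getWireType(int tag) {
        return tag & TAG_TYPE_MASK;
    }

    public static void main(String[] args) {
        int useranmeTag = makeTag(1, 2); // string field 1, length-delimited
        int pwdTag = makeTag(2, 0);      // int32 field 2, varint
        System.out.println(useranmeTag + " " + pwdTag);
        System.out.println(getFieldNumber(pwdTag) + " " + getWireType(pwdTag));
    }
}
```

Field 1 with wire type 2 gives tag 10, and field 2 with wire type 0 gives tag 16. A field added later with number 3 gets a tag that old parsers have no case for, which is why it can be ignored without affecting fields 1 and 2.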
Descriptor
Descriptor is the metadata description of a message object. When the compiler generates the message class, it defines a static Descriptor field for each message, along with a static FieldAccessorTable field that is used to read and set field values via reflection.
Of course, this is not needed for normal serialization and deserialization, because the parse order and types of the message are already fixed in the generated code, so there is no need to interpret the metadata at run time.
If you need dynamic parsing, for example adding or updating a message without writing more code or restarting the process, and instead creating the specific Protobuf Message objects automatically from the received data and configuration files before deserializing, Descriptor is very helpful. Let's look at the class hierarchy around Descriptor.
Finally
extensions
proto2 also supports extension field definitions. Extensions are used for message reuse; they are dropped in proto3, which offers Any instead.
Unknown Fields
In proto2, if a field cannot be parsed (for example, the message has been upgraded but one side still uses the old definition), the default handling is as follows:
default:
  if (!parseUnknownField(input, unknownFields, extensionRegistry, tag)) {
    done = true;
  }
proto3 (as of the 3.2.0 version used here) changes this behavior: when an undefined field is encountered, it is skipped with skipField.
default:
  if (!input.skipField(tag)) {
    done = true;
  }
  break;
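How a parser can skip a field it does not know follows from the wire type in the tag, which is enough to tell how many bytes the value occupies. The sketch below illustrates the idea with the JDK only. UnknownFieldSkipper is a hypothetical class, not CodedInputStream: it handles wire types 0, 1, 2, and 5 and simply reports false for anything else, which is a simplification of what the library does for groups.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of skipping unknown fields by wire type (illustration only).
class UnknownFieldSkipper {

    static int readVarint32(ByteBuffer in) {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            byte b = in.get();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("malformed varint");
    }

    // Advances past the value that follows the given tag; false means "cannot skip".
    static boolean skipField(ByteBuffer in, int tag) {
        switch (tag & 0x7) {
            case 0: // varint: consume bytes until one has the high bit clear
                while ((in.get() & 0x80) != 0) {
                    // keep consuming
                }
                return true;
            case 1: // fixed 64-bit value
                in.position(in.position() + 8);
                return true;
            case 2: { // length-delimited: varint length, then that many bytes
                int len = readVarint32(in);
                in.position(in.position() + len);
                return true;
            }
            case 5: // fixed 32-bit value
                in.position(in.position() + 4);
                return true;
            default: // groups and invalid wire types are not handled in this sketch
                return false;
        }
    }

    // An "old" parser that only knows the int32 field with number 2 (tag 16).
    static int readPwd(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        int pwd = 0;
        boolean done = false;
        while (!done && in.hasRemaining()) {
            int tag = readVarint32(in);
            switch (tag) {
                case 16:
                    pwd = readVarint32(in);
                    break;
                default:
                    if (!skipField(in, tag)) {
                        done = true;
                    }
                    break;
            }
        }
        return pwd;
    }

    public static void main(String[] args) {
        // Field 1 ("wier"), a newer field 3 ("x") unknown to this parser, then field 2 = 111.
        byte[] data = {0x0A, 0x04, 'w', 'i', 'e', 'r', 0x1A, 0x01, 'x', 0x10, 0x6F};
        System.out.println(readPwd(data));
    }
}
```

The parser still recovers the value 111 even though the stream contains a field it has never heard of, which is the compatibility property the tag design is meant to provide.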
This article only introduces what Protocol Buffers is and how to use it. It does not explain why the protocol is compact, fast to parse, and compatible. If you are interested in that part, watch for the next article in the series, where I will try to work through the "why".