Part 1: Protobuf usage characteristics
1. Conversion of objects of different types
Let’s define the following .proto file
syntax = "proto3";
option java_package = "cn.tera.protobuf.model";
option java_outer_classname = "DifferentModels";
message Person {
string name = 1;
int32 id = 2;
string email = 3;
}
message Article {
string title = 1;
int32 wordsCount = 2;
string author = 3;
}
We have defined two messages, Person and Article. Although their field names differ, their field types and field numbers are the same.
Next, we generate the .java file. The final file structure is shown below
Now we try the following conversion
/**
 * Tests conversions between different models
 *
 * @throws Exception
 */
@Test
public void parseDifferentModelsTest() throws Exception {
    // Create a Person object
    DifferentModels.Person person = DifferentModels.Person.newBuilder()
            .setName("person name")
            .setId(1)
            .setEmail("[email protected]")
            .build();
    // Encode the Person into a byte array
    byte[] personBytes = person.toByteArray();
    // Parse the encoded data directly into an Article object
    DifferentModels.Article article = DifferentModels.Article.parseFrom(personBytes);
    System.out.println("article's title:" + article.getTitle());
    System.out.println("article's wordsCount:" + article.getWordsCount());
    System.out.println("article's author:" + article.getAuthor());
}
The output is as follows
article's title:person name
article's wordsCount:1
article's author:[email protected]
As you can see, although personBytes was encoded from a Person object, it can be decoded as an Article with no errors and with all data intact.
This compatibility requires that the field types and field numbers of the two messages match one to one.
In everyday coding, it is common to read a data model from the database and convert it to a business model, and in many cases the two models have exactly the same content. In such cases we can use this feature of protobuf and save a lot of tedious assignment code.
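The reason this works is that the encoded bytes carry only field numbers, wire types, and values, never message or field names. The following is a minimal hand-rolled sketch of the wire format for a message shaped like Person or Article. It does not use the protobuf-java library; the class WireFormatSketch, its helper methods, and the placeholder email value are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hand-rolled sketch of the protobuf wire format (not the protobuf-java API).
class WireFormatSketch {
    // Write a value as a base-128 varint, low 7 bits first.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // A tag packs the field number and the wire type into one varint.
    static void writeTag(ByteArrayOutputStream out, int fieldNumber, int wireType) {
        writeVarint(out, ((long) fieldNumber << 3) | wireType);
    }

    // string fields use wire type 2: tag, byte length, UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String s) {
        if (s.isEmpty()) return; // proto3 omits default values
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeTag(out, fieldNumber, 2);
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // int32 fields use wire type 0: tag, then the value as a varint.
    static void writeInt32(ByteArrayOutputStream out, int fieldNumber, int v) {
        if (v == 0) return; // proto3 omits default values
        writeTag(out, fieldNumber, 0);
        writeVarint(out, (long) v); // negative values sign-extend to 10 bytes
    }

    // Encodes any message declared as {string = 1; int32 = 2; string = 3}.
    static byte[] encode(String f1, int f2, String f3) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, f1);
        writeInt32(out, 2, f2);
        writeString(out, 3, f3);
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x ", b));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // "a@b.c" is a placeholder value, not the address used in the article.
        System.out.println(hex(encode("person name", 1, "a@b.c")));
    }
}
```

Nothing in the output identifies the message type or the field names, so any message that declares string = 1, int32 = 2, string = 3 can read these bytes.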
2. Importance of protobuf field numbers
When defining a .proto file, each field is followed by “= X”. This is not the value of the field but its field number (the “sequence number”), which is essential to correct encoding and decoding. In my opinion, it is the soul of Protocol Buffers.
We define the following .proto file; note that the field numbers of name and id are swapped between Model1 and Model2
syntax = "proto3";
option java_package = "cn.tera.protobuf.model";
option java_outer_classname = "TagImportance";
message Model1 {
string name = 1;
int32 id = 2;
string email = 3;
}
message Model2 {
string name = 2;
int32 id = 1;
string email = 3;
}
Define the following test method
/**
 * Field number importance test
 *
 * @throws Exception
 */
@Test
public void tagImportanceTest() throws Exception {
    TagImportance.Model1 model1 = TagImportance.Model1.newBuilder()
            .setEmail("[email protected]")
            .setId(1)
            .setName("model1")
            .build();
    // Parse Model1's bytes as Model2, whose name and id field numbers are swapped
    TagImportance.Model2 model2 = TagImportance.Model2.parseFrom(model1.toByteArray());
    System.out.println("model2 email:" + model2.getEmail());
    System.out.println("model2 id:" + model2.getId());
    System.out.println("model2 name:" + model2.getName());
    System.out.println("-------model2 data---------");
    System.out.println(model2);
}
The output is as follows
model2 email:[email protected]
model2 id:0
model2 name:
-------model2 data---------
email: "[email protected]"
1: "model1"
2: 1
As you can see, although Model1 and Model2 define the same field types and names, the field numbers of name and id are swapped. As a result, Model2 cannot map the data to the corresponding fields when parsing the byte array, so id comes out as 0 and name as an empty string (the proto3 defaults).
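To see what the parser is working with, the following minimal sketch walks an encoded message and lists each field number together with its wire type. It is a hand-rolled reader, not the protobuf-java API; the class FieldScanSketch and its methods are hypothetical names, and the email field is left out of the sample bytes.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-rolled sketch of a wire-format reader (not the protobuf-java API).
class FieldScanSketch {
    // Lists "fieldNumber:wireType" for each field in an encoded message.
    // Only wire types 0 (varint) and 2 (length-delimited) are handled.
    static List<String> scan(byte[] data) {
        List<String> fields = new ArrayList<>();
        int[] pos = {0};
        while (pos[0] < data.length) {
            long tag = readVarint(data, pos);
            int fieldNumber = (int) (tag >>> 3);
            int wireType = (int) (tag & 7);
            if (wireType == 0) {
                readVarint(data, pos);               // skip the varint value
            } else if (wireType == 2) {
                int length = (int) readVarint(data, pos);
                pos[0] += length;                    // skip the payload bytes
            } else {
                throw new IllegalArgumentException("unsupported wire type " + wireType);
            }
            fields.add(fieldNumber + ":" + wireType);
        }
        return fields;
    }

    // Reads a base-128 varint starting at pos[0] and advances pos[0].
    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }

    public static void main(String[] args) {
        // Model1-shaped bytes: name = "model1" (field 1), id = 1 (field 2).
        byte[] model1 = {0x0a, 0x06, 'm', 'o', 'd', 'e', 'l', '1', 0x10, 0x01};
        System.out.println(scan(model1)); // [1:2, 2:0]
    }
}
```

Model2 declares field 1 as int32 (wire type 0) and field 2 as string (wire type 2), so neither entry matches what it expects, and both values end up outside the named fields.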
When model2.toString() is printed, we can see that the data was still parsed, but it could not be matched to named fields, so only the field numbers 1 and 2 stand in for the field names.
3. Effect of protobuf field numbers on encoded size
The field number not only affects whether encoding and decoding are correct, it also affects the number of encoded bytes to some extent.
We add a Model3 to the .proto file above; its fields are unchanged, but the field numbers become 16, 17, and 18
syntax = "proto3";
option java_package = "cn.tera.protobuf.model";
option java_outer_classname = "TagImportance";
message Model1 {
string name = 1;
int32 id = 2;
string email = 3;
}
message Model2 {
string name = 2;
int32 id = 1;
string email = 3;
}
message Model3 {
string name = 16;
int32 id = 17;
string email = 18;
}
The test method
/**
 * @throws Exception
 */
@Test
public void tagSizeInfluenceTest() throws Exception {
    TagImportance.Model1 model1 = TagImportance.Model1.newBuilder()
            .setEmail("[email protected]")
            .setId(1)
            .setName("model1")
            .build();
    System.out.println("model1 encoding size: " + model1.toByteArray().length);
    // Same data, but Model3 uses field numbers 16, 17, and 18
    TagImportance.Model3 model3 = TagImportance.Model3.newBuilder()
            .setEmail("[email protected]")
            .setId(1)
            .setName("model1")
            .build();
    System.out.println("model3 encoding size: " + model3.toByteArray().length);
}
The output is as follows
model1 encoding size: 29
model3 encoding size: 32
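As you can see, with exactly the same data, the message with the larger field numbers encodes to more bytes. Field numbers 1 through 15 fit in a one-byte tag, while 16 through 2047 need two bytes, so each of Model3's three fields costs one extra byte (29 + 3 = 32).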
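The size difference comes from the tag that precedes every field. The tag is the field number shifted left by three bits (the low three bits hold the wire type), written as a varint with seven payload bits per byte. A minimal sketch of that arithmetic, with TagSizeSketch and tagSize as hypothetical names:

```java
// Sketch of how many bytes a field's tag occupies on the wire.
class TagSizeSketch {
    // Number of bytes in the varint-encoded tag for a given field number.
    static int tagSize(int fieldNumber) {
        long tag = (long) fieldNumber << 3; // wire type sits in the low 3 bits
        int size = 1;
        while ((tag >>>= 7) != 0) {
            size++;                         // one more byte per extra 7 bits
        }
        return size;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 15, 16, 17, 18, 2047, 2048}) {
            System.out.println("field " + n + " -> " + tagSize(n) + " tag byte(s)");
        }
    }
}
```

This is why frequently used fields are usually given numbers from 1 to 15.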
4. Compatibility of model field data types
Earlier, when discussing the getName() method, I mentioned its flexibility. I’ll demonstrate it next.
We define the following .proto file
syntax = "proto3";
option java_package = "cn.tera.protobuf.model";
option java_outer_classname = "ModelTypeCompatible";
message OldPerson {
string name = 1;
int32 id = 2;
string email = 3;
}
message NewPerson {
Name name = 1;
int32 id = 2;
string email = 3;
}
message Name {
string first = 1;
string last = 2;
int32 usedYears = 3;
}
Two Person messages are defined here
In OldPerson, name is a plain string
In NewPerson, name is a message of type Name
Now we run the following test
/**
 * Compatibility of different types of model fields
 *
 * @throws Exception
 */
@Test
public void typeCompatibleTest() throws Exception {
    ModelTypeCompatible.NewPerson newPerson = ModelTypeCompatible.NewPerson.newBuilder()
            .setName(ModelTypeCompatible.Name.newBuilder()
                    .setFirst("tera")
                    .setLast("cn")
                    .setUsedYears(10)
            ).setId(5)
            .setEmail("[email protected]")
            .build();
    ModelTypeCompatible.OldPerson oldPerson = ModelTypeCompatible.OldPerson.parseFrom(newPerson.toByteArray());
    System.out.println(oldPerson.getName());
}
The output is as follows
tera cn
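As you can see, even though NewPerson’s name field is a message, the data parses into OldPerson’s string name field without error. This works because strings and embedded messages share the same length-delimited wire type: the string receives the raw encoded bytes of Name, so first and last are readable, while the tag and length bytes and the usedYears value appear only as non-printable characters instead of being decoded.
This compatibility only applies when reading a message-typed field as a string, not the other way around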
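To make the bytes behind that string visible, the following minimal sketch hand-encodes a Name message and then reinterprets its bytes as UTF-8 text, escaping control characters. It does not use the protobuf-java library; NestedAsStringSketch and its methods are hypothetical names, and the encoder assumes every length and value fits in a single varint byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Sketch: the bytes of an embedded message, viewed as if they were a string.
class NestedAsStringSketch {
    // Hand-encodes Name {first = 1; last = 2; usedYears = 3}.
    // Assumes lengths and usedYears are in 1..127 (one varint byte each).
    static byte[] encodeName(String first, String last, int usedYears) {
        if (usedYears < 1 || usedYears > 127) {
            throw new IllegalArgumentException("sketch only handles 1..127");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, first);
        writeString(out, 2, last);
        out.write((3 << 3) | 0); // tag: field 3, wire type 0 (varint)
        out.write(usedYears);
        return out.toByteArray();
    }

    static void writeString(ByteArrayOutputStream out, int fieldNumber, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length < 1 || utf8.length > 127) {
            throw new IllegalArgumentException("sketch only handles 1..127 bytes");
        }
        out.write((fieldNumber << 3) | 2); // tag: wire type 2 (length-delimited)
        out.write(utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Renders control characters as \xNN so they show up in the console.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 0x20) sb.append(String.format("\\x%02x", (int) c));
            else sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] nameBytes = encodeName("tera", "cn", 10);
        // A string field with the same number would receive exactly this payload.
        String asString = new String(nameBytes, StandardCharsets.UTF_8);
        System.out.println(escape(asString)); // \x0a\x04tera\x12\x02cn\x18\x0a
    }
}
```

The readable parts are the UTF-8 text of first and last; everything else is tag, length, and varint bytes that happen to be control characters.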
5. Conversion and comparison between Protobuf and JSON
JSON is one of the most widely used data formats today, so once we decide to use protobuf, its compatibility with JSON is an unavoidable question
So how do we convert between protobuf and JSON?
Let’s start by constructing a simple Java class
public class PersonJson {
public String name;
public int id;
public String email;
}
Reusing the protobuf model BasicUsage.Person generated earlier and the JSON-related Maven dependencies introduced earlier, we run the following test
@Test
void jsonToProtobuf() throws Exception {
    // Construct a simple model
    PersonJson model = new PersonJson();
    model.email = "[email protected]";
    model.id = 1;
    model.name = "personJson";
    String json = JSON.toJSONString(model);
    System.out.println("original json");
    System.out.println("------------------------");
    System.out.println(json);
    System.out.println();

    // Parser
    JsonFormat.Parser parser = JsonFormat.parser();
    // A builder is needed for the conversion
    BasicUsage.Person.Builder personBuilder = BasicUsage.Person.newBuilder();
    // Convert the JSON string into the protobuf model and print it
    parser.merge(json, personBuilder);
    BasicUsage.Person person = personBuilder.build();
    // Note that protobuf's toString method does not produce JSON; it renders a simpler text form, so its output usually can't be used as JSON directly
    System.out.println("protobuf content");
    System.out.println("------------------------");
    System.out.println(person.toString());

    // Modify fields in the protobuf model, then convert it back to a JSON string
    person = person.toBuilder().setName("protobuf").setId(2).build();
    String buftoJson = JsonFormat.printer().print(person);
    System.out.println("protobuf modified json");
    System.out.println("------------------------");
    System.out.println(buftoJson);
}
The output is as follows
original json
------------------------
{"email":"[email protected]","id":1,"name":"personJson"}

protobuf content
------------------------
name: "personJson"
id: 1
email: "[email protected]"

protobuf modified json
------------------------
{
  "name": "protobuf",
  "id": 2,
  "email": "[email protected]"
}
As you can see, the data converts cleanly between JSON and protobuf in both directions
At this point, we can compare the number of bytes produced when the same data is encoded as JSON and as protobuf. We use the data above for the following test
@Test
void codeSizeJsonVsProtobuf() throws Exception {
    // Construct a simple model
    PersonJson model = new PersonJson();
    model.email = "[email protected]";
    model.id = 1;
    model.name = "personJson";
    String json = JSON.toJSONString(model);
    System.out.println("original json");
    System.out.println("------------------------");
    System.out.println(json);
    System.out.println("json encoded bytes: " + json.getBytes("UTF-8").length + "\n");

    // Parser
    JsonFormat.Parser parser = JsonFormat.parser();
    // A builder is needed for the conversion
    BasicUsage.Person.Builder personBuilder = BasicUsage.Person.newBuilder();
    // Convert the JSON string into the protobuf model and print it
    parser.merge(json, personBuilder);
    BasicUsage.Person person = personBuilder.build();
    // Note that protobuf's toString method does not produce JSON; it renders a simpler text form, so its output usually can't be used as JSON directly
    System.out.println("protobuf content");
    System.out.println("------------------------");
    System.out.println(person.toString());
    System.out.println("protobuf encoded bytes: " + person.toByteArray().length);
}
The output is as follows
original json
------------------------
{"email":"[email protected]","id":1,"name":"personJson"}
json encoded bytes: 60

protobuf content
------------------------
name: "personJson"
id: 1
email: "[email protected]"

protobuf encoded bytes: 37
As you can see, for the same data the protobuf encoding is about 60% of the size of the JSON encoding (37 versus 60 bytes), although the ratio varies with the data content.
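The saving comes mostly from field names and punctuation: JSON spells out every key, quote, colon, and brace, while protobuf spends one tag byte and, for strings, one length byte per field. The following minimal sketch does that arithmetic by hand for this three-field model. SizeComparisonSketch and its methods are hypothetical names, the email value is a placeholder, and the sketch assumes compact JSON without escaping plus strings and ids small enough for one-byte varints.

```java
import java.nio.charset.StandardCharsets;

// Sketch comparing compact JSON size with protobuf size for one small model.
class SizeComparisonSketch {
    static int utf8Len(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Size of {"email":"...","id":N,"name":"..."} with no whitespace or escaping.
    static int jsonSize(String name, int id, String email) {
        String json = "{\"email\":\"" + email + "\",\"id\":" + id
                + ",\"name\":\"" + name + "\"}";
        return utf8Len(json);
    }

    // Size of {string name = 1; int32 id = 2; string email = 3} on the wire.
    // Assumes non-empty strings under 128 bytes and 0 < id < 128, so every
    // tag, length, and value takes exactly one byte.
    static int protobufSize(String name, int id, String email) {
        int nameLen = utf8Len(name);
        int emailLen = utf8Len(email);
        if (nameLen < 1 || nameLen > 127 || emailLen < 1 || emailLen > 127
                || id < 1 || id > 127) {
            throw new IllegalArgumentException("outside the range this sketch handles");
        }
        return (1 + 1 + nameLen)   // tag + length + name bytes
                + (1 + 1)          // tag + varint id
                + (1 + 1 + emailLen);
    }

    public static void main(String[] args) {
        // "user@example.com" is a placeholder, not the address used in the article.
        String email = "user@example.com";
        System.out.println("json bytes: " + jsonSize("personJson", 1, email));
        System.out.println("protobuf bytes: " + protobufSize("personJson", 1, email));
    }
}
```

With a 21-byte email value, the same arithmetic gives 60 bytes for JSON and 37 for protobuf, which is consistent with the figures reported above.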
Here’s a summary of the features covered above
1. Protobuf decoding does not require the same message type or field names
2. Correct protobuf decoding depends on matching field numbers
3. Larger protobuf field numbers can increase the encoded size
4. A protobuf message-typed field can be read as a string field
5. Protobuf converts to and from JSON and has a smaller encoded size than JSON