Welcome to the developer's guide to Protocol Buffers. Protocol Buffers are a language-neutral, platform-neutral, extensible mechanism for serializing structured data for use in communication protocols, data storage, and more.

This document is intended for Java, C++, or Python developers who want to use Protocol Buffers in their applications. This overview introduces Protocol Buffers and shows you how to get started. To learn more, you can go on to the tutorials or the Protocol Buffer Encoding page.

For API details, see the reference documentation for all three languages, along with the .proto language guide and style guide.

What are Protocol Buffers?

Protocol Buffers are a flexible, efficient, automated mechanism for serializing structured data. Think of XML, but smaller, faster, and simpler.

You define how you want your data to be structured once, and then you use generated source code to read and write that structured data to and from a variety of data streams, in a variety of programming languages. You can even update your data structure without breaking deployed programs that were compiled against the old format.

How do Protocol Buffers work?

You specify how you want the information you are serializing to be structured by defining protocol buffer message types in .proto files.

Each protocol buffer message is a small logical record of information, containing a series of name-value pairs.

Here is a very basic example of a .proto file that defines a message containing information about a person:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

As you can see, the message format is simple: each message type has one or more uniquely numbered fields, and each field has a name and a value type.

Value types can be numbers (integer or floating point), booleans, strings, raw bytes, or even other protocol buffer message types (as in the example above), which lets you structure your data hierarchically.

You can specify optional fields, required fields, and repeated fields. You can find more information about writing .proto files in the Protocol Buffer Language Guide.

Once you have defined your messages, you run the protocol buffer compiler for your application's language on your .proto file to generate data access classes.

These classes provide simple accessors for each field (such as name() and set_name()), as well as methods to serialize the whole structure to raw bytes and to parse it back.
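To give a rough feel for the shape of such accessors, here is a hand-written stand-in for the three scalar fields of Person. It is only a sketch and not the code the compiler emits: PersonSketch and FillExample are hypothetical names, and serialization is left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hand-written stand-in for a generated message class.
// Not real generated code; PersonSketch is a hypothetical name.
class PersonSketch {
 public:
  // required string name = 1;
  bool has_name() const { return has_name_; }
  const std::string& name() const { return name_; }
  void set_name(const std::string& value) {
    name_ = value;
    has_name_ = true;
  }

  // required int32 id = 2;
  bool has_id() const { return has_id_; }
  std::int32_t id() const { return id_; }
  void set_id(std::int32_t value) {
    id_ = value;
    has_id_ = true;
  }

  // optional string email = 3;
  // Reading an unset optional field returns its default (an empty string).
  bool has_email() const { return has_email_; }
  const std::string& email() const { return email_; }
  void set_email(const std::string& value) {
    email_ = value;
    has_email_ = true;
  }

  // True once every required field has been set.
  bool IsInitialized() const { return has_name_ && has_id_; }

 private:
  std::string name_;
  std::int32_t id_ = 0;
  std::string email_;
  bool has_name_ = false;
  bool has_id_ = false;
  bool has_email_ = false;
};

// Example use: set the two required fields and leave email unset.
inline void FillExample(PersonSketch& person) {
  person.set_name("John Doe");
  person.set_id(1234);
  assert(person.IsInitialized());
}
```

After FillExample runs, has_email() is still false and email() returns an empty string, which is how an unset optional field without an explicit default behaves.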

For example, if your chosen language is C++, running the compiler on the Person definition above generates a class called Person. You can then use this class in your application to populate, serialize, and retrieve Person protocol buffer messages.

You might then write code like this:

Person person;
person.set_name("John Doe");
person.set_id(1234);
person.set_email("[email protected]");
fstream output("myfile", ios::out | ios::binary);
person.SerializeToOstream(&output);

You can then read the message:

fstream input("myfile", ios::in | ios::binary);
Person person;
person.ParseFromIstream(&input);
cout << "Name: " << person.name() << endl;
cout << "E-mail: " << person.email() << endl;

You can add new fields to your message formats without breaking backward compatibility, because old binaries simply ignore the new fields when parsing. So if you use protocol buffers as the data format of a communication protocol, you can extend the protocol without worrying about breaking existing code.
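Old readers can ignore new fields because of how fields are laid out on the wire: each field is preceded by a tag carrying its field number and a wire type, so a parser can step over a field it does not recognize. The following is a minimal sketch of that idea, not the library's actual parser. It handles only the two wire types the Person message needs (varint and length-delimited), and OldPerson, ReadVarint, and ParseOldPerson are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Read a base-128 varint starting at pos and advance pos past it.
// Returns false if the input ends before the varint is complete.
bool ReadVarint(const std::string& in, std::size_t& pos, std::uint64_t& value) {
  assert(pos <= in.size());
  value = 0;
  for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
    const unsigned char byte = static_cast<unsigned char>(in[pos++]);
    value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return true;  // high bit clear: last byte
  }
  return false;
}

// What an "old" reader knows about: only name (field 1) and id (field 2).
struct OldPerson {
  std::string name;
  std::int32_t id = 0;
  int skipped_fields = 0;  // fields this reader did not recognize
};

// Parse wire type 0 (varint) and wire type 2 (length-delimited) fields.
// Fields with unrecognized numbers are skipped instead of causing an error.
bool ParseOldPerson(const std::string& in, OldPerson& person) {
  person = OldPerson();
  std::size_t pos = 0;
  while (pos < in.size()) {
    std::uint64_t tag = 0;
    if (!ReadVarint(in, pos, tag)) return false;
    const std::uint64_t field_number = tag >> 3;
    const std::uint64_t wire_type = tag & 0x7;

    if (wire_type == 0) {  // varint payload
      std::uint64_t value = 0;
      if (!ReadVarint(in, pos, value)) return false;
      if (field_number == 2) {
        person.id = static_cast<std::int32_t>(value);
      } else {
        ++person.skipped_fields;
      }
    } else if (wire_type == 2) {  // length, then that many raw bytes
      std::uint64_t length = 0;
      if (!ReadVarint(in, pos, length)) return false;
      if (length > in.size() - pos) return false;  // truncated input
      if (field_number == 1) {
        person.name = in.substr(pos, static_cast<std::size_t>(length));
      } else {
        ++person.skipped_fields;
      }
      pos += static_cast<std::size_t>(length);
    } else {
      return false;  // other wire types are out of scope for this sketch
    }
  }
  return true;
}
```

Given the bytes of a Person that also carries an email (field 3), ParseOldPerson fills in name and id and steps over the email rather than failing.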

See the API Reference section for complete details on how protocol buffer code is generated and used.

You can also learn more about how Protocol Buffer messages are encoded on the Protocol Buffer Encoding page.

Why not just use XML?

Protocol Buffers have many advantages over XML for serializing structured data. Protocol buffers are:

  • Simpler

  • 3 to 10 times smaller than XML

  • 20 to 100 times faster than XML

  • Loose coupling

  • Backed by generated data access classes that are easier to use programmatically

For example, suppose you want to model a person with a name and an email address. In XML, you would write:

<person>
<name>John Doe</name>
<email>[email protected]</email>
</person>

The corresponding protocol buffer message, shown in protocol buffer text format, is:

# Textual representation of a protocol buffer.
# This is *not* the binary format used on the wire.
person {
  name: "John Doe"
  email: "[email protected]"
}

When this message is encoded in the protocol buffer binary format, it is probably about 28 bytes long and takes around 100-200 nanoseconds to parse.

The text format shown above is only a human-readable representation, which is convenient for debugging and editing.

The XML version is at least 69 bytes even with all whitespace removed, and takes around 5,000-10,000 nanoseconds to parse.
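The byte counts quoted above can be reproduced by hand. The sketch below encodes the two string fields the way the binary format lays them out (a one-byte tag, a length, then the raw bytes) and builds the equivalent whitespace-free XML. It assumes the address is jdoe@example.com, since the listings show it redacted, and it assumes the XML element names person, name, and email. The helper names are hypothetical, and this is not the library's encoder.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append a base-128 varint: 7 payload bits per byte, with the high bit set
// on every byte except the last.
void AppendVarint(std::string& out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<char>(value));
}

// Append one string field: tag, length, then the raw bytes.
// The tag is (field_number << 3) | wire_type; wire type 2 is length-delimited.
void AppendStringField(std::string& out, std::uint32_t field_number,
                       const std::string& value) {
  assert(field_number >= 1);
  AppendVarint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
  AppendVarint(out, value.size());
  out += value;
}

// Hand-encode a Person with only name (field 1) and email (field 3) set.
std::string EncodePerson(const std::string& name, const std::string& email) {
  std::string out;
  AppendStringField(out, 1, name);
  AppendStringField(out, 3, email);
  return out;
}

// The same record as XML with all whitespace between elements removed.
// The element names are an assumption made for this sketch.
std::string PersonAsXml(const std::string& name, const std::string& email) {
  return "<person><name>" + name + "</name><email>" + email +
         "</email></person>";
}
```

Under those assumptions, EncodePerson("John Doe", "jdoe@example.com") produces 28 bytes (10 for name, 18 for email), and PersonAsXml with the same arguments produces 69.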

Manipulating a protocol buffer is also much easier:

cout << "Name: " << person.name() << endl;
cout << "E-mail: " << person.email() << endl;

With XML, you would have to do something like this:

cout << "Name: "
     << person.getElementsByTagName("name")->item(0)->innerText()
     << endl;
cout << "E-mail: "
     << person.getElementsByTagName("email")->item(0)->innerText()
     << endl;

However, Protocol Buffers are not always a better solution than XML. For example, protocol buffers are not a good way to model a text-based document with markup (e.g., HTML), because you cannot easily interleave structure with text. In addition, XML is human-readable and human-editable, whereas protocol buffers, at least in their native binary format, are not.

XML, like HTML, is also to some extent self-describing. A protocol buffer is only meaningful if you have the message definition (the .proto file).

Looks like a good fit. How do I get started?

Download the package. It contains the complete source code for the Java, Python, and C++ protocol buffer compilers, as well as the classes you need for I/O and testing. To build and install the compiler, follow the instructions in the README file.

Once you are set up, follow the tutorial for your chosen language. It walks you through creating a simple application that uses protocol buffers.

Introducing proto3

Our most recent version 3 release introduces a new language version, Protocol Buffers language version 3 (also known as proto3), as well as some new features for our existing language version (proto2).

Proto3 simplifies the protocol buffer language, both to make it easier to use and to make it available in a wider range of programming languages. Our current proto3 release lets you generate code in Java, C++, Python, Java Lite, Ruby, JavaScript, Objective-C, and C#.

You can also generate proto3 code for Go by using the Go protoc plugin, which is available from the golang/protobuf GitHub repository. More languages are gradually being added.

Please note that the APIs of the two language versions are not completely compatible. To avoid inconvenience to existing users, we will continue to support the previous language version in new Protocol Buffers releases.

You can see the major differences between the two versions in the Release Notes. For proto3 syntax, refer to the Proto3 Language Guide. Full documentation for proto3 has not yet been written and will come later.

The names proto2 and proto3 can seem confusing. The reason is that when Protocol Buffers were originally open-sourced, the language was already Google's second internal version, which is also why the open source version numbers start from v2.0.0.

In short, the first version of the language was used only inside Google. Google open-sourced the second version, so the open source releases start from proto2.

A brief history

Protocol Buffers were originally developed internally at Google to handle request/response protocols on indexing servers.

Before protocol buffers, there was a format for requests and responses that relied on hand-written marshalling and unmarshalling code and that had to support a number of versions of the protocol. This resulted in some very ugly code, for example:

if (version == 3) {
  ...
} else if (version > 4) {
  if (version == 5) {
    ...
  }
  ...
}

Explicitly formatted protocols also made it harder to roll out new protocol versions, because developers had to make sure that every server between the originator of a request and the server that actually handled it understood the new protocol. Only then could they switch over from the old protocol to the new one.

Protocol buffers were designed to address many of these problems:

  • New fields could be introduced easily, and intermediate servers that did not need to inspect the data could simply parse it and pass it along without knowing about all the fields.
  • The format was more self-describing, which made it easier to support from a variety of languages (C++, Java, etc.).

However, users still needed to write their own parsing code by hand.

As the system evolved, it acquired many other features and uses:

  • Automatically generated serialization and deserialization code removed the need for hand-written parsing.
  • In addition to being used for short-lived RPC (remote procedure call) requests, protocol buffers began to be used as an efficient self-describing structured data format for storing data persistently (for example, in Bigtable).
  • Server RPC interfaces began to be declared as part of protocol files, with the protocol compiler generating stub classes that users could override with actual implementations of the server's interface.

Protocol Buffers have become the common language for data at Google. Over time, more than 348,952 .proto files have been defined within Google. They are used both in RPC systems and for persistent storage of data in a variety of storage systems.

www.ossez.com/t/protocol-…