Protocol Buffers (Protobuf) is a language-neutral, platform-neutral, extensible mechanism for serializing structured data, similar to XML but more flexible and efficient. Although protobuf is part of many developers' daily work, most usage stays at the basic syntax, and many advanced features are never fully mastered. When reading open source proto libraries, I keep running into syntax I have never used in daily work, which gets in the way of understanding them.

This article is based on Go. It summarizes all the common and uncommon proto3 syntax with examples, to help you grasp Protobuf syntax fully, deepen your understanding, and remove obstacles when reading source code.

Quick Start

The basic workflow with protobuf is to write xxx.proto files using protobuf syntax and then compile them into source files for a specific language, which the program can then call.

Take Go as an example: the compiler turns an xxx.proto file into an xxx.pb.go file, which is an ordinary Go source file. To use protobuf we need to download the protobuf compiler, protoc. The compiler does not support Go directly; Go support is provided by a plug-in that the compiler invokes, so we also need to install the Go compiler plug-in:

  1. Download the compiler for your platform (protoc-$VERSION-$PLATFORM.zip): github.com/protocolbuf…
  2. Download and install the Go compiler plug-in: go install google.golang.org/protobuf/cmd/protoc-gen-go

After installation, prepare the following file, $SRC_DIR/quick_start.proto:
syntax = "proto3";

message SearchRequest {
  string query = 1;
  int32 page_number = 2;
  int32 result_per_page = 3;
}

The command protoc --go_out=$DST_DIR $SRC_DIR/quick_start.proto compiles $SRC_DIR/quick_start.proto and writes the generated Go code to $DST_DIR/quick_start.pb.go:

// ...
type SearchRequest struct {
	Query                string   `protobuf:"bytes,1,opt,name=query,proto3" json:"query,omitempty"`
	PageNumber           int32    `protobuf:"varint,2,opt,name=page_number,json=pageNumber,proto3" json:"page_number,omitempty"`
	ResultPerPage        int32    `protobuf:"varint,3,opt,name=result_per_page,json=resultPerPage,proto3" json:"result_per_page,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_unrecognized     []byte   `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}
// ...

By importing the package that contains the generated quick_start.pb.go file, a program can serialize and deserialize this structure with protobuf. Serialization:

req := &pb.SearchRequest{} // pb is the alias of the package that contains quick_start.pb.go
// ...

// Serialize the structure and write it to a file
out, err := proto.Marshal(req)
if err != nil {
	log.Fatalln("Failed to encode search request:", err)
}
if err := ioutil.WriteFile(fname, out, 0644); err != nil {
	log.Fatalln("Failed to write search request:", err)
}

Deserialization:

// Read the message from a file and deserialize it into a structure
in, err := ioutil.ReadFile(fname)
if err != nil {
	log.Fatalln("Error reading file:", err)
}
req := &pb.SearchRequest{}
if err := proto.Unmarshal(in, req); err != nil {
	log.Fatalln("Failed to parse search request:", err)
}

A Bit of Everything

The Quick Start example shows the basic usage. Now let's go through Protobuf's syntax and features with an example that includes all of the proto3 syntax. The sample code can be found here: a_bit_of_everything.proto. Run protoc --go_out=plugins=grpc:. a_bit_of_everything.proto to generate the xxx.pb.go file.

package

syntax = "proto3";
option go_package = "examplepb";  // The package name of the generated Go code
package example.everything;       // The proto package name
// ...

At the beginning of the sample file you will see two package declarations, go_package and package, but they do not mean the same thing. package example.everything; declares the proto package that the .proto file belongs to. As with packages in Go, it acts as a namespace: two messages, enums, or services with the same name cannot be defined in the same package. option go_package = "examplepb"; is a file-level option that sets the package name of the generated Go code.

import

// ...
import "google/protobuf/any.proto";
import "google/protobuf/descriptor.proto";
//import "other.proto";
// ...

import brings in other proto files. To use definitions from another proto file in the current file, import it first; you can then reference its contents in the form packageName.MessageName. When compiling, the imported files must be on protoc's search path, which is set with -I, for example: protoc -I $GOPATH/src --go_out=. a_bit_of_everything.proto

any.proto and descriptor.proto ship with protoc, so you do not need -I to compile this example.

Scalar Value Types

Proto type | Go type | Note
double     | float64 |
float      | float32 |
int32      | int32   | Encoding negative values is relatively inefficient
int64      | int64   | Encoding negative values is relatively inefficient
uint32     | uint32  |
uint64     | uint64  |
sint32     | int32   | More efficient than int32 when the value is negative
sint64     | int64   | More efficient than int64 when the value is negative
fixed32    | uint32  | More efficient than uint32 when the value is always greater than 2^28
fixed64    | uint64  | More efficient than uint64 when the value is always greater than 2^56
sfixed32   | int32   |
sfixed64   | int64   |
bool       | bool    |
string     | string  | Must be UTF-8 or 7-bit ASCII text, and no longer than 2^32
bytes      | []byte  | An arbitrary sequence of bytes, no longer than 2^32
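The efficiency notes in the table come down to how many bytes a value occupies as a varint. The following is a minimal sketch using only the Go standard library, not the protobuf runtime; varintLen, int32VarintLen, and zigzag32 are illustrative helper names, not library functions.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// varintLen returns how many bytes v occupies as a base-128 varint.
func varintLen(v uint64) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, v)
}

// int32VarintLen is the size of an int32/int64-style field value:
// negative numbers are sign-extended to 64 bits before encoding.
func int32VarintLen(n int32) int {
	return varintLen(uint64(int64(n)))
}

// zigzag32 is the mapping used by sint32: 0→0, -1→1, 1→2, -2→3, ...
// so values with a small absolute value stay small.
func zigzag32(n int32) uint64 {
	return uint64(uint32(n<<1) ^ uint32(n>>31))
}

func main() {
	fmt.Println("int32  -1:", int32VarintLen(-1), "bytes")
	fmt.Println("sint32 -1:", varintLen(zigzag32(-1)), "byte")
	// A uint32 of 2^28 or more needs a fifth varint byte,
	// while fixed32 always takes exactly 4 bytes.
	fmt.Println("uint32 1<<28:", varintLen(1<<28), "bytes")
}
```

With these helpers, -1 costs 10 bytes as an int32 but 1 byte as a sint32, and 2^28 costs 5 bytes as a uint32.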

Message

// Plain message
message SearchRequest {
    string query = 1;
    int32 page_number = 2;
    int32 result_per_page = 3;
}

A message can contain multiple field declarations, each consisting of a field type, a field name, and a unique field number. The field type can be a scalar, an enum, or another message type. The field number identifies the field in the message's binary encoding.

You can also modify field types with repeated, as described in repeated below.
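To make the role of the field number concrete, here is a minimal sketch that hand-encodes a SearchRequest with the standard library only. It is not how the protobuf runtime is implemented; appendTag and encodeSearchRequest are illustrative names, and the sketch assumes all three fields are non-zero (real proto3 encoders skip fields that hold their zero value).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	wireVarint = 0 // int32, int64, bool, enum, ...
	wireBytes  = 2 // string, bytes, nested messages, packed repeated fields
)

// appendTag writes the field key: (field_number << 3) | wire_type.
func appendTag(b []byte, fieldNum, wireType int) []byte {
	return binary.AppendUvarint(b, uint64(fieldNum)<<3|uint64(wireType))
}

// encodeSearchRequest hand-encodes the three fields of SearchRequest.
// Only field numbers appear in the output; field names never do.
func encodeSearchRequest(query string, pageNumber, resultPerPage int32) []byte {
	var b []byte
	b = appendTag(b, 1, wireBytes) // string query = 1
	b = binary.AppendUvarint(b, uint64(len(query)))
	b = append(b, query...)
	b = appendTag(b, 2, wireVarint) // int32 page_number = 2
	b = binary.AppendUvarint(b, uint64(int64(pageNumber)))
	b = appendTag(b, 3, wireVarint) // int32 result_per_page = 3
	b = binary.AppendUvarint(b, uint64(int64(resultPerPage)))
	return b
}

func main() {
	// Prints: 0a 02 67 6f 10 01 18 0a
	fmt.Printf("% x\n", encodeSearchRequest("go", 1, 10))
}
```

Because only the numbers are written, renaming a field is safe for existing data, while changing its number is not.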

Enumerated type

// ...
// enum
enum Status {
    STATUS_UNSPECIFIED = 0;
    STATUS_OK = 1;
    STATUS_FAIL = 2;
    STATUS_UNKNOWN = -1; // Negative numbers are not recommended
}
// ...

An enumeration type is defined with the enum keyword. In protobuf, enum values are int32. The first enum value must be 0; if you do not want a meaningful zero value in your code, name the first value XXX_UNSPECIFIED as a placeholder. Because enum values use the same varint encoding as int32, a negative value always takes 10 bytes on the wire, so negative numbers are not recommended.

XXX_UNSPECIFIED is only a naming convention; it does not affect code behavior.
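The reason a zero-valued placeholder is useful shows up on the Go side, where an enum is an int32-based type and a field that was never set holds 0. The sketch below uses hand-written stand-ins (Status, statusNames, reply are illustrative, not generated code) to show that behavior.

```go
package main

import "fmt"

// Status is a hand-written stand-in for the int32-based type
// that the Go plug-in generates for an enum.
type Status int32

const (
	StatusUnspecified Status = 0
	StatusOK          Status = 1
	StatusFail        Status = 2
)

var statusNames = map[Status]string{
	StatusUnspecified: "STATUS_UNSPECIFIED",
	StatusOK:          "STATUS_OK",
	StatusFail:        "STATUS_FAIL",
}

// String returns the enum name, or a numeric form for values
// this version of the schema does not know about.
func (s Status) String() string {
	if name, ok := statusNames[s]; ok {
		return name
	}
	return fmt.Sprintf("Status(%d)", int32(s))
}

// reply is a hypothetical message with a single enum field.
type reply struct{ Status Status }

func main() {
	var r reply               // Status was never set
	fmt.Println(r.Status)     // STATUS_UNSPECIFIED
	fmt.Println(Status(7))    // Status(7)
}
```

If 0 were STATUS_OK instead, a message whose status was never set would be indistinguishable from a successful one.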

Reserved Fields & Reserved enumeration Values

// Reserved fields
message ReservedMessage {
    reserved 2, 15, 9 to 11;
    reserved "foo", "bar";
    // string abc = 2; // Compile error
    // string foo = 3; // Compile error
}
// Reserved enum values
enum ReservedEnum {
    reserved 2, 15, 9 to 11, 40 to max;
    reserved "FOO", "BAR";
    // FOO = 0; // Compile error
    F = 0;
}

If we delete a field from a message, its field number may be reused by a later update. When old and new versions of the proto definition run in production at the same time, that reuse can cause encoding and decoding errors. For example, take two versions of Foo:

// old version
message Foo {
    string a = 1;
}

// new version
message Foo {
    int32 a = 1;
}

If the new version is used to parse a message written by the old version, decoding goes wrong, because the new version tries to parse field 1 as an int32 while the old version encoded it as a string. Protobuf provides the reserved keyword to avoid this kind of conflict between versions:

// new version
message Foo {
    reserved 1;  // mark field number 1 as reserved
    int32 a = 2; // numbering starts at 2 and does not conflict with the old string a
}
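The conflict is visible in the bytes themselves. As a rough illustration using only the standard library (decodeTag is an illustrative helper, not a protobuf API), the sketch below reads the key of a message written with the old schema and compares its wire type with what the new schema expects.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeTag splits a field key into its field number and wire type.
func decodeTag(b []byte) (fieldNum, wireType int) {
	key, _ := binary.Uvarint(b)
	return int(key >> 3), int(key & 7)
}

func main() {
	// Written with the old schema, `string a = 1;` holding "hi":
	// key 0x0a (field 1, wire type 2), length 2, then the payload.
	oldBytes := []byte{0x0a, 0x02, 'h', 'i'}
	fieldNum, wireType := decodeTag(oldBytes)

	// The new schema, `int32 a = 1;`, expects wire type 0 (varint).
	const wantWireType = 0
	fmt.Printf("field %d arrived with wire type %d, reader expects %d\n",
		fieldNum, wireType, wantWireType)
}
```

Reserving number 1 makes the compiler reject any later attempt to reuse it, so a reader never sees the same field number with two meanings.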

nested

// Nested message
message SearchResponse {
    message Result {
        string url = 1 ;
        string title = 2;
    }
    enum Status {
        UNSPECIFIED = 0;
        OK  = 1;
        FAIL= 2;
    }
    Result results = 1;
    Status status = 2;
}

Messages can be nested to any depth, and both messages and enums can be nested. Nested messages and enums can be used not only inside the enclosing message but also referenced from other messages:

message OtherResponse {
    SearchResponse.Result result = 1;
    SearchResponse.Status status = 2;
}

Compound types

In addition to scalar types, Protobuf provides several non-scalar types, which this article refers to collectively as compound types.

Compound type is not an official category; it is a term used in this article for ease of understanding.

repeated

// repeated
message RepeatedMessage {
    repeated SearchRequest requests = 1;
    repeated Status status = 2;
    repeated int32 number = 3;
}

repeated can be applied to a field in a message. Only scalar, enum, and message types can be repeated. A repeated field can occur any number of times (including zero), so it behaves like a variable-length array of the field's type. In Go, a repeated field compiles to a slice:

// repeated
type RepeatedMessage struct {
	Requests             []*SearchRequest `protobuf:"bytes,1,rep,name=requests,proto3" json:"requests,omitempty"`
	Status               []Status         `protobuf:"varint,2,rep,packed,name=status,proto3,enum=example.everything.Status" json:"status,omitempty"`
	Number               []int32          `protobuf:"varint,3,rep,packed,name=number,proto3" json:"number,omitempty"`
	XXX_NoUnkeyedLiteral struct{}         `json:"-"`
	XXX_unrecognized     []byte           `json:"-"`
	XXX_sizecache        int32            `json:"-"`
}
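The packed marker in the struct tags of Status and Number refers to how repeated numeric fields are laid out on the wire: one key, one length, then all values back to back. The following standard-library sketch illustrates that layout; encodePackedInt32 is an illustrative name, not part of the protobuf runtime.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodePackedInt32 encodes a packed repeated int32 field:
// a single key with wire type 2, the payload length in bytes,
// then every value as a varint with no per-element key.
func encodePackedInt32(fieldNum int, values []int32) []byte {
	if len(values) == 0 {
		return nil // an empty repeated field is not written at all
	}
	var payload []byte
	for _, v := range values {
		payload = binary.AppendUvarint(payload, uint64(int64(v)))
	}
	out := binary.AppendUvarint(nil, uint64(fieldNum)<<3|2)
	out = binary.AppendUvarint(out, uint64(len(payload)))
	return append(out, payload...)
}

func main() {
	// `repeated int32 number = 3;` holding [1, 2, 300]
	// Prints: 1a 04 01 02 ac 02
	fmt.Printf("% x\n", encodePackedInt32(3, []int32{1, 2, 300}))
}
```

Message-typed elements such as Requests cannot be packed; each element is written with its own key instead.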

map

message MapMessage{
    map<string, string> message = 1;
    map<string, SearchRequest> request = 2;
}

Besides slices there are, of course, maps. The key can be any scalar type except double, float, and bytes; the value can be any scalar, enum, or message type. A protobuf map compiles to a Go map:

// ...
// map
type MapMessage struct {
	Message              map[string]string         `protobuf:"bytes,1,rep,name=message,proto3" json:"message,omitempty" protobuf_key:"bytes,1,opt,name=key,proto3" protobuf_val:"bytes,2,opt,name=value,proto3"`
	Request              map[string]*SearchRequest `protobuf:"bytes,2,rep,name=request,proto3" json:"request,omitempty" protobuf_key:"bytes,1,opt,name=key,proto3" protobuf_val:"bytes,2,opt,name=value,proto3"`
	XXX_NoUnkeyedLiteral struct{}                  `json:"-"`
	XXX_unrecognized     []byte                    `json:"-"`
	XXX_sizecache        int32                     `json:"-"`
}
// ...
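The protobuf_key and protobuf_val tags hint at how a map travels on the wire: as a repeated entry message whose key is field 1 and whose value is field 2. Below is a minimal standard-library sketch of that layout for map<string, string>; appendString and encodeStringMap are illustrative helpers, and the keys are sorted only to make the output deterministic (wire order of map entries is not guaranteed).

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// appendString writes one length-delimited string field (wire type 2).
func appendString(b []byte, fieldNum int, s string) []byte {
	b = binary.AppendUvarint(b, uint64(fieldNum)<<3|2)
	b = binary.AppendUvarint(b, uint64(len(s)))
	return append(b, s...)
}

// encodeStringMap encodes map<string, string> as repeated entries,
// each entry being a small message with key = 1 and value = 2.
func encodeStringMap(fieldNum int, m map[string]string) []byte {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // only for a reproducible sketch

	var out []byte
	for _, k := range keys {
		entry := appendString(nil, 1, k)
		entry = appendString(entry, 2, m[k])
		out = binary.AppendUvarint(out, uint64(fieldNum)<<3|2)
		out = binary.AppendUvarint(out, uint64(len(entry)))
		out = append(out, entry...)
	}
	return out
}

func main() {
	// `map<string, string> message = 1;` holding {"a": "b"}
	// Prints: 0a 06 0a 01 61 12 01 62
	fmt.Printf("% x\n", encodeStringMap(1, map[string]string{"a": "b"}))
}
```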

any

// ...
import "google/protobuf/any.proto";
// ...
message AnyMessage {
    string message = 1;
    google.protobuf.Any details = 2;
}
// ...

The Any type can hold an arbitrary serialized message without the enclosing message declaring its concrete type. To use it, import google/protobuf/any.proto. Encoding and decoding of Any fields is handled by each language's runtime library. In Go, for example, we can read and write an Any field as follows:

// ...
import "github.com/golang/protobuf/ptypes"

func getSetAny() {
	fmt.Println("getSetAny")
	req := &examplepb.SearchRequest{
		Query: "query",
	}
	// Pack SearchRequest into an Any
	a, err := ptypes.MarshalAny(req)
	if err != nil {
		log.Println(err)
		return
	}
	// Assign it to the Any field
	anyMsg := &examplepb.AnyMessage{
		Message: "any message",
		Details: a,
	}

	req = &examplepb.SearchRequest{}
	// Restore the proto message from the Any
	err = ptypes.UnmarshalAny(anyMsg.Details, req)
	if err != nil {
		log.Println(err)
	}
	fmt.Println("	any:", req)
}
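Under the hood an Any carries just two things: a type URL that names the wrapped message and the serialized bytes. The sketch below models that with a hand-written struct so the idea can be tried without the protobuf runtime; anyLike, pack, and is are illustrative names, and the "type.googleapis.com/" prefix is the default one used for the type URL.

```go
package main

import (
	"fmt"
	"strings"
)

// anyLike mirrors the two fields of google.protobuf.Any.
type anyLike struct {
	TypeURL string // e.g. "type.googleapis.com/example.everything.SearchRequest"
	Value   []byte // the serialized message
}

// pack wraps already-serialized bytes together with the
// fully qualified name of the message they came from.
func pack(fullName string, serialized []byte) anyLike {
	return anyLike{TypeURL: "type.googleapis.com/" + fullName, Value: serialized}
}

// is reports whether the wrapped message has the given full name.
// Only the part of the type URL after the last slash is compared.
func (a anyLike) is(fullName string) bool {
	return a.TypeURL[strings.LastIndex(a.TypeURL, "/")+1:] == fullName
}

func main() {
	a := pack("example.everything.SearchRequest", []byte{0x0a, 0x02, 'g', 'o'})
	fmt.Println(a.TypeURL)
	fmt.Println(a.is("example.everything.SearchRequest")) // true
	fmt.Println(a.is("example.everything.Status"))        // false
}
```

A receiver would normally check the type URL in this way before deciding which message type to unpack the bytes into.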

oneof

// oneof
message OneOfMessage {
    oneof test_oneof {
        string m1 = 1;
        int32 m2 = 2;
    }
}

If a message has several fields but only one of them may be set at a time, oneof enforces that behavior. Setting any field in a oneof clears the others. In the example above, test_oneof is either m1 of type string or m2 of type int32. Reading and writing a oneof in Go looks like this:

func getSetOneof() {
	fmt.Println("getSetOneof")
	oneof := &examplepb.OneOfMessage{
		// Only one value can be set at a time
		TestOneof: &examplepb.OneOfMessage_M1{
			M1: "this is m1",
		},
	}
	fmt.Println("	m1:", oneof.GetM1()) // this is m1
	fmt.Println("	m2:", oneof.GetM2()) // 0
}
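To find out which member of a oneof is set, Go code usually type-switches on the wrapper stored in the oneof field. The sketch below uses hand-written stand-ins shaped like the generated code (the interface name isTestOneof and the describe helper are assumptions made for this example) so that it runs without the generated package.

```go
package main

import "fmt"

// isTestOneof is implemented only by the wrapper types below.
type isTestOneof interface{ isTestOneof() }

// One wrapper type per oneof member, as in the generated code.
type OneOfMessage_M1 struct{ M1 string }
type OneOfMessage_M2 struct{ M2 int32 }

func (*OneOfMessage_M1) isTestOneof() {}
func (*OneOfMessage_M2) isTestOneof() {}

type OneOfMessage struct {
	TestOneof isTestOneof // holds at most one wrapper at a time
}

// GetM1 returns m1 if it is the member currently set, else "".
func (m *OneOfMessage) GetM1() string {
	if x, ok := m.TestOneof.(*OneOfMessage_M1); ok {
		return x.M1
	}
	return ""
}

// GetM2 returns m2 if it is the member currently set, else 0.
func (m *OneOfMessage) GetM2() int32 {
	if x, ok := m.TestOneof.(*OneOfMessage_M2); ok {
		return x.M2
	}
	return 0
}

// describe uses a type switch to report which member is set.
func describe(m *OneOfMessage) string {
	switch x := m.TestOneof.(type) {
	case *OneOfMessage_M1:
		return "m1=" + x.M1
	case *OneOfMessage_M2:
		return fmt.Sprintf("m2=%d", x.M2)
	default:
		return "unset"
	}
}

func main() {
	msg := &OneOfMessage{TestOneof: &OneOfMessage_M1{M1: "this is m1"}}
	fmt.Println(describe(msg))
	msg.TestOneof = &OneOfMessage_M2{M2: 7} // assigning m2 replaces m1
	fmt.Println(describe(msg), msg.GetM1() == "")
}
```

Because the field is a single interface value, assigning a new wrapper is what clears the previously set member.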

options & extensions

I believe most gophers seldom pay attention to options when using Protobuf, and 80% of development work never needs them directly. Still, options are a useful feature that greatly increases Protobuf's extensibility and are worth knowing about. Options are defined as built-in protobuf message types and fall into the following levels:

  • File level (file options)
  • Message level (message options)
  • Field level (field options)
  • Service level (service options)
  • Method level (method options)

Protobuf ships with built-in options, and the extend keyword lets you add custom ones.

In proto2 syntax extend can be applied to any message, but in proto3 syntax it can only be applied to the messages that define options, that is, only to declare custom options.

An option does not change the overall meaning of a declaration (an int32 remains an int32 whatever options are attached), but it may affect how the declaration is handled in a particular context. For example, the built-in deprecated option marks a field as deprecated:

message Msg {
    string foo = 1;
    string bar = 2 [deprecated = true]; // Marked as deprecated
}

When writing a custom protoc plug-in, we can use custom options to pass extra information to the plug-in. For example, suppose I want to develop a proto validation plug-in that generates an xxx.Validate() method for validating a message. I could define custom options that supply the information needed to generate that code:

message Msg {
    // required is a custom option indicating that the foo field must not be empty
    string foo = 1 [(required) = true];
}
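For a sense of what such a plug-in could emit, here is a purely hypothetical sketch: Msg stands in for the generated struct, and Validate is the kind of method the imagined validation plug-in might generate for a field carrying the custom required option. None of this is the output of a real tool.

```go
package main

import (
	"errors"
	"fmt"
)

// Msg stands in for the struct generated from message Msg.
type Msg struct {
	Foo string
}

// Validate is what the hypothetical plug-in might generate
// for a string field marked with the custom required option.
func (m *Msg) Validate() error {
	if m.Foo == "" {
		return errors.New("Msg.foo is required")
	}
	return nil
}

func main() {
	fmt.Println((&Msg{}).Validate())         // Msg.foo is required
	fmt.Println((&Msg{Foo: "x"}).Validate()) // <nil>
}
```

The option itself carries no behavior; it is only metadata that the plug-in reads from the descriptor while generating code like this.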

The built-in options are defined at github.com/protocolbuf… Each level of options corresponds to one message:

  • FileOptions — File level
  • MessageOptions — Message level
  • FieldOptions — Field level
  • ServiceOptions — Service level
  • MethodOptions — Method level

The following walks you through each of these levels of options and how to extend them with examples.

File level

// ...
option go_package = "examplepb";  // The package name of the generated Go code
// ...
message extObj {
    string foo_string = 1;
    int64 bar_int = 2;
}
// file options
extend google.protobuf.FileOptions {
    string file_opt_string = 1001;
    extObj file_opt_obj = 1002;
}
option (example.everything.file_opt_string) = "file_options";
option (example.everything.file_opt_obj) = {
    foo_string: "foo"
    bar_int: 1
};

go_package is, of course, a built-in Protobuf option that sets the package name of the generated Go code. Besides using the built-in FileOptions, we can extend them with the extend keyword. In the example above we added two new options, file_opt_string of type string and file_opt_obj of type extObj, and set both file-level options with the option keyword. In Go, we can read these options like this:

func getFileOptions() {
	fmt.Println("file options:")
	msg := &examplepb.MessageOption{}
	// The first return value is the file descriptor, which carries the file options
	fd, _ := descriptor.MessageDescriptorProto(msg)
	stringOpt, _ := proto.GetExtension(fd.Options, examplepb.E_FileOptString)
	objOpt, _ := proto.GetExtension(fd.Options, examplepb.E_FileOptObj)
	fmt.Println("	obj.foo_string:", objOpt.(*examplepb.ExtObj).FooString)
	fmt.Println("	obj.bar_int", objOpt.(*examplepb.ExtObj).BarInt)
	fmt.Println("	string:", *stringOpt.(*string))
}

Print result:

file options:
	obj.foo_string: foo
	obj.bar_int 1
	string: file_options

Message level

// message options
extend google.protobuf.MessageOptions {
    string msg_opt_string = 1001;
    extObj msg_opt_obj = 1002;
}
message MessageOption {
    option (example.everything.msg_opt_string) = "Hello world!";
    option (example.everything.msg_opt_obj) = {
        foo_string: "foo"
        bar_int:1
    };
    string foo = 1;
}

This works the same way as the file level, so the explanation is not repeated. Reading the options in Go:

func getMessageOptions() {
	fmt.Println("message options:")
	msg := &examplepb.MessageOption{}
	// The second return value is the message descriptor
	_, md := descriptor.MessageDescriptorProto(msg)
	objOpt, _ := proto.GetExtension(md.Options, examplepb.E_MsgOptObj)
	stringOpt, _ := proto.GetExtension(md.Options, examplepb.E_MsgOptString)
	fmt.Println("	obj.foo_string:", objOpt.(*examplepb.ExtObj).FooString)
	fmt.Println("	obj.bar_int", objOpt.(*examplepb.ExtObj).BarInt)
	fmt.Println("	string:", *stringOpt.(*string))
}

Field level

// field options
extend google.protobuf.FieldOptions {
    string field_opt_string = 1001;
    extObj field_opt_obj = 1002;
}
message FieldOption {
    // Custom option
    string foo= 1 [(example.everything.field_opt_string) = "abc",(example.everything.field_opt_obj) = {
        foo_string: "foo"
        bar_int:1
    }];
    // Protobuf built-in option
    string bar = 2 [deprecated = true];
}

Field-level options are not set with the option keyword. Instead they are written as a comma-separated list of key = value pairs wrapped in []. In Go, we can read these options like this:

func getFieldOptions() {
	fmt.Println("field options:")
	msg := &examplepb.FieldOption{}
	_, md := descriptor.MessageDescriptorProto(msg)
	// Field[0] is foo, the field that carries the custom options
	stringOpt, _ := proto.GetExtension(md.Field[0].Options, examplepb.E_FieldOptString)
	objOpt, _ := proto.GetExtension(md.Field[0].Options, examplepb.E_FieldOptObj)
	fmt.Println("	obj.foo_string:", objOpt.(*examplepb.ExtObj).FooString)
	fmt.Println("	obj.bar_int", objOpt.(*examplepb.ExtObj).BarInt)
	fmt.Println("	string:", *stringOpt.(*string))
}

Reference project: github.com/mwitkow/go-… go-proto-validators is a protoc plug-in that generates validation code for proto messages, using field-level options to define the validation rules.

Service and Method levels

// service & method options
extend google.protobuf.ServiceOptions {
    string srv_opt_string = 1001;
    extObj srv_opt_obj = 1002;
}
extend google.protobuf.MethodOptions {
    string method_opt_string = 1001;
    extObj method_opt_obj = 1002;
}
service ServiceOption {
    option (example.everything.srv_opt_string) = "foo";
    rpc Search (SearchRequest) returns (SearchResponse) {
        option (example.everything.method_opt_string) = "foo";
        option (example.everything.method_opt_obj) = {
            foo_string: "foo"
            bar_int: 1
        };
    };
}

Service-level and method-level options are also set with the option keyword, just like file-level and message-level options. Reading them in Go:

func getServiceOptions() {
	fmt.Println("service options:")
	msg := &examplepb.MessageOption{}
	// Services are listed on the file descriptor (the first return value)
	fd, _ := descriptor.MessageDescriptorProto(msg)
	srv := fd.Service[1] // ServiceOption
	stringOpt, _ := proto.GetExtension(srv.Options, examplepb.E_SrvOptString)
	fmt.Println("	string:", *stringOpt.(*string))
}

func getMethodOptions() {
	fmt.Println("method options:")
	msg := &examplepb.MessageOption{}
	fd, _ := descriptor.MessageDescriptorProto(msg)
	srv := fd.Service[1] // ServiceOption
	objOpt, _ := proto.GetExtension(srv.Method[0].Options, examplepb.E_MethodOptObj)
	stringOpt, _ := proto.GetExtension(srv.Method[0].Options, examplepb.E_MethodOptString)
	fmt.Println("	obj.foo_string:", objOpt.(*examplepb.ExtObj).FooString)
	fmt.Println("	obj.bar_int", objOpt.(*examplepb.ExtObj).BarInt)
	fmt.Println("	string:", *stringOpt.(*string))
}

Reference project: github.com/grpc-ecosys… grpc-gateway describes the mapping from gRPC to HTTP through options on RPC methods, and controls Swagger generation through file-level and service-level options.

References

developers.google.cn/protocol-bu…
developers.google.cn/protocol-bu…
github.com/mwitkow/go-…
github.com/grpc-ecosys…