The fundamentals of gRPC and Thrift, and how to choose between them
Preface
I have used both gRPC and Thrift, but only as a user. I once had a rough understanding of how they work, and over time I forgot it. If I were asked to choose between the two, I would be even more at a loss. So I want to organize what I learned before and fill in this blind spot.
RPC review
For more information, see “Basic RPC Series 1: Talking about RPC”.
The goal of an RPC framework is to make remote service calls simple and transparent. The framework hides the underlying transport (TCP or UDP), the serialization format (XML, JSON, or binary), and the communication details. A caller can invoke a remote service provider as if it were a local interface, without caring about how the call travels over the network.
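To make the "call it like a local interface" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen for illustration: the `Calculator` service, the `ClientStub` and `dispatch` helpers, and the in-memory `transport` function that stands in for a real TCP connection. JSON is used for serialization only because it is easy to read.

```python
import json


class Calculator:
    """Service implementation that lives on the 'server' side."""

    def add(self, a, b):
        return a + b


def dispatch(service, request_bytes):
    """Server side: decode the request, invoke the method, encode the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    method = getattr(service, request["method"])
    result = method(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Client side: turns a method call into a serialized request."""

    def __init__(self, transport):
        # transport: callable that takes request bytes and returns reply bytes.
        self._transport = transport

    def __getattr__(self, name):
        def remote_call(*args):
            payload = json.dumps({"method": name, "args": list(args)})
            reply = self._transport(payload.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]

        return remote_call


# In-memory stand-in for the network: bytes go in, bytes come out.
service = Calculator()
stub = ClientStub(lambda data: dispatch(service, data))

# The caller never touches serialization or transport code.
print(stub.add(2, 3))
```

The caller only sees `stub.add(2, 3)`. Swapping JSON for a binary format or the lambda for a socket would not change that line, which is the transparency an RPC framework aims for.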
gRPC
gRPC overview
gRPC is a high-performance, general-purpose, open source RPC framework that Google released in 2015. It was designed with mobile application development in mind, is built on the HTTP/2 protocol standard, uses ProtoBuf as its serialization protocol, and supports many programming languages.
Because the framework is open source, both sides of the communication can build on it. Client and server code can then focus on business-level content and pay less attention to the underlying communication, which gRPC implements.
As shown in the figure below, the DATA part is the business-level content, and everything beneath it is encapsulated by gRPC.
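One piece of that encapsulation can be sketched in a few lines. In gRPC over HTTP/2, each serialized message is sent with a 5-byte prefix: a 1-byte compressed flag followed by a 4-byte big-endian length. The function names `frame_message` and `unframe_message` below are hypothetical, and the payload is a placeholder for the serialized business data. This is a sketch of the framing only, not of a full gRPC call.

```python
import struct


def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with the 5-byte gRPC message header."""
    # 1-byte compressed flag + 4-byte big-endian message length.
    return struct.pack(">BI", 1 if compressed else 0, len(payload)) + payload


def unframe_message(data: bytes):
    """Split one framed message back into (compressed flag, payload)."""
    flag, length = struct.unpack(">BI", data[:5])
    return bool(flag), data[5:5 + length]


business_data = b"hello"  # placeholder for the serialized DATA part
framed = frame_message(business_data)

print(framed.hex())
print(unframe_message(framed))
```

The application only ever supplies `business_data`. The prefix, and the HTTP/2 frames that carry it, are added by the framework.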
gRPC characteristics
- Language neutral, with support for multiple languages;
- Services are defined in IDL files, and the Protocol Buffers compiler (proto3) generates the data structures, server interfaces, and client stubs for the specified language;
- The communication protocol is built on the HTTP/2 standard, which supports bidirectional streaming, header compression, multiplexing over a single TCP connection, server push, and other features. These make gRPC save power and network traffic on mobile devices;
- Serialization supports Protocol Buffers (PB) and JSON. PB is a high-performance, language-independent serialization framework, and the combination of HTTP/2 and PB is what gives RPC calls their high performance.
gRPC interaction process
- After the gRPC function is enabled, the switch acts as the gRPC client and the collection server acts as the gRPC server.
- The switch builds data in the corresponding format (GPB or JSON) for the subscribed events and encodes it with Protocol Buffers according to the proto file. It then establishes a gRPC channel with the server and sends request messages over the gRPC protocol.
- After receiving a request message, the server decodes it with Protocol Buffers according to the proto file and restores the originally defined data structure for business processing.
- After processing the data, the server encodes the response data with Protocol Buffers again and sends the response message to the switch through gRPC.
- After receiving the response message, the switch ends the gRPC interaction.
Simply put, once the gRPC function is enabled, a connection is established between the client and the server, and the subscription data configured on the device is pushed to the server. Throughout the process, the structured data to be handled is defined in the proto file using Protocol Buffers.
What is Protocol Buffers?
Protocol Buffers (ProtoBuf) is a flexible and efficient structured data format. It plays a role similar to XML and JSON and is well suited to scenarios that need high performance and fast responses. ProtoBuf has three main functions in the gRPC framework:
- Defining data structures
- Defining the service interface
- Improving transmission efficiency through serialization and deserialization
Why does ProtoBuf improve transport efficiency?
When data is encoded as XML or JSON, the text format is easy to read, but exchanging and parsing it costs a lot of CPU and I/O, which naturally lowers the transmission rate. Protocol Buffers differs in that it serializes the data into binary form before transmitting it.
Written out as a definition, Protocol Buffers content looks much like the JSON or XML equivalent and is just as intuitive, but that text form is only for the operator to read. What is actually transmitted is serialized binary data, which takes far fewer bytes than JSON or XML and is faster to process.
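The size difference can be checked by hand. The sketch below encodes a small record using the Protocol Buffers wire format (a varint tag holding the field number and wire type, followed by the value) and compares it with the JSON text. The record and its schema are assumptions made for illustration: `id` is field 1 (an integer) and `name` is field 2 (a string). The helper functions are hand-rolled and are not the protobuf library API.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Wire type 0 (varint): tag, then the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2 (length-delimited): tag, length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Hypothetical record; field numbers 1 and 2 are assumed, not taken from a real schema.
record = {"id": 150, "name": "switch-1"}
binary = encode_int_field(1, record["id"]) + encode_string_field(2, record["name"])
text = json.dumps(record).encode("utf-8")

print(binary.hex())
print(len(binary), "bytes as binary,", len(text), "bytes as JSON")
```

The binary form carries no field names or punctuation, only small numeric tags. For this record that comes to 13 bytes against 31 bytes of JSON.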
How does it support multiple platforms and languages?
Another advantage is that Protocol Buffers comes with its own compiler. The proto file mentioned earlier must be run through this compiler to produce a library in the chosen language, and data applications are then developed on top of that library. Which programming language is the library generated in? On a live network, the O&M staff responsible for network devices and those responsible for servers are often different teams, and they may be used to different programming languages for O&M development. This is where one of the strengths of Protocol Buffers comes in: it is cross-language.
From the above, we can conclude that Protocol Buffers has these advantages over JSON and XML as an encoding:
- Simple and compact: the serialized data is only about 1/10 to 1/3 the size of the text equivalent;
- Fast to transmit and parse: parsing is reported to be about 20 times faster than with XML;
- Strong compiler support: the bundled compiler generates code for many languages.
Designed on the HTTP/2 standard
Besides Protocol Buffers, gRPC has another advantage that shows in the interaction diagram and the layered framework: it is built on HTTP/2.
Because gRPC is designed on the HTTP/2 standard, it gains more powerful features such as multiplexing, binary framing, header compression, and push mechanisms. These bring significant benefits to the device, such as saving bandwidth, reducing the number of TCP connections, and lowering CPU usage. gRPC can be used on both the client and the server, which makes communication between the two ends transparent and simplifies building the communication system.
HTTP comes in two generations, HTTP/1.x and HTTP/2. HTTP/1.x is the most widely used version, and HTTP/2 is the second major version of the Hypertext Transfer Protocol. HTTP/1.x defines request methods such as GET, POST, PUT, and DELETE for interacting with the server, and HTTP/2 retains them. What's new in HTTP/2:
- Bidirectional streaming and multiplexing
- Binary framing
- Header compression
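Binary framing and multiplexing can be sketched together. Every HTTP/2 frame starts with a 9-byte header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The helper names `pack_frame` and `unpack_header` are hypothetical, and the payloads are placeholders. Header compression (HPACK) is not covered here.

```python
import struct

FRAME_TYPE_DATA = 0x0
FLAG_END_STREAM = 0x1


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Build one HTTP/2 frame: 9-byte header followed by the payload."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        (length >> 16) & 0xFF,   # upper 8 bits of the 24-bit length
        length & 0xFFFF,         # lower 16 bits of the length
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,  # top bit is reserved
    )
    return header + payload


def unpack_header(header: bytes):
    """Return (length, type, flags, stream_id) from a 9-byte frame header."""
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBI", header[:9])
    return (hi << 16) | lo, frame_type, flags, stream_id & 0x7FFFFFFF


# Two requests share one connection: their frames are interleaved on the wire.
wire = (
    pack_frame(FRAME_TYPE_DATA, 0, 1, b"req-A part 1 ")
    + pack_frame(FRAME_TYPE_DATA, FLAG_END_STREAM, 3, b"req-B")
    + pack_frame(FRAME_TYPE_DATA, FLAG_END_STREAM, 1, b"part 2")
)

# The receiver regroups payloads by stream id.
streams = {}
offset = 0
while offset < len(wire):
    length, _type, _flags, stream_id = unpack_header(wire[offset:offset + 9])
    payload = wire[offset + 9:offset + 9 + length]
    streams[stream_id] = streams.get(stream_id, b"") + payload
    offset += 9 + length

print(streams)
```

Because each frame names its stream, a short request on stream 3 does not have to wait for the longer one on stream 1 to finish. This is how a single TCP connection can carry many concurrent calls.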
Thrift
Introduction to Thrift
Thrift is a scalable RPC software framework for cross-language services. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly across multiple languages. It was open-sourced by Facebook in 2007 and donated to the Apache Software Foundation, where it is now a top-level project. It has the following characteristics:
- Support for multiple languages: C, C++, C#, D, Delphi, Erlang, Go, Haxe, Haskell, Java, JavaScript, Node.js, OCaml, Perl, PHP, Python, Ruby, Smalltalk
- Message definition files support annotations, data structures are kept separate from their transport representation, and multiple message formats are supported
- A complete client/server stack is included for implementing RPC quickly, with support for both synchronous and asynchronous communication
Thrift architecture
Thrift is an RPC (remote procedure call) framework that includes serialization and service communication capabilities. It is also used as a microservice framework. Its main feature, and its most attractive one, is that it works across languages.
In the figure, the top layer is the business logic implemented by the user. Below it, Service.Client and write()/read() are the client and server code that Thrift generates from the IDL, corresponding to the client stub and server stub in RPC. TProtocol serializes and deserializes data in binary, JSON, or another format defined by Apache Thrift. TTransport provides the data transfer capability, so with Apache Thrift it is easy to define a service and choose among different transports.
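The layering described above can be sketched in plain Python. All class names here (`MemoryTransport`, `SimpleBinaryProtocol`, `AddClient`, `AddProcessor`, `Handler`) are hypothetical stand-ins for the roles of TTransport, TProtocol, the generated client and server code, and the user's business logic. They are not the Thrift library API, and the byte layout is simplified.

```python
import struct


class MemoryTransport:
    """Transport layer: moves raw bytes (here, through an in-memory buffer)."""

    def __init__(self):
        self._buffer = bytearray()

    def write(self, data: bytes) -> None:
        self._buffer += data

    def read(self, size: int) -> bytes:
        data = bytes(self._buffer[:size])
        del self._buffer[:size]
        return data


class SimpleBinaryProtocol:
    """Protocol layer: turns typed values into bytes and back."""

    def __init__(self, transport):
        self._transport = transport

    def write_i32(self, value: int) -> None:
        self._transport.write(struct.pack(">i", value))

    def read_i32(self) -> int:
        return struct.unpack(">i", self._transport.read(4))[0]

    def write_string(self, text: str) -> None:
        data = text.encode("utf-8")
        self.write_i32(len(data))
        self._transport.write(data)

    def read_string(self) -> str:
        length = self.read_i32()
        return self._transport.read(length).decode("utf-8")


class AddClient:
    """Stands in for the generated client stub: writes the call to the protocol."""

    def __init__(self, protocol):
        self._protocol = protocol

    def send_add(self, a: int, b: int) -> None:
        self._protocol.write_string("add")
        self._protocol.write_i32(a)
        self._protocol.write_i32(b)


class AddProcessor:
    """Stands in for the generated server stub: reads the call, runs the handler."""

    def __init__(self, handler):
        self._handler = handler

    def process(self, protocol) -> int:
        method = protocol.read_string()
        a = protocol.read_i32()
        b = protocol.read_i32()
        return getattr(self._handler, method)(a, b)


class Handler:
    """User-written business logic."""

    def add(self, a: int, b: int) -> int:
        return a + b


transport = MemoryTransport()
protocol = SimpleBinaryProtocol(transport)
AddClient(protocol).send_add(2, 40)
result = AddProcessor(Handler()).process(protocol)
print(result)
```

Because each layer only talks to the one below it, the protocol or the transport can be replaced without touching `Handler` or the stubs. That separation is what lets Thrift offer several protocols and transports.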
Thrift network stack structure
Thrift uses sockets for data transmission. Data is sent in a specific format and parsed by the receiver. Once we have defined Thrift's IDL file, we can use the Thrift compiler to generate interfaces and models for each target language. The generated model and interface code includes the encoding and decoding logic. The structure of the Thrift network stack is as follows:
TTransport layer
This layer represents how Thrift transfers data. Thrift defines the following common transports:
- TSocket: blocking socket;
- TFramedTransport: sends data in frames, each prefixed with its length, and is used with non-blocking servers;
- TFileTransport: reads from and writes to a file.
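The framing idea behind TFramedTransport is simple enough to sketch: each message is written as a 4-byte big-endian length followed by the payload, so the reader knows exactly how many bytes make up one complete message. The functions `write_frame` and `read_frame` are hypothetical, and an `io.BytesIO` buffer stands in for the socket.

```python
import io
import struct


def write_frame(stream, payload: bytes) -> None:
    """Write one frame: 4-byte big-endian length, then the payload."""
    stream.write(struct.pack(">i", len(payload)) + payload)


def read_frame(stream) -> bytes:
    """Read exactly one frame back from the stream."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("stream ended before a full frame header")
    (length,) = struct.unpack(">i", header)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("stream ended before the full frame payload")
    return payload


# A BytesIO buffer stands in for the socket.
wire = io.BytesIO()
write_frame(wire, b"first call")
write_frame(wire, b"second")
wire.seek(0)

first = read_frame(wire)
second = read_frame(wire)
print(first, second)
```

A non-blocking server benefits from this because it can buffer incoming bytes until a whole frame has arrived and only then hand the message to a handler.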
TProtocol layer
This layer represents the format (such as JSON) in which data is encoded when it travels between the Thrift client and server. Thrift defines several common formats:
- TBinaryProtocol: binary format;
- TCompactProtocol: compact binary format;
- TJSONProtocol: JSON format;
- TSimpleJSONProtocol: a write-only JSON protocol.
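The difference between the binary and compact formats shows up even for a single 32-bit integer. TBinaryProtocol writes an i32 as 4 fixed big-endian bytes, while TCompactProtocol writes it as a zigzag-encoded varint. The sketch below covers only that integer encoding, with hypothetical helper names. Real messages also carry field headers and type information.

```python
import struct


def binary_i32(value: int) -> bytes:
    """Fixed-width encoding: always 4 bytes, big-endian."""
    return struct.pack(">i", value)


def zigzag(value: int) -> int:
    """Map signed 32-bit integers to unsigned so small magnitudes stay small."""
    return (value << 1) ^ (value >> 31)


def varint(value: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, low group first."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def compact_i32(value: int) -> bytes:
    """Variable-width encoding: zigzag first, then varint."""
    return varint(zigzag(value))


for n in (1, -1, 300, 100000):
    print(n, binary_i32(n).hex(), compact_i32(n).hex())
```

Small values take one or two bytes in the compact form instead of four. The trade-off is a little more work per value when encoding and decoding.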
Server model
- TSimpleServer: simple single-threaded service model, often used for testing;
- TThreadPoolServer: multi-threaded service model that uses standard blocking IO;
- TNonblockingServer: multi-threaded service model that uses non-blocking IO (TFramedTransport is required);
- THsHaServer: half-sync/half-async model that adds a thread pool to the non-blocking server. IO events (accept/read/write) are handled asynchronously, and the handlers process RPC calls synchronously.
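The difference between the first two models can be sketched without any networking. In the simplified example below, a list of integers stands in for incoming requests, and the function names `serve_simple` and `serve_thread_pool` are hypothetical. Real servers accept sockets and decode messages, and the non-blocking and half-sync/half-async models are not shown.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(request: int):
    """Stand-in for the RPC handler; records which thread served the call."""
    return request * 2, threading.current_thread().name


def serve_simple(requests):
    """TSimpleServer-style: one thread handles requests one after another."""
    return [handle(r) for r in requests]


def serve_thread_pool(requests, workers: int = 4):
    """TThreadPoolServer-style: a fixed pool of worker threads shares the requests."""
    with ThreadPoolExecutor(max_workers=workers, thread_name_prefix="worker") as pool:
        return list(pool.map(handle, requests))


requests = list(range(8))
simple = serve_simple(requests)
pooled = serve_thread_pool(requests)

print([value for value, _ in simple])
print(sorted({name for _, name in pooled}))
```

Both models return the same results. The difference is that the pool lets a slow handler occupy one worker while the others keep serving, at the cost of managing the extra threads.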
gRPC vs. Thrift
Functional comparison
Here are two comparison screenshots taken directly from the web:
Performance comparison
These conclusions also come from test results published online and are for reference only:
- Overall, persistent (long-lived) connections outperform short-lived connections by more than a factor of two.
- Comparing the two RPC frameworks in Go, Thrift clearly outperforms gRPC, by more than a factor of two.
- Comparing the two languages under Thrift, Go and C++ perform about the same over persistent connections, while Go is about twice as fast as C++ over short-lived connections.
- Comparing TSimpleServer and TNonblockingServer under Thrift with C++: with a single-process client over a persistent connection, TNonblockingServer performs worse than TSimpleServer because of thread management overhead. Over short-lived connections, the main cost is connection establishment, so the thread pool management cost is negligible.
- Both RPC frameworks are very stable in both languages: 50,000 requests take about five times as long as 10,000.
How to choose
When to choose gRPC over Thrift:
- You need good documentation and examples
- You prefer, or are already used to, HTTP/2 and ProtoBuf
- You are sensitive to network bandwidth
When to choose Thrift over gRPC:
- Data needs to be exchanged between a very large number of languages
- You are sensitive to CPU usage
- You need fine-grained control over the protocol layer and transport layer
- You require a stable version
- You do not need good documentation and examples
Conclusion
This article should give a fairly detailed account of the characteristics of gRPC and Thrift and the differences between them. Apart from source code walkthroughs, I have not yet found a better summary than this one. (Personally, I don't recommend starting with the source code. Once we understand how each framework works and how they differ, it is easy to use them and to choose between them.)
The whole article can be summed up as follows:
- The core of gRPC is ProtoBuf, followed by HTTP/2. Because it reuses an existing protocol rather than inventing its own, the emphasis is on ProtoBuf.
- Thrift offers ready-made data formats, but it builds its own wheels for both the transport layer and the server, which is why it can meet a variety of control requirements at the protocol and transport layers.
For more articles, follow the WeChat official account “Lou Zai advanced road”.