What problem does RPC solve?

Most readers are probably familiar with Remote Procedure Call (RPC). If you are not, here is the definition from Wikipedia:

In distributed computing a remote procedure call (RPC) is when a computer program causes a procedure (subroutine) to execute in another address space (commonly on another computer on a shared network), which is coded as if it were a normal (local) procedure call, without the programmer explicitly coding the details for the remote interaction.

This definition mentions three things:

Distributed environment.
It is written just like calling a local function.
The programmer does not have to worry about the details of the interaction with the remote.

These three points can also be used to describe the problems that RPC frameworks need to solve.

RPC itself involves little technical sophistication, which is why a quick search on GitHub shows that RPC libraries, like web libraries, are a hotbed of reinvented wheels.

That said, writing toy code can be fun for programmers. I believe that after reading this article you too will join the ranks of those reinventing the RPC library wheel.

Before we get into the subject, let’s remove some ambiguity.

Web development also has a concept called RPC: a class of “action”-based APIs designed on top of HTTP, as opposed to the more popular RESTful style:

GET /operation?id=anId

The RPC discussed in this article is RPC at the code level.

Server A needs to fetch data from another server in order to process a request.

var data = serviceDelegate.GetData(id);

As the code suggests, what GetData does internally is package the service ID, method ID, and parameters into a message, send it out, wait for the other server to return a message, and then call back into the corresponding method of the initiator.
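The packaging step can be sketched roughly as follows, here in Python. Everything in it is an assumption made for illustration: the names ServiceDelegate and LoopbackTransport, the numeric IDs, and the JSON message layout. A real framework would hand the bytes to a network library and wait for the reply asynchronously.

```python
import json

# Hypothetical IDs; a real framework would generate these from a service definition.
DATA_SERVICE_ID = 1
GET_DATA_METHOD_ID = 10


class LoopbackTransport:
    """Stand-in for the network: hands the request bytes straight to a handler."""

    def __init__(self, handler):
        self._handler = handler

    def request(self, payload: bytes) -> bytes:
        return self._handler(payload)


def handle_request(payload: bytes) -> bytes:
    """Receiving side: unpack the message and run the matching method."""
    msg = json.loads(payload)
    assert msg["service"] == DATA_SERVICE_ID
    assert msg["method"] == GET_DATA_METHOD_ID
    (item_id,) = msg["params"]
    return json.dumps({"result": f"data-for-{item_id}"}).encode("utf-8")


class ServiceDelegate:
    """Caller-side proxy: looks like a local object, sends messages underneath."""

    def __init__(self, transport):
        self._transport = transport

    def get_data(self, item_id: int) -> str:
        # Package service ID, method ID and parameters into one message.
        request = json.dumps({
            "service": DATA_SERVICE_ID,
            "method": GET_DATA_METHOD_ID,
            "params": [item_id],
        }).encode("utf-8")
        reply = self._transport.request(request)  # send and wait for the reply
        return json.loads(reply)["result"]


service_delegate = ServiceDelegate(LoopbackTransport(handle_request))
data = service_delegate.get_data(42)
print(data)  # data-for-42
```

The caller only ever touches get_data; the message format and the transport can change without the call site noticing.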

Three years ago, Novella worked on a browser game at Tencent. Perhaps for historical reasons, the project did not use RPC, so writing code as a gameplay-logic programmer was painful.

As a simple example, to send a request packet with two parameters, a and b, the application-layer programmer had to allocate a request object, manually assign a and b, manually call the serialization function, and finally send it. Receiving a packet involved a similar sequence.

With RPC, sending the request would be a single line that reads like a normal function call; on the receiving side, the framework would automatically call back a function whose signature matches the packet’s parameters. Application programmers would not need to care about packets, serialization, or deserialization, and would write far less repetitive code.

Of course, RPC is by no means perfect. Those of you who have read The Art of Unix Programming may remember the chapters criticizing RPC in the book.

The RPC interface does not describe itself.

RPC is too easy to extend and increases system complexity.

Because of poor RPC transparency, programmers cannot directly know the call cost of an RPC interface.

RPC encourages programmers to treat cross-machine calls as cost-free.

Simply put, RPC is heavily influenced by the human factor. RPCs look exactly the same as local functions, so careless programmers can scatter them throughout the system.

If this only happens in upper-level logic, that is fine, because a sandbox mechanism prevents cascading effects. However, while reviewing the framework recently, I found that the previous maintainer had routed many of the main flows through RPC at the framework level, which thoroughly confuses whoever comes later.

RPC was meant to ease the programmer’s coding burden, but used this way it completely defeats its design purpose.

Yunfeng wrote a blog post, “The Evil of RPC”, in which he gives an example of RPC abuse: the system library’s built-in sort function usually takes a comparer from the user, and some programmers make RPC calls inside that comparer, so that each comparison has to wait for the remote result of an RPC.

This is clearly wrong. The “flow” should fetch the data and the “algorithm” should process it, rather than the “algorithm” reaching back into the “flow” for data in the middle of its execution and then processing it.
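The difference can be made concrete with a small Python sketch. The fetch_score function is a hypothetical stand-in for an RPC and simply counts how often it is invoked; the player names and scores are made up.

```python
import functools

remote_calls = 0
SCORES = {"alice": 30, "bob": 10, "carol": 20}  # pretend this lives on another server


def fetch_score(player: str) -> int:
    """Stand-in for an RPC; each call would be a network round trip."""
    global remote_calls
    remote_calls += 1
    return SCORES[player]


def compare_remote(a: str, b: str) -> int:
    # Anti-pattern: the comparer itself triggers remote calls.
    return fetch_score(a) - fetch_score(b)


players = ["alice", "bob", "carol"]

# The algorithm pulls data while it runs: two remote calls per comparison.
sorted_bad = sorted(players, key=functools.cmp_to_key(compare_remote))
calls_bad = remote_calls

# The flow fetches the data once, then the algorithm only processes it.
remote_calls = 0
scores = {p: fetch_score(p) for p in players}
sorted_good = sorted(players, key=scores.__getitem__)
calls_good = remote_calls
```

Both versions produce the same order, but the second makes exactly one remote call per element, and the number of round trips is visible at the call site instead of being hidden inside the sort.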

However, setting aside the human factor, the benefits of RPC are fairly obvious, especially for game projects and application back ends, where service definitions and flows are under the team’s control. Even if the problems above do occur, they can either be located quickly or do not have a disastrous effect on the project.

Whether to use RPC in a project is a matter of opinion. The rest of this article looks at what RPC frameworks need to solve and how to design them.

What problems does the RPC framework need to solve?

Let’s look at the three points mentioned at the beginning of the article:

Distributed environment.

It is written just like calling a local function.

The programmer does not have to worry about the details of the interaction with the remote.

So let’s expand it out one by one.

First, the distributed environment. The RPC framework needs to convert method calls into data on the programmer’s behalf and send that data out through the network library; the network library on the receiving side hands the data to the RPC framework there, which converts it back into method callbacks for the programmer.

The process is very simple. For the network library, see the first article in the server series, “Handwritten server frame from zero”. Converting between method calls and data is even easier: find a way to automate the packing and unpacking logic that used to be written by hand, and then serialize the result.

There are usually two kinds of serialization schemes. One is “self-describing” in the sense that the data carries field or type information, as with third-party libraries such as Protobuf and MsgPack. The other is a pure data stream whose layout both sides must already know, and it is usually maintained by the project team itself.

Of course, a complete RPC framework should be able to replace serialization schemes transparently.
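One way to picture swappable serialization is a small codec interface with two implementations, sketched below in Python. The Codec interface, both class names, and the byte layout are assumptions for illustration; JSON stands in for a self-describing format, and a fixed struct layout stands in for a hand-maintained pure data stream.

```python
import json
import struct
from typing import Protocol


class Codec(Protocol):
    """What the RPC layer needs from any serialization scheme (assumed interface)."""

    def encode(self, method_id: int, args: tuple[int, ...]) -> bytes: ...
    def decode(self, data: bytes) -> tuple[int, tuple[int, ...]]: ...


class JsonCodec:
    """Self-describing: field names travel with the data."""

    def encode(self, method_id, args):
        return json.dumps({"method": method_id, "args": list(args)}).encode("utf-8")

    def decode(self, data):
        obj = json.loads(data)
        return obj["method"], tuple(obj["args"])


class RawCodec:
    """Pure data stream: uint16 method id, uint8 arg count, then int32 args."""

    def encode(self, method_id, args):
        return struct.pack(f"<HB{len(args)}i", method_id, len(args), *args)

    def decode(self, data):
        method_id, count = struct.unpack_from("<HB", data, 0)
        args = struct.unpack_from(f"<{count}i", data, 3)
        return method_id, tuple(args)


def round_trip(codec: Codec, method_id: int, args: tuple[int, ...]):
    """The RPC layer only talks to the Codec interface, never to a concrete scheme."""
    return codec.decode(codec.encode(method_id, args))
```

The raw layout is far more compact, but both sides must agree on it in advance; the JSON form can be inspected without any schema.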

Second, it’s written just like calling a local function. This determines the quality of RPC framework design.

The core design intent of an RPC framework is to let application-layer programmers make calls naturally, without much mental overhead. World of Warcraft and many NetEase games use BigWorld-like server frameworks, whose RPC even hides from application programmers the details of a client session switching between processes.

Local functions can be roughly divided into two categories: synchronous and asynchronous.

For asynchronous functions, almost any language and platform can make RPCs and local functions look the same. For example, from the two calls below you cannot tell which one sends network packets.

remoteServiceDelegate.PostMessage(msg);
localServiceDelegate.PostMessage(msg);

Synchronous functions are trickier to implement, and on some languages or platforms impossible.

Let’s look at a call to a local synchronous function:

var ret = localServiceDelegate.SendMessage(msg);

Now, let’s change SendMessage to an asynchronous RPC call that requires network communication.

To stay semantically consistent with the local version, the language or platform needs to be able to save the execution context, wait until the remote result of SendMessage comes back, then restore the context and assign the result to ret.

In languages/platforms that support asynchronous syntax, this semantics can be natively supported, written like this:

var ret = await localServiceDelegate.SendMessage(msg); // with async/await support
var ret = yield localServiceDelegate.SendMessage(msg); // with yield support
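To show what the runtime does behind such an await, here is a minimal sketch using Python's asyncio. The RemoteServiceDelegate class is hypothetical, and the network is faked with a timer that delivers an upper-cased copy of the message as the reply.

```python
import asyncio
import itertools


class RemoteServiceDelegate:
    """Hypothetical delegate: send_message resumes the caller when the reply arrives."""

    def __init__(self):
        self._session_ids = itertools.count(1)
        self._pending: dict[int, asyncio.Future] = {}

    async def send_message(self, msg: str) -> str:
        loop = asyncio.get_running_loop()
        session_id = next(self._session_ids)
        future = loop.create_future()
        self._pending[session_id] = future
        # Pretend the request goes out and the reply comes back a little later.
        loop.call_later(0.01, self._on_reply, session_id, msg.upper())
        return await future  # the coroutine is suspended here; its locals are preserved

    def _on_reply(self, session_id: int, result: str) -> None:
        # Called by the fake network when the reply for session_id arrives.
        self._pending.pop(session_id).set_result(result)


async def main() -> str:
    delegate = RemoteServiceDelegate()
    prefix = "reply: "  # local state that must survive the wait
    ret = await delegate.send_message("hello")
    return prefix + ret


print(asyncio.run(main()))  # reply: HELLO
```

The local variable prefix is still available after the await, which is exactly the "same execution context" guarantee the call site relies on.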

If asynchronous syntax is not supported, but closures are supported, you can also write:

localServiceDelegate.SendMessage(msg, (ret) =>
{
});
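A Python sketch of the closure style follows. CallbackDelegate and handle_login are made-up names, and the reply is delivered by calling on_reply by hand, where a real framework would do so when the network response arrives.

```python
import itertools
from typing import Callable


class CallbackDelegate:
    """Hypothetical delegate that remembers one closure per outstanding call."""

    def __init__(self):
        self._session_ids = itertools.count(1)
        self._callbacks: dict[int, Callable[[str], None]] = {}

    def send_message(self, msg: str, on_reply: Callable[[str], None]) -> int:
        session_id = next(self._session_ids)
        self._callbacks[session_id] = on_reply
        # A real implementation would serialize msg and hand it to the network here.
        return session_id

    def on_reply(self, session_id: int, result: str) -> None:
        # Look up the closure registered for this call and run it.
        self._callbacks.pop(session_id)(result)


def handle_login(delegate: CallbackDelegate, user: str, log: list) -> int:
    attempt = 1  # local state captured by the closure below

    def on_reply(ret: str) -> None:
        log.append(f"{user} attempt {attempt}: {ret}")

    return delegate.send_message("login:" + user, on_reply)


delegate = CallbackDelegate()
log = []
session_id = handle_login(delegate, "alice", log)
delegate.on_reply(session_id, "ok")  # simulate the reply arriving later
print(log)  # ['alice attempt 1: ok']
```

The closure carries user and attempt along with it, so nothing has to be saved by hand between request and reply.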

If closures aren’t supported, you’re in trouble.

Both of the approaches above, asynchronous syntax and closures, stay close to a local function call because they guarantee that when the asynchronous result arrives, the execution context is the same as when the request was sent.

Without closures, preserving that context means defining an execution-context structure yourself: store the relevant state when you make the request, retrieve it when the response arrives, and then invoke the registered callback function.

So for application-layer code at least, I think a language without native support for closures is too far removed from modern programming.
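The manual version might look like the Python sketch below, written deliberately without closures. LoginContext, the global tables, and the function names are all assumptions chosen for illustration.

```python
import itertools
from dataclasses import dataclass


@dataclass
class LoginContext:
    """Everything the reply handler needs, saved explicitly at request time."""
    user: str
    attempt: int


pending_contexts: dict[int, LoginContext] = {}
session_ids = itertools.count(1)
log: list[str] = []


def send_login_request(user: str, attempt: int) -> int:
    session_id = next(session_ids)
    pending_contexts[session_id] = LoginContext(user, attempt)  # store the context
    # A real implementation would send the request tagged with session_id here.
    return session_id


def on_login_reply(session_id: int, result: str) -> None:
    """Registered callback: a plain top-level function with no captured variables."""
    ctx = pending_contexts.pop(session_id)  # retrieve the context
    log.append(f"{ctx.user} attempt {ctx.attempt}: {result}")


sid = send_login_request("alice", 1)
on_login_reply(sid, "ok")  # simulate the reply arriving later
print(log)  # ['alice attempt 1: ok']
```

Every new kind of request needs its own context structure and its own bookkeeping, which is the repetitive work that closures or async syntax remove.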

Third, programmers don’t need to focus on the details of remote interactions.

What are interaction details?

Recall that Novella introduced “gateways”, “message queues”, “data services”, and “distributed consistency facilities” in his previous server-side articles.

These facilities are fundamentally different from “external facilities” (such as third-party SDKs): each of them focuses on modeling and solving a specific class of problems, hence Novella’s term “infrastructure abstraction”.

Dealing with these infrastructure abstractions requires attention to the details of how they interact. For example, you need to specify a multicast group ID when dealing with a gateway, a channel when dealing with a message queue, and much more when dealing with a data service.

If all of these details are exposed to the application layer, the burden on application-layer programmers grows considerably. RPC frameworks should therefore address this as well, by providing an additional middle layer that gives programmers one unified way to understand these facilities.

We use the concept of a “service” as that unified model.

For a given service, the caller uses the service’s Delegate to initiate an RPC that the service provides. The receiver has a corresponding Implementation (Impl) of the service, and the RPC framework calls back the corresponding method of that Impl.

Each infrastructure abstraction has its own protocol at the facility layer, so the RPC framework needs a custom Adaptor for each protocol. The Adaptor concept is defined for the RPC layer: to an Implementation, an Adaptor is a stream that keeps producing messages; to a Delegate, an Adaptor is a sender that accepts messages.

At the same time, the application layer does not need a unified Adaptor concept, so each Adaptor can offer specialized interfaces to the application layer.

Let’s look at some implementation details.

The first is the protocol definition of the RPC layer, which is divided into two parts:

One part identifies a call session. The caller assigns a sessionId, and the implementation returns data carrying the same sessionId. With it the caller can call back previously registered closures, restore context, and manage timeouts.
The other part is used for method dispatch. Whether you use a third-party serialization library or maintain your own, you need to serialize at least the method ID, the parameters, and related information.
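The two parts can be sketched in Python as a small header plus a table of pending calls. The header layout (a 32-bit session ID followed by a 16-bit method ID) and the PendingCalls class are assumptions for illustration; time is passed in explicitly so that the timeout logic stays deterministic.

```python
import struct

# Assumed header layout: session id (uint32), method id (uint16), little-endian.
HEADER = struct.Struct("<IH")


def pack_message(session_id: int, method_id: int, body: bytes) -> bytes:
    return HEADER.pack(session_id, method_id) + body


def unpack_message(data: bytes) -> tuple[int, int, bytes]:
    session_id, method_id = HEADER.unpack_from(data)
    return session_id, method_id, data[HEADER.size:]


class PendingCalls:
    """Caller-side table: session id -> (deadline, callback)."""

    def __init__(self):
        self._next_id = 1
        self._calls = {}

    def register(self, callback, now: float, timeout: float) -> int:
        session_id = self._next_id
        self._next_id += 1
        self._calls[session_id] = (now + timeout, callback)
        return session_id

    def complete(self, session_id: int, result) -> bool:
        """Run the callback for a reply; False if the call is unknown or timed out."""
        entry = self._calls.pop(session_id, None)
        if entry is None:
            return False
        entry[1](result)
        return True

    def expire(self, now: float) -> list[int]:
        """Drop every call whose deadline has passed and return their session ids."""
        expired = [sid for sid, (deadline, _) in self._calls.items() if deadline <= now]
        for sid in expired:
            del self._calls[sid]
        return expired
```

A reply that arrives after its call has expired is simply ignored, which is one simple way to handle late responses.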

As for the overall message flow, it is:

Applications -> RPC -> Adaptor -> Infrastructure Protocols -> Adaptor -> RPC -> Applications

The relationship between the Adaptor and the RPC layer satisfies the following points:

The RPC layer is completely decoupled from the Adaptor layer. Different Adaptors need to provide the same interface to the RPC layer. Neither side cares about the other’s implementation.
A service’s Delegate can be constructed on top of different types of Adaptor, sends its messages to that Adaptor, and can customize service-specific routing rules.
The Implementation of a service can be registered on different Adaptors.
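These three points can be sketched in Python with a two-method Adaptor interface. The interface itself, InMemoryAdaptor, and the Echo service are assumptions for illustration; real adaptors would wrap a gateway or a message queue rather than an in-process list of handlers.

```python
from typing import Callable, Protocol

Handler = Callable[[bytes], None]


class Adaptor(Protocol):
    """The only surface the RPC layer sees (an assumed two-method interface)."""

    def send(self, data: bytes) -> None: ...
    def subscribe(self, handler: Handler) -> None: ...


class InMemoryAdaptor:
    """Toy adaptor standing in for a gateway, a message queue, and so on."""

    def __init__(self, name: str):
        self.name = name
        self._handlers: list[Handler] = []

    def send(self, data: bytes) -> None:
        for handler in list(self._handlers):
            handler(data)

    def subscribe(self, handler: Handler) -> None:
        self._handlers.append(handler)


class EchoImpl:
    """Service implementation; it can be registered on several adaptors."""

    def __init__(self):
        self.received: list[str] = []

    def register_on(self, adaptor: Adaptor) -> None:
        adaptor.subscribe(lambda data: self.received.append(data.decode("utf-8")))


class EchoDelegate:
    """Service delegate; built on top of whichever adaptor it is given."""

    def __init__(self, adaptor: Adaptor):
        self._adaptor = adaptor

    def post_message(self, text: str) -> None:
        self._adaptor.send(text.encode("utf-8"))


gateway = InMemoryAdaptor("gateway")
queue = InMemoryAdaptor("queue")
impl = EchoImpl()
impl.register_on(gateway)
impl.register_on(queue)
EchoDelegate(gateway).post_message("via gateway")
EchoDelegate(queue).post_message("via queue")
print(impl.received)  # ['via gateway', 'via queue']
```

Neither EchoImpl nor EchoDelegate knows which concrete adaptor it is attached to, so a new facility only requires a new adaptor class.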

I had planned to end the server-side series with this article, but the RPC part has already run long and about half of the material is still left, so I have to continue in another article.

Implementing RPC itself is very easy, as anyone who has done it knows well. If the method-dispatch part of the protocol is text-based, then in a dynamic language such as Python or Lua an RPC library can be written in about a hundred lines of code, without even needing automation tools.

What we have discussed so far is just plain remote method calls. While the protocol details of the other facilities are completely hidden from the application layer, their powerful features are also unavailable to us.

There is another abstraction, parallel to RPC, that specializes the form of RPC; together with RPC it makes up the development specification for the application layer.
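As a rough illustration of how little code text-based dispatch needs in a dynamic language, here is a Python sketch. The JSON text format, MathService, and the Proxy class are assumptions; the transport is a direct function call where a real library would use a socket.

```python
import json


class MathService:
    """Example service; any public method can be called remotely."""

    def add(self, a, b):
        return a + b

    def _secret(self):
        return "not callable remotely"


def encode_call(method: str, *args) -> str:
    # Text protocol assumed here: one JSON object per call.
    return json.dumps({"method": method, "args": args})


def dispatch(service, text: str) -> str:
    """Receiving side: look the method up by name and call it."""
    call = json.loads(text)
    name = call["method"]
    method = None if name.startswith("_") else getattr(service, name, None)
    if not callable(method):
        return json.dumps({"error": f"no such method: {name}"})
    return json.dumps({"result": method(*call["args"])})


class Proxy:
    """Turns attribute access into text messages, so calls look local."""

    def __init__(self, send):
        self._send = send  # function: request text -> reply text

    def __getattr__(self, name):
        def remote_method(*args):
            reply = json.loads(self._send(encode_call(name, *args)))
            if "error" in reply:
                raise AttributeError(reply["error"])
            return reply["result"]
        return remote_method


service = MathService()
proxy = Proxy(lambda text: dispatch(service, text))
print(proxy.add(2, 3))  # 5
```

Because method names travel as text and are resolved with getattr, no generated stubs are needed; the price is that a misspelled method name is only caught at call time.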

In the next article, we will talk about the model definition of message flow and the pattern definition of communication between nodes.

If you have any questions about this article or are interested in a new topic, please feel free to comment directly!

Links to the server series of articles, and subsequent topics (preferably read in order):

Handwritten server frame from zero

Middleware oriented development pattern

How to set up data service quickly

Server architecture for microservices

Message queue-centric server architecture

Talk about stateless services

Talk about distributed locking

Build data services based on Redis

What problem RPC solves

Talk about message flow model ()

Personal subscription number: GameDev101 “For you game Developer”, talk about server, talk about game development.
