Hola, I’m yes.

Before digging into Dubbo, it is worth working through RPC first. Understanding the principles of RPC and then studying Dubbo in depth gets twice the result with half the effort.

It is important to understand the core principles that are common to all RPC frameworks on the market.

When you look at Dubbo after you have figured out how RPC works, you get a sense of familiarity and validation.

In fact, RPC is not only used for everyday microservice calls; it shows up in many scenarios that involve network communication.

For example, message-queue clients talk to brokers over RPC, and the same goes for clients of other middleware.

You might say: really? I never saw any RPC call there.

Heh, that is exactly what RPC does: it lets you do remote communication without noticing it.

In fact, I already touched on RPC once in my earlier popular article on the difference between HTTP and RPC, but that time the focus was HTTP.

This time let's dig into RPC and understand it from the root.

Come on, get in!

Main text

RPC stands for Remote Procedure Call, as opposed to the local calls we make all the time.

Remote actually refers to the need for network communication, which can be understood as calling methods on remote machines.

You may ask: isn't an HTTP call a remote call too? Isn't that RPC?

No, the purpose of RPC is to let us call remote methods just as we call local methods.

The code makes this clear. For example, before the service is split out, the call is local:

    public String getSth(String str) {  
         return yesService.get(str);  
    }  

Once yesService is split out into its own service, it has to be called remotely. With HTTP, it might look like this:

    public String getSth(String str) {
        RequestParam param = new RequestParam();
        // ... fill in the request parameters
        return HttpClient.get(url, param /* , ... */);
    }

In this case you have to care about the address of the remote service, assemble the request, and so on. With an RPC call, it is:

    public String getSth(String str) {
        // Looks the same as before, right?
        // The implementation has moved to another service; only the interface remains here.
        // You call it without noticing that anything changed.
        return yesService.get(str);
    }

So RPC exists to hide the network details of a remote call, so that a remote call is used just like a local one and development is more efficient.

Now that you know what RPC does, let’s take a look at the steps that RPC calls go through.

Basic flow of RPC calls

In the example above, the implementation of yesService was moved to a remote service; locally there is no implementation, only an interface.

Now we need to call yesService.get(str). What should we do?

All we need to do is inform the remote service via network communication of the parameters passed in and the fully qualified name of the interface called.

The remote service then receives the parameters and fully qualified name of the interface to select the specific implementation and make the call.

The business is processed and then returns the results over the network, and you’re done!

All of the above is triggered by yesService.get(str).

But yesService is an interface with no implementation, so where do these operations come from?

From a dynamic proxy.

The RPC framework generates a proxy class for the interface, so when we call the interface we are actually calling the dynamically generated proxy, which triggers the remote call. That is why we never have to know how the remote call is made.
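To make the proxy idea concrete, here is a minimal sketch using the JDK's built-in dynamic proxy (not Javassist, and not Dubbo's actual code). `YesService`, `RpcProxyFactory`, and `fakeRemoteCall` are hypothetical names; the "remote call" is faked locally so the example runs on its own.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;

// Hypothetical interface: the caller side has only this, no implementation.
interface YesService {
    String get(String str);
}

class RpcProxyFactory {
    // Builds a JDK dynamic proxy; every call on the returned object goes to `handler`.
    static <T> T create(Class<T> iface, InvocationHandler handler) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    // Stand-in for the remote call (assumption): a real framework would
    // serialize the request, send it over the network, and await the response.
    static Object fakeRemoteCall(String interfaceName, String method, Object[] args) {
        return interfaceName + "#" + method + Arrays.toString(args);
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        YesService yesService = RpcProxyFactory.create(YesService.class,
                (proxy, method, callArgs) -> RpcProxyFactory.fakeRemoteCall(
                        method.getDeclaringClass().getName(), method.getName(), callArgs));
        // Looks like a local call, but the handler decides what actually happens.
        System.out.println(yesService.get("hi"));
    }
}
```

The caller only ever sees `yesService.get("hi")`; the interface name, method name, and arguments are collected inside the handler, which is exactly the information a remote side would need.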

Everyone should be familiar with dynamic proxies. The most common example is Spring AOP, which involves JDK dynamic proxies and CGLIB.

Dubbo uses Javassist. As for why, Liang Fei wrote a blog post about it.

He compared the JDK's built-in proxy, ASM, CGLIB (a wrapper around ASM), and Javassist.

After testing, Javassist was selected.

Liang Fei: The final decision was to generate the proxy with Javassist bytecode generation. ASM is faster, but not by an order of magnitude, and Javassist's way of generating bytecode is more convenient than ASM's: Javassist only requires concatenating Java source code as strings to generate the bytecode, whereas ASM requires writing the bytecode by hand.

As you can see, performance is one thing when choosing a framework, and ease of use is also critical.

Back to RPC.

We now know that dynamic proxies hide the details of an RPC call so that users can call remote services without noticing. So what are those details?

Serialization

Our request parameters are objects, sometimes DTOs we defined and sometimes Maps, and objects cannot be sent over the network directly.

You can think of an object as “three-dimensional”, while the data sent over the network is “flat”: in the end the object has to be converted into flat binary data to travel over the network.

Think about it: the parts of an object are allocated in different places in memory and tied together by references. Doesn't that look three-dimensional?

In the end everything is sent to the other side as a sequence of 0s and 1s. Doesn't that sequence look very flat?

The process of converting an object to binary data is called serialization, and the process of converting binary data to an object is called deserialization.
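As a small illustration of that round trip, the sketch below uses the JDK's built-in Java serialization, chosen only because it needs no extra dependency; it is not a statement about which format any particular RPC framework uses. `GetRequest` and `JdkSerializer` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical request DTO.
class GetRequest implements Serializable {
    private static final long serialVersionUID = 1L;
    final String str;

    GetRequest(String str) {
        this.str = str;
    }
}

class JdkSerializer {
    // Object -> bytes ("three-dimensional" to "flat").
    static byte[] serialize(Object obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Bytes -> object ("flat" back to "three-dimensional").
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

class SerializationDemo {
    public static void main(String[] args) {
        byte[] wire = JdkSerializer.serialize(new GetRequest("hello"));
        GetRequest back = (GetRequest) JdkSerializer.deserialize(wire);
        System.out.println(wire.length + " bytes on the wire, str=" + back.str);
    }
}
```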

Of course, how you choose the serialization format is also important.

For example, binary serialization formats are more compact, while text formats such as JSON are more readable and easier to troubleshoot.

There are many serialization options, and choosing one generally means weighing generality, performance, readability, and compatibility together.

I won’t analyze it in this article, but I’ll write another article on various serialization protocols.

RPC protocol

As mentioned earlier, only binary data can travel over the network, and at the lower layer the bytes of different requests simply follow one another; that layer does not care which bytes belong to which request.

But the receiver needs to know; otherwise the binary data cannot be restored into the corresponding requests.

So you need to define a protocol, you need to define some specifications, you need to define some boundaries so that binary data can be restored.

For example, the same string of digits gives different results depending on how the digits are grouped.

So the protocol really defines how the binary data is constructed and parsed.
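A tiny sketch of why boundaries matter: the same four bytes mean different things depending on how the reader groups them. `BoundaryDemo` is a hypothetical name, and the byte values are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class BoundaryDemo {
    // Reads all four bytes as a single big-endian 32-bit integer.
    static int asOneInt(byte[] data) {
        return ByteBuffer.wrap(data).getInt();
    }

    // Reads the same four bytes as two big-endian 16-bit integers.
    static short[] asTwoShorts(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        return new short[] {buf.getShort(), buf.getShort()};
    }

    public static void main(String[] args) {
        byte[] data = {0, 1, 0, 2};
        System.out.println(asOneInt(data));                      // prints 65538
        System.out.println(Arrays.toString(asTwoShorts(data)));  // prints [1, 2]
    }
}
```

Without an agreed protocol, the receiver has no way to know which of the two readings the sender intended.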

Our parameters are certainly more complex than that: their length is variable, and the protocol often grows with upgrades, since new features sometimes need to be added and the protocol has to change.

Generally, an RPC protocol consists of a protocol header plus a protocol body.

The protocol header holds metadata, including the magic number, protocol version, message type, serialization method, overall length, header length, extension fields, and so on.

The protocol body holds the request data.

The magic number tells us whether this is the protocol we agreed on. For example, if the magic number is fixed at 233, a message that starts with 233 is one of ours.

The protocol version is there for later protocol upgrades.

From the overall length and the header length we know exactly how many bytes the request has, how many of the leading bytes are the header, and that the rest is the protocol body, so we can tell them apart. The extension fields are reserved for later extension.
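The header-plus-body idea can be sketched as follows. The layout here (1-byte magic, version, message type, and serializer id, then a 2-byte header length and a 4-byte total length) is an assumption made up for this example and is not the Dubbo wire format; only the magic value 233 comes from the text above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class Frame {
    static final int MAGIC = 233;        // magic value from the example in the text
    static final int HEADER_LENGTH = 10; // assumed layout: 1+1+1+1+2+4 bytes

    final byte version;
    final byte type;
    final byte serializer;
    final byte[] body;

    Frame(byte version, byte type, byte serializer, byte[] body) {
        this.version = version;
        this.type = type;
        this.serializer = serializer;
        this.body = body;
    }

    // Header first, then the body.
    byte[] encode() {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LENGTH + body.length);
        buf.put((byte) MAGIC);
        buf.put(version);
        buf.put(type);
        buf.put(serializer);
        buf.putShort((short) HEADER_LENGTH);
        buf.putInt(HEADER_LENGTH + body.length);
        buf.put(body);
        return buf.array();
    }

    static Frame decode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        if ((buf.get() & 0xFF) != MAGIC) {
            throw new IllegalArgumentException("not our protocol");
        }
        byte version = buf.get();
        byte type = buf.get();
        byte serializer = buf.get();
        int headerLength = buf.getShort() & 0xFFFF;
        int totalLength = buf.getInt();
        // Jump to the end of the header, skipping any extension bytes
        // that a newer protocol version might have added.
        buf.position(headerLength);
        byte[] body = new byte[totalLength - headerLength];
        buf.get(body);
        return new Frame(version, type, serializer, body);
    }
}

class FrameDemo {
    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] wire = new Frame((byte) 1, (byte) 0, (byte) 2, payload).encode();
        Frame back = Frame.decode(wire);
        System.out.println(new String(back.body, StandardCharsets.UTF_8));
    }
}
```

Carrying the header length separately from the total length is what lets an older reader skip header fields it does not know about and still find the body.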

Take the Dubbo protocol as an example: its header contains the magic number, request ID, data length, and so on.

Network transmission

Once the data is assembled it is ready to be sent, and that is where the network comes in.

Network communication is inseparable from the network IO model.

Network IO models fall into four categories. I will write a separate article on them later and will not expand on them here.

In general we use IO multiplexing, because most RPC scenarios involve highly concurrent calls, and IO multiplexing can handle many requests with few threads.
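A minimal sketch of IO multiplexing with Java NIO: one thread and one `Selector` watch several channels and read from whichever is ready. To keep the example self-contained it uses in-process `Pipe` channels instead of sockets, which is an assumption for illustration; a real RPC framework would register socket channels (or let a library such as Netty do so).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class MultiplexDemo {
    // A single thread receives one short message from each of `connections` channels.
    static List<String> readAll(int connections) {
        List<Pipe> pipes = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            for (int i = 0; i < connections; i++) {
                Pipe pipe = Pipe.open();
                pipes.add(pipe);
                pipe.source().configureBlocking(false);
                pipe.source().register(selector, SelectionKey.OP_READ);
                // Simulate a client sending a request on this "connection".
                pipe.sink().write(ByteBuffer.wrap(("req-" + i).getBytes(StandardCharsets.UTF_8)));
            }
            List<String> received = new ArrayList<>();
            while (received.size() < connections) {
                selector.select(); // blocks until at least one channel is readable
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    ByteBuffer buf = ByteBuffer.allocate(64);
                    int n = ((ReadableByteChannel) key.channel()).read(buf);
                    if (n > 0) {
                        received.add(new String(buf.array(), 0, n, StandardCharsets.UTF_8));
                    }
                }
            }
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (Pipe pipe : pipes) {
                try {
                    pipe.sink().close();
                    pipe.source().close();
                } catch (IOException ignored) {
                    // best-effort cleanup in a demo
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(readAll(3)); // arrival order is not guaranteed
    }
}
```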

Generally, RPC frameworks do not reinvent the wheel and use an existing library as the underlying communication framework.

In Java, for example, that is usually Netty: it is well packaged and heavily optimized, so it is both convenient and efficient to use.

Summary

The basic flow of RPC communication has been covered, as shown below:

I did not draw the response path, but it is simply the reverse.

Let me sum it up once more in words:

The service caller programs against an interface, and a dynamic proxy hides the low-level call details: it combines the request parameters, the interface, and other data, converts them into binary data through serialization, wraps them according to the RPC protocol, and sends them to the service provider over the network.

The service provider parses the request data according to the agreed protocol, deserializes the parameters, finds the specific interface to call, performs the specific implementation, and returns the result.

There are many more details.

For example, requests are asynchronous, so each request carries a unique ID and the response comes back with the same ID; the caller can then find the matching request by ID and fill in its result.
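The ID-matching step could look roughly like this: the sender parks a future under a fresh ID, and whoever reads responses off the network completes the future with the same ID. `PendingRequests` is a hypothetical name, and timeouts and error responses are left out of this sketch.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class PendingRequests {
    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    // Sender side: allocate a unique ID and remember the future waiting on it.
    long register(CompletableFuture<Object> future) {
        long id = nextId.incrementAndGet();
        pending.put(id, future);
        return id;
    }

    // Receiver side: a response carrying `id` arrived, hand the result to its waiter.
    // Returns false if no request with that ID is waiting.
    boolean complete(long id, Object result) {
        CompletableFuture<Object> future = pending.remove(id);
        return future != null && future.complete(result);
    }

    int size() {
        return pending.size();
    }
}

class PendingRequestsDemo {
    public static void main(String[] args) {
        PendingRequests requests = new PendingRequests();
        CompletableFuture<Object> first = new CompletableFuture<>();
        CompletableFuture<Object> second = new CompletableFuture<>();
        long firstId = requests.register(first);
        long secondId = requests.register(second);

        // Responses may come back in a different order than the requests went out.
        requests.complete(secondId, "result of second");
        requests.complete(firstId, "result of first");

        System.out.println(first.join() + ", " + second.join());
    }
}
```

Because the caller is not blocked on the connection between sending and receiving, many requests can share one connection at the same time.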

Some people ask why asynchronous. It is to improve throughput.

Of course there are many more details, which I will cover later when analyzing Dubbo; they sink in deeper when combined with real middleware.

Truly industrial-grade RPC

What we covered above is only the basic flow of RPC, which is not enough for industrial-grade use.

In production, service providers are deployed as clusters, so there are multiple providers, and machines are added and removed dynamically as traffic rises and falls.

Hence the need for a registry to provide service discovery.

Through the registry the caller obtains metadata such as the IP addresses of the service providers and then makes the call.

Through the registry the caller also learns when a service provider goes offline.
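Stripped to its core, a registry is a map from service name to the addresses currently providing it. The in-memory sketch below is only meant to show that shape; `InMemoryRegistry` and the addresses are hypothetical, and a real registry is a separate networked component that also pushes changes to callers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArraySet;

class InMemoryRegistry {
    private final Map<String, Set<String>> providers = new ConcurrentHashMap<>();

    // A provider comes online and announces its address.
    void register(String service, String address) {
        providers.computeIfAbsent(service, k -> new CopyOnWriteArraySet<>()).add(address);
    }

    // A provider goes offline.
    void unregister(String service, String address) {
        Set<String> addresses = providers.get(service);
        if (addresses != null) {
            addresses.remove(address);
        }
    }

    // A caller asks where the service currently lives.
    List<String> lookup(String service) {
        return new ArrayList<>(providers.getOrDefault(service, Collections.emptySet()));
    }
}

class RegistryDemo {
    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        registry.register("YesService", "10.0.0.1:20880");
        registry.register("YesService", "10.0.0.2:20880");
        registry.unregister("YesService", "10.0.0.1:20880");
        System.out.println(registry.lookup("YesService")); // prints [10.0.0.2:20880]
    }
}
```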

A routing and grouping policy is also needed. The caller selects matching service providers based on the routing information pushed to it, which enables features such as group invocation, canary (grayscale) release, and traffic isolation.

A load balancing policy is needed as well. After route filtering there are usually still several service providers to choose from, and load balancing spreads traffic across them.
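Routing followed by load balancing can be sketched as "filter, then pick". The example below filters providers by a group tag and then picks one in round-robin order. The class names and the `group` tag are hypothetical, and round-robin is just one simple strategy used for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class Provider {
    final String address;
    final String group; // hypothetical routing tag, e.g. "gray" or "stable"

    Provider(String address, String group) {
        this.address = address;
        this.group = group;
    }
}

class GroupRouter {
    // Step 1, routing: keep only the providers that belong to the requested group.
    static List<Provider> route(List<Provider> all, String group) {
        List<Provider> matched = new ArrayList<>();
        for (Provider p : all) {
            if (p.group.equals(group)) {
                matched.add(p);
            }
        }
        return matched;
    }
}

class RoundRobinBalancer {
    private final AtomicInteger counter = new AtomicInteger();

    // Step 2, load balancing: take the candidates in turn.
    Provider select(List<Provider> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalStateException("no provider available");
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        return candidates.get(Math.floorMod(counter.getAndIncrement(), candidates.size()));
    }
}

class RoutingDemo {
    public static void main(String[] args) {
        List<Provider> all = Arrays.asList(
                new Provider("10.0.0.1", "stable"),
                new Provider("10.0.0.2", "gray"),
                new Provider("10.0.0.3", "stable"));
        List<Provider> stable = GroupRouter.route(all, "stable");
        RoundRobinBalancer balancer = new RoundRobinBalancer();
        for (int i = 0; i < 3; i++) {
            System.out.println(balancer.select(stable).address); // 10.0.0.1, 10.0.0.3, 10.0.0.1
        }
    }
}
```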

Retrying on failure is needed too. After all, the network is unreliable and a given service provider may occasionally have problems, so retrying a failed call reduces the impact on the business.
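A retry wrapper can be as small as a loop. In this sketch `Retry.call` is a hypothetical helper and the `Supplier` stands in for the remote call; a real framework would typically retry against a different provider, and retrying is only safe when the call is idempotent.

```java
import java.util.function.Supplier;

class Retry {
    // Runs `remoteCall` up to `maxAttempts` times and returns the first success.
    // If every attempt fails, the last failure is rethrown.
    static <T> T call(Supplier<T> remoteCall, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return remoteCall.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}

class RetryDemo {
    public static void main(String[] args) {
        int[] attempts = {0};
        String result = Retry.call(() -> {
            if (++attempts[0] < 3) {
                throw new IllegalStateException("network glitch");
            }
            return "ok";
        }, 3);
        System.out.println(result + " after " + attempts[0] + " attempts"); // ok after 3 attempts
    }
}
```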

Rate limiting is also required, because a service provider does not know how many callers it will have or how heavily each will call it. It therefore has to gauge its own capacity and limit traffic to keep the service from crashing.
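One of the simplest ways to limit traffic is a fixed-window counter: allow at most N calls per time window and reject the rest. The sketch below is a hypothetical `FixedWindowLimiter` with an injectable clock so its behavior is easy to follow; production limiters usually use smoother algorithms such as sliding windows or token buckets.

```java
import java.util.function.LongSupplier;

class FixedWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final LongSupplier clock; // returns the current time in milliseconds
    private long windowStart;
    private int count;

    FixedWindowLimiter(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    // Returns true if the call may proceed, false if it should be rejected.
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // a new window begins, reset the counter
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}

class LimiterDemo {
    public static void main(String[] args) {
        // At most 2 calls per second, measured with the system clock.
        FixedWindowLimiter limiter = new FixedWindowLimiter(2, 1000, System::currentTimeMillis);
        for (int i = 0; i < 3; i++) {
            System.out.println("call " + i + " allowed: " + limiter.tryAcquire());
        }
    }
}
```

Limiting per caller instead of in total would amount to keeping one such limiter per caller identity.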

Circuit breaking keeps a downstream failure from causing your own service's calls to time out, block, pile up, and finally crash. This matters most when the call chain is long.

For example, with A=>B=>C=>D=>E, if E fails, then A, B, C, and D all sit there waiting, their resources slowly fill up, and they collapse one after another.
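A circuit breaker stops that waiting by failing fast. The sketch below is a deliberately small, hypothetical `CircuitBreaker`: after a number of consecutive failures it returns a fallback immediately for a cool-down period, then lets calls through again to probe whether the downstream service has recovered. The threshold, cool-down, and fallback are assumptions for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class CircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // returns the current time in milliseconds
    private int consecutiveFailures;
    private long openedAt;

    CircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    <T> T call(Supplier<T> remoteCall, T fallback) {
        if (!allowRequest()) {
            return fallback; // breaker is open: fail fast, do not touch the downstream service
        }
        try {
            T result = remoteCall.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback;
        }
    }

    private synchronized boolean allowRequest() {
        boolean open = consecutiveFailures >= failureThreshold;
        // Once the cool-down has passed, let calls through as a trial.
        return !open || clock.getAsLong() - openedAt >= openMillis;
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0; // closes the breaker
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = clock.getAsLong(); // open, or re-open after a failed trial
        }
    }
}

class CircuitBreakerDemo {
    public static void main(String[] args) {
        CircuitBreaker breaker = new CircuitBreaker(2, 1000, System::currentTimeMillis);
        for (int i = 0; i < 3; i++) {
            String result = breaker.call(() -> {
                throw new IllegalStateException("E is down");
            }, "fallback");
            System.out.println(result);
        }
    }
}
```

In the demo the third call never reaches the failing service; it gets the fallback immediately because the breaker is already open.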

That is basically it, though each point can be refined further: the various load balancing strategies, whether to limit total traffic or limit per caller, adaptive rate limiting, and so on.

These will come up later when we analyze Dubbo. Stay tuned.

Finally

I once interviewed a candidate with two years of experience. His resume said he was familiar with Spring Cloud Alibaba and also knew Dubbo.

I asked him how an RPC call works, and he asked me what RPC was. He had never heard the term.

That is far too superficial.

Understanding the principle is what matters. The dynamic proxy I described above is not strictly required, either. C++, for example, has no dynamic proxies, and the gRPC framework uses code generation instead.

In the end, as long as the call details are hidden so the user does not have to care, how you achieve it does not matter.

I also said above that RPC is interface-oriented, but sometimes there is no interface. Take a service gateway that exposes HTTP endpoints to callers and forwards them to back-end RPC services.

The gateway fronts many back-end services, so it cannot depend on all of their interfaces; that would make it inflexible.

Hence the notion of a generic (generalized) call.

Once you understand that the caller merely tells the service provider which method it wants to call and with what arguments, generic calls are easy to understand.
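On the provider side, that idea boils down to looking up a method by name and parameter types and invoking it reflectively. The sketch below is a hypothetical `GenericInvoker`, not Dubbo's generic-call API; the service name `demo.YesService` and the class `YesServiceImpl` are made up for the example.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical implementation living on the provider side.
class YesServiceImpl {
    public String get(String str) {
        return "got " + str;
    }
}

class GenericInvoker {
    private final Map<String, Object> services = new ConcurrentHashMap<>();

    // The provider exposes an implementation under a service name.
    void export(String serviceName, Object impl) {
        services.put(serviceName, impl);
    }

    // The caller only supplies strings and arguments; it needs no compiled interface.
    Object invoke(String serviceName, String methodName, String[] paramTypes, Object[] args) {
        Object target = services.get(serviceName);
        if (target == null) {
            throw new IllegalArgumentException("no such service: " + serviceName);
        }
        try {
            Class<?>[] types = new Class<?>[paramTypes.length];
            for (int i = 0; i < paramTypes.length; i++) {
                // Note: Class.forName does not resolve primitive names such as "int".
                types[i] = Class.forName(paramTypes[i]);
            }
            Method method = target.getClass().getMethod(methodName, types);
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("generic call failed", e);
        }
    }
}

class GenericCallDemo {
    public static void main(String[] args) {
        GenericInvoker invoker = new GenericInvoker();
        invoker.export("demo.YesService", new YesServiceImpl());
        Object result = invoker.invoke("demo.YesService", "get",
                new String[] {"java.lang.String"}, new Object[] {"hi"});
        System.out.println(result); // prints got hi
    }
}
```

A gateway could build the four arguments of `invoke` straight from an incoming HTTP request, which is why it does not need to depend on any back-end interface.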

So you do not need an interface to make an RPC call.

We will look at generic calls in detail when we get to Dubbo.

RPC sounds fancy, but once you understand its essence, that is all there is to it.

However much the form changes, the essence stays the same.

Search WeChat for [yes training guide] and follow yes; reply [123] to get a set of algorithm practice notes of about 200,000 words. Personal article index: github.com/yessimida/y… Stars are welcome!


I am yes, going from a little bit to a hundred million little bits. You are welcome to like, share, and leave a comment. See you next time.