Like it before you read it, and make it a habit. Search "Santaizi Aobing" on WeChat to follow this programmer who likes writing with feeling.
This article has been collected on GitHub at github.com/JavaFamily, where you'll also find complete interview guides for top-tier companies, study materials, and my series of articles.
Preface
I have already walked you through the two processes of service exposure and service referencing, both of which exist to support service invocation. Today I'll take you through Dubbo's service invocation process itself.
Once you've seen today's invocation process, Dubbo's core flow will basically be connected end to end. You should then have a picture of how Dubbo operates as a whole, the knowledge framework will be in place, and your understanding of RPC will go a step further.
Without further ado, let’s cut to the chase!
Think about the process first
Before analyzing Dubbo’s service invocation process, let’s think about what steps a call process would take if we implemented it ourselves.
First we know the address of the remote service, then all we need to do is tell the remote service the details of the method we want to call, and let the remote service parse this information.
The remote service then finds the corresponding implementation class based on this information, invokes it, and sends the result back along the same path; the client parses the response and returns it to the caller.
The specific information a call carries
What information should the client tell the server?
First of all, the client must tell the server which interface it wants to call. It also needs the method name, the method's parameter types and parameter values, and, since a service may have multiple versions, the version number as well.
With just these few parameters, the server knows exactly which method the client is calling and can invoke it precisely.
Then the server assembles the response and returns it. Here is an example of an actual call request object.
The data part is the information I just described; the rest is framework-level information such as the protocol version and the call mode, which will be analyzed below.
The overall idea is clear: this is an ordinary remote call. Send over the request parameters, let the server parse them, find the corresponding implementation, invoke it, and return the result.
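To make the shape of that information concrete, here is a rough sketch of what such a request object carries; the field names are illustrative (Dubbo's actual carrier is an RpcInvocation plus attachments), not a real Dubbo class:

```java
import java.io.Serializable;
import java.util.Arrays;

// A sketch of the information one call has to ship over the wire. Field names are
// illustrative; Dubbo's actual carrier is RpcInvocation plus attachments such as
// the interface name, version and group.
public class CallDescriptor implements Serializable {
    String interfaceName;        // which service interface, e.g. "com.xxx.DemoService"
    String version;              // which version of that service
    String methodName;           // which method to call
    Class<?>[] parameterTypes;   // needed to pick the right overload on the server side
    Object[] arguments;          // the actual parameter values

    @Override
    public String toString() {
        return interfaceName + ":" + version + "#" + methodName + Arrays.toString(parameterTypes);
    }
}
```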
The call flow in practice
The above is the call flow as we imagine it; the real one is not so simple.
First of all, a remote call needs a defined protocol, that is, an agreement on what language the two sides will speak, so that both can understand it.
For example, I can speak English and Chinese, and so can you. We need to agree in advance on which one to use, say Chinese, before we talk.
That works between people because your brain is smart enough to recognize which language is being spoken; a computer is not that smart.
In other words, the computer is rigid: it does exactly what our program tells it to do, nothing more.
You need a protocol
So the two parties need to define a protocol first, so that the computer can parse the correct information.
There are three common protocols
Three kinds of framing are common at the application layer: fixed length, special delimiter, and header+body.
Fixed length: every protocol unit has a fixed length, say 100 bytes, so the receiver reads 100 bytes and then parses them as one unit.
The advantage is efficiency: just read a fixed number of bytes and parse, no thinking required.
The disadvantage is rigidity: every unit must be exactly that length, nothing may exceed it, and anything shorter has to be padded. That doesn't suit RPC, where nobody knows in advance how long the parameters will be; make the fixed length long and you waste space, make it short and it won't fit.
Special delimiter: a special terminating character, such as a newline, marks the end of each protocol unit.
The advantage is that the length is unconstrained; units are simply split on the delimiter. The disadvantages are that you must keep reading until a complete unit has arrived before you can parse it, and that if the payload itself happens to contain the delimiter, parsing goes wrong.
Header+body: the header has a fixed length and records the length of the body; the body is variable-length, which makes this form much more flexible. The receiver parses the header first, reads the body length from it, and then parses the body.
The Dubbo protocol takes the header+body form, with a magic number 0xdabb marking the start of each unit; this is how it deals with TCP sticky packets.
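To make the header+body idea concrete, here is a toy length-prefixed codec. It is just the bare mechanism, not Dubbo's actual codec:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

// A toy header+body codec, not Dubbo's real one: a fixed-size header carries a magic
// number plus the body length, so the reader always knows exactly how many bytes make
// up one unit before parsing it, which is what defeats TCP sticky packets.
public class LengthPrefixedCodec {
    static final short MAGIC = (short) 0xdabb;

    static byte[] encode(byte[] body) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeShort(MAGIC);       // 2-byte magic marking the start of a unit
        out.writeInt(body.length);   // 4-byte body length
        out.write(body);             // variable-length body
        return bos.toByteArray();
    }

    static byte[] decode(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        if (din.readShort() != MAGIC) {
            throw new IOException("bad magic, stream is out of sync");
        }
        int len = din.readInt();     // the header tells us how long the body is...
        byte[] body = new byte[len];
        din.readFully(body);         // ...so we read exactly that many bytes and stop
        return body;
    }
}
```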
The Dubbo protocol
Dubbo supports many protocols, so let’s briefly analyze the Dubbo protocol.
The protocol is divided into a protocol header and a protocol body. As you can see, the 16-byte header mainly carries the magic number (0xdabb), some request settings, the length of the message body, and so on.
After 16 bytes, the protocol body includes the protocol version, interface name, interface version, method name, and so on.
The protocol really matters: you can learn a lot from it, and only once you understand its contents can you understand what the encoder and decoder are doing. Here is the explanation of the protocol from the official website.
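Roughly, the 16-byte header can be pictured like this (a sketch based on the commonly documented Dubbo 2.x wire format, so treat the exact bit layout as approximate):

```
byte offset   0        2      3        4                     12                16
             +--------+------+--------+---------------------+-----------------+
             | magic  | flag | status |     request id      |   body length   |
             | 0xdabb |      |        |      (int64)        |     (int32)     |
             +--------+------+--------+---------------------+-----------------+
flag: one bit each for request/response, two-way and event (e.g. heartbeat),
      with the low 5 bits holding the serialization id of the codec used for the body.
```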
An agreed serializer is also needed
The network transmits byte streams. Our objects are multi-dimensional while a byte stream is one-dimensional, so we need to flatten an object into a one-dimensional byte stream and transmit it to the peer.
The peer side then deserializes the byte streams into objects.
Serialization protocol
As you can see from the protocol above, Dubbo supports a wide variety of serialization formats. Instead of examining each one, let's look at the broad types of serialization.
Serialization falls into two broad categories: text-based and binary.
Text-based formats include XML and JSON. Their advantage is easy debugging: they're human-friendly, and you can see which field corresponds to which parameter.
Their disadvantage is low transmission efficiency: there's a lot of redundancy, such as JSON's braces and quotes, so transmission takes longer and uses more bandwidth.
Another big category is binary streaming, which is machine-friendly because its data is more compact, so it takes up fewer bytes and transfers faster.
The disadvantage is that debugging is harder: the bytes can't be read with the naked eye, and you need special tooling to decode them.
I won’t go into the deeper level of serialization, but there are a lot of ways to serialize, and we’ll talk about that at a later opportunity.
Dubbo uses the Hessian2 serialization protocol by default.
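To get a feel for what serialization does at the byte level, here is a minimal round trip with the standalone Hessian 2 API. This is a sketch: Dubbo actually bundles a forked "hessian-lite", but the usage is essentially the same.

```java
import com.caucho.hessian.io.Hessian2Input;
import com.caucho.hessian.io.Hessian2Output;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// A minimal serialize/deserialize round trip with the plain Caucho Hessian 2 API.
public class Hessian2Demo {

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        Hessian2Output out = new Hessian2Output(bos);
        out.writeObject(obj);     // object -> one-dimensional byte stream
        out.flush();
        return bos.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException {
        Hessian2Input in = new Hessian2Input(new ByteArrayInputStream(bytes));
        return in.readObject();   // byte stream -> object again on the peer side
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> payload = new HashMap<>();
        payload.put("hello", "world");
        byte[] bytes = serialize(payload);
        System.out.println(deserialize(bytes));   // {hello=world}
    }
}
```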
So in practice the two sides must first agree on a protocol, then pick a serialization format, construct the request, and send it.
Rough call flow chart
Let’s take a look at the picture on the official website.
In brief: the client initiates the call, which is actually handled by the proxy class. The proxy eventually calls the Client (Netty by default), which constructs the protocol header, serializes the Java objects to produce the protocol body, and then sends it all over the network.
After receiving the request, the server NettyServer distributes it to the business thread pool, which calls the specific implementation method.
But that’s not enough, because Dubbo is a production-grade RPC framework that needs to be more secure and robust.
Detailed call flow
We've already covered how the client constructs and serializes the request, so that step, along with the path of the response coming back, is omitted from the diagram to keep it focused.
As you can see, for stability at production scale there are usually multiple servers, and multiple servers mean multiple Invokers. The candidates first go through route filtering, and then the load-balancing mechanism selects one Invoker to call.
Of course, clusters also have fault tolerance mechanisms, including retry and so on.
The request first arrives at Netty's I/O threads, which handle reading and writing and, optionally, the serialization and deserialization (this can be controlled with the decode.in.io parameter). The deserialized object is then processed by the business thread pool, which calls the corresponding Invoker.
Call flow – Client source code analysis
The client-side calling code:
String hello = demoService.sayHello("world");
Calling the interface actually invokes the generated proxy class, which builds an RpcInvocation object and calls MockClusterInvoker#invoke with it.
You can see the method name, parameter types, and parameter values in the generated RpcInvocation.
Then let’s take a look at the MockClusterInvoker#invoke code.
The this.invoker.invoke call actually ends up in AbstractClusterInvoker#invoke.
Template method
This is one of the most common design patterns: the template method. If you read source code often, you'll know how frequently it shows up.
The template method fixes the execution skeleton in an abstract class and defers the specific implementations to subclasses, which supply their own customized behaviour. In other words, individual steps can be re-implemented without changing the overall sequence. This reduces duplicated code, makes extension easier, and follows the open-closed principle.
In the code, doInvoke is implemented by the subclasses; the steps before it would otherwise be repeated in every subclass, so they are pulled up into the abstract class.
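In miniature, the pattern looks something like this; the type and method names are stand-ins loosely shaped like AbstractClusterInvoker, not the real signatures:

```java
import java.util.List;

// Template method in miniature: the abstract class pins down the calling skeleton,
// and each cluster mode (failover, failfast, ...) only fills in doInvoke.
abstract class ClusterInvokerTemplate<INVOKER, RESULT> {

    // The template method: a fixed skeleton that subclasses cannot rearrange.
    public final RESULT invoke(Object invocation) {
        List<INVOKER> invokers = list(invocation);            // step 1: ask the Directory for candidates
        Object loadBalance = selectLoadBalance(invocation);   // step 2: choose a LoadBalance strategy
        return doInvoke(invocation, invokers, loadBalance);   // step 3: the subclass-specific part
    }

    // The "hole" each subclass fills with its own fault-tolerance behaviour.
    protected abstract RESULT doInvoke(Object invocation, List<INVOKER> invokers, Object loadBalance);

    protected abstract List<INVOKER> list(Object invocation);

    protected abstract Object selectLoadBalance(Object invocation);
}
```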
Routing and load balancing to get an Invoker
First, list(invocation) obtains the list of Invokers for this invocation from the Directory; route filtering happens here, and this is also where the MockInvoker used for service degradation comes in.
Then a round of load balancing is performed over these Invokers to pick one. By default the cluster is FailoverClusterInvoker, the fault-tolerance mode that automatically fails over on error. Routing, cluster, and load balancing are actually independent modules; each has enough content for an article of its own, so here we'll treat them as black boxes.
To summarize, FailoverClusterInvoker gets the Invoker list returned from Directory and after routing, it tells LoadBalance to select an Invoker from the Invoker list.
Finally, FailoverClusterInvoker passes parameters to the invoke method of the selected Invoker instance to make a real remote call. Let’s briefly look at FailoverClusterInvoker#doInvoke. I cut out a lot of methods to highlight the point.
The invoke that initiates the call goes through the abstract class's invoke, which then calls the subclass's doInvoke. The method in the abstract class is very simple, so I won't show it.
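To get a feel for what the failover mode actually does, here is a self-contained sketch of the retry idea. The names and the trivial "load balancing" are illustrative; the real FailoverClusterInvoker also reselects invokers, distinguishes business exceptions, and records richer error information.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Failover in a nutshell: pick a candidate, call it, and on failure move on to
// another candidate until the retry budget runs out.
final class FailoverCaller<T, R> {
    private final int retries;

    FailoverCaller(int retries) { this.retries = retries; }

    R call(List<T> candidates, Function<T, R> invocation) {
        RuntimeException lastError = null;
        List<T> tried = new ArrayList<>();
        for (int attempt = 0; attempt <= retries; attempt++) {
            T target = pick(candidates, tried);   // stand-in for load balancing: first untried candidate
            if (target == null) {
                break;                            // nothing left to try
            }
            tried.add(target);
            try {
                return invocation.apply(target);  // success: return immediately
            } catch (RuntimeException e) {
                lastError = e;                    // remember the failure and retry on another candidate
            }
        }
        throw lastError != null ? lastError : new IllegalStateException("no candidates to invoke");
    }

    private T pick(List<T> candidates, List<T> tried) {
        for (T c : candidates) {
            if (!tried.contains(c)) {
                return c;
            }
        }
        return null;
    }
}
```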
Three ways to call
As you can see from the above code, there are three types of calls, oneway, asynchronous and synchronous.
Oneway is quite common: when you don't care whether the request was even sent successfully, you send it oneway. It has the least overhead.
For an async call: Dubbo is asynchronous by nature. When the client sends the request it gets a ResponseFuture back, which is wrapped and stashed in the context so the user can retrieve it later, do some other work, and then call future.get to wait for the result.
For a synchronous call, the Dubbo framework turns the asynchronous call into a synchronous one for us. As you can see from the code, future.get is called inside Dubbo's own source, so to the user it feels as if calling the interface method simply blocks until the result comes back. That is what makes it synchronous.
So Dubbo is asynchronous in nature and only appears synchronous because the framework blocks for you. The difference between synchronous and asynchronous is simply whether future.get is called in user code or in framework code.
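Side by side, the three styles look roughly like this; sendRequest stands in for the Netty client and the method names are illustrative, not Dubbo's real API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Sketch of the three call styles: oneway, async and sync.
class CallModes {

    CompletableFuture<Object> sendRequest(Object request) {
        return new CompletableFuture<>();       // would be completed when the response arrives
    }

    // oneway: fire and forget, nothing to wait on
    void onewayCall(Object request) {
        sendRequest(request);
    }

    // async: hand the future back (Dubbo stashes it in the context); the caller decides when to get()
    CompletableFuture<Object> asyncCall(Object request) {
        return sendRequest(request);
    }

    // sync: the framework itself blocks on the future, so the caller just sees a plain return value
    Object syncCall(Object request) throws Exception {
        return sendRequest(request).get(3, TimeUnit.SECONDS);
    }
}
```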
Back in the source code, currentClient.request does the following: it assembles the request, constructs a future, and then calls the NettyClient to send the request.
Let's look inside DefaultFuture. Have you ever wondered how, since everything is asynchronous, the response that comes back later manages to find the future that was saved for it?
Here’s the secret! It’s using a unique ID.
As you can see, each Request is assigned a globally unique ID, and the Future stores itself under that ID in a ConcurrentHashMap. The ID is sent to the server, and the server echoes it back in the response, so the matching future can be looked up in the ConcurrentHashMap and the request and response are paired correctly and completely.
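The bookkeeping fits in a few lines. Here is a sketch with hypothetical names; Dubbo's DefaultFuture does essentially this, plus timeout handling:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the id-to-future bookkeeping used to correlate requests and responses.
final class PendingFutures {
    private static final AtomicLong ID = new AtomicLong();
    private static final Map<Long, CompletableFuture<Object>> FUTURES = new ConcurrentHashMap<>();

    // Called on the client before the request is written out; the returned id travels with the request.
    static long register(CompletableFuture<Object> future) {
        long id = ID.incrementAndGet();
        FUTURES.put(id, future);
        return id;
    }

    // Called when a response carrying the same id comes back from the server.
    static void received(long id, Object result) {
        CompletableFuture<Object> future = FUTURES.remove(id);
        if (future != null) {
            future.complete(result);   // wakes up whoever is blocked on future.get()
        }
    }
}
```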
If we look at the code that finally receives the response, it should be clear.
Let's look at what a response message looks like:
Response [id=14, version=null, status=20, event=false, error=null, result=RpcResult [result=Hello world, response from provider: 192.168.1.17:20881, exception=null]]
You can see the ID it carries; the response ends up being handled by DefaultFuture#received.
And just to make it a little bit clearer, let me draw another picture:
At this point, the main process of client invocation is almost clear, but there are more details, which will be covered in a later article, otherwise it will be too messy.
The call chain that initiates the request is shown below:
The call chain that handles the request response is shown below:
Call flow – server-side source code analysis
After receiving a request, the server parses it into a message. The message then goes through a dispatcher, and there are five dispatch strategies:
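The five, as listed in Dubbo's documentation, are roughly:
all: every message, including requests, responses, connect/disconnect events, and heartbeats, is dispatched to the business thread pool.
direct: nothing is dispatched; everything runs directly on the I/O thread.
message: only request and response messages go to the business thread pool; other events stay on the I/O thread.
execution: only request messages go to the business thread pool; responses and other events stay on the I/O thread.
connection: connect/disconnect events are queued and handled in order on the I/O thread, while other messages are dispatched to the business thread pool.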
The default is all, which means that all messages are dispatched to the business thread pool. Let’s look at the implementation of AllChannelHandler.
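Boiled down, the idea is just this (a self-contained sketch with illustrative names, not the real AllChannelHandler):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// The "all" dispatch idea: every received message is wrapped in a task and handed
// to the business pool, so the Netty I/O thread is freed immediately and never runs user code.
class AllDispatcher {
    private final ExecutorService businessPool = Executors.newFixedThreadPool(200);
    private final RequestHandler handler;

    AllDispatcher(RequestHandler handler) {
        this.handler = handler;
    }

    void received(Object message) {
        businessPool.execute(() -> handler.handle(message));  // decoding/invoking happens on a business thread
    }

    interface RequestHandler {
        void handle(Object message);
    }
}
```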
The message is wrapped in a ChannelEventRunnable and thrown into the business thread pool for execution. ChannelEventRunnable calls the appropriate handler method based on the ChannelState, in this case ChannelState.RECEIVED, so handler.received is invoked, which eventually lands in HeaderExchangeHandler#handleRequest. Let's look at that code.
The key point here is that the constructed response carries the request's ID. Now let's see what reply does.
The final call is now clear: it goes through a Javassist-generated proxy class that wraps the actual implementation class. The key is the getInvoker method, so let's look at how the invoker is found from the information in the request.
The crux is the serviceKey. Remember that during service exposure the Invoker was wrapped into an Exporter, a serviceKey was built, and the Exporter was stored in the exporterMap under that key.
The Key looks like this:
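Roughly speaking it takes the form {group}/{interfaceName}:{version}:{port}, with the group and version parts omitted when they aren't set, so it may look like com.alibaba.dubbo.demo.DemoService:20880; the exact shape varies a little between Dubbo versions.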
Once the Invoker is found, the method on the actual implementation class is called and the response is returned; that's the end of the process. Let me add this to the diagram.
Conclusion
Let me summarize today's call process once more, and that will pretty much wrap things up.
First, the client calls a method on the interface, which actually goes to the proxy class. Via the cluster, the proxy gets a batch of Invokers (if there are any) from the Directory, router filtering is applied (a MockInvoker is added for service degradation), and a LoadBalance obtained through SPI performs a round of load balancing.
It is important to note that the default cluster is FailoverCluster, which performs fault-tolerant retry processing and will be examined in detail later.
Now that we have the Invoker for the remote service to be called, the request header is constructed according to the protocol, the parameters are serialized with the chosen serialization format and placed into the request body, and the remote call is initiated through the NettyClient.
After receiving the request, the server's NettyServer reads the information according to the protocol and deserializes it into an object, then dispatches the message according to the dispatch policy (all by default), throwing it to the business thread pool.
The business thread determines the message type, uses the serviceKey to look up the Exporter in the exporterMap built during service exposure, and then invokes the actual implementation class.
Because the response carries the same unique ID as the request, the client finds the stored Future by that ID, fills in the response, and wakes up the thread waiting on the Future, completing one remote call.
We also touched on the template method design pattern. Of course, many other patterns are hidden in there as well, such as chain of responsibility and decorator; I didn't point them out deliberately because they're simply everywhere in the source.
omg
After today's article I believe you have a better understanding of RPC calls; the earlier service exposure and service referencing both exist to make remote calls possible.
As for the codec part, I only described the protocol and didn't analyze it at the source level, because that code simply writes and reads bytes according to the protocol layout, which is fairly mechanical. Interested readers can study it on their own; the Buffer operations do require a bit of background knowledge.
The main line is basically done, but routing, clustering, and other mechanisms haven't been analyzed in detail yet, and they are also very important. Production environments are almost always clustered deployments; nobody runs a single instance, so Dubbo's support in this area matters a lot, and we'll analyze it later.
I'm Aobing; the more you know, the more you realize you don't know. Thank you, talented folks, for the likes, favorites, and comments; see you next time!
This article is continuously updated. Search "Santaizi Aobing" on WeChat to read it as soon as it's published; reply with [Information] to get the interview materials and resume templates I've prepared for top-tier companies.