Author’s brief introduction
I graduated with a bachelor’s degree in 2012 and a master’s degree in 2016. I have worked at the IBM China R&D Center, state-owned enterprises, Ant Financial, and other companies. I have been doing Java development for more than 10 years, and I now focus on systematically summarizing and sharing the knowledge a distributed application architect needs. I hope this is helpful to readers who want to learn and build up knowledge in this field systematically.
Writing is not easy. If you find this article useful, please give it a thumbs-up so that more readers who need it can find it!
RPC calls
RPC lets you call a remote function as if it were a local one. In a remote call, the body of the method we want to execute lives on a remote machine; for example, a function Multiply is executed in another process. This raises several new questions:
- Call ID mapping. How do we tell the remote machine that we want to call Multiply instead of Add or Sub? In a local call, the method body is identified directly by a function pointer; we call Multiply and the compiler resolves the call to the corresponding function pointer for us. In a remote call, however, function pointers are useless because the two processes have completely different address spaces. Therefore, in RPC, every method must have its own ID, and that ID must be unique and agreed on across all processes. The client must attach this ID when making a remote procedure call. We then need to maintain a {method <–> Call ID} table on both the client side and the server side. The two tables do not have to be identical, but the same method must map to the same Call ID on both sides. When the client makes a remote call, it looks up its table, finds the corresponding Call ID, and passes it to the server. The server looks up its own table to determine which method the client wants to call, and executes the code for that method.
- Serialization and deserialization. How does the client pass parameter values to a remote function? In a local call, we just push the arguments onto the stack and let the called function read them. In a remote procedure call, however, the client and server are different processes and cannot pass parameters through memory. Sometimes the client and server even use different languages (such as C++ on the server and Java or Python on the client). In this case, the client needs to convert the parameters into a byte stream and send it to the server, which then converts the byte stream back into a format it can read. This process is called serialization and deserialization. Similarly, the value returned by the server must be serialized on the server and deserialized on the client.
- Network transmission. Remote calls usually happen over a network that connects the client and the server. All data has to travel over that network, so there needs to be a network transport layer. The transport layer passes the Call ID and the serialized parameter bytes to the server, and then passes the serialized result of the call back to the client. Anything that can do both of these jobs can serve as the transport layer. The protocol it uses is therefore unrestricted, as long as it can complete the transfer. Most RPC frameworks use TCP, but UDP works as well, and gRPC uses HTTP/2. Java’s Netty is a framework that belongs to this layer.
With these three mechanisms, RPC can be implemented as follows:
// Client
1. Map this call to a Call ID. Here we assume the simplest approach: a string is used as the Call ID.
2. Serialize the Call ID and the arguments. Their values can be packed directly in binary form.
3. Send the packet from step 2 to ServerAddr through the network transport layer.
4. Wait for the server to return the result.
5. If the server call succeeded, deserialize the result.

// Server
1. Maintain a local mapping from Call ID to function pointer, call_id_map.
2. Wait for a request.
3. On receiving a request, deserialize its packet to get the Call ID.
4. Look up call_id_map to get the corresponding function pointer.
5. Deserialize the arguments and call the Multiply function locally to obtain the result.
6. Serialize the result and return it to the client over the network.
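The client and server steps above can be sketched in Java roughly as follows. This is only a minimal illustration under stated assumptions: the class `MiniRpcSketch`, the `Transport` interface, and the wire layout (a string Call ID, two `int` arguments, then a success flag and an `int` result) are all made up for this example, and the "network" is simulated by an in-process method call so that the three mechanisms stay visible.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

class MiniRpcSketch {

    // Hypothetical stand-in for the network transport layer: bytes in, bytes out.
    interface Transport {
        byte[] send(byte[] request) throws IOException;
    }

    // Server step 1: the Call ID -> function table (call_id_map in the steps above).
    static final Map<String, IntBinaryOperator> CALL_ID_MAP = new HashMap<>();
    static {
        CALL_ID_MAP.put("Multiply", (a, b) -> a * b);
        CALL_ID_MAP.put("Add", (a, b) -> a + b);
        CALL_ID_MAP.put("Sub", (a, b) -> a - b);
    }

    // Server steps 3-6: deserialize, look up the method, call it, serialize the result.
    static byte[] handleRequest(byte[] request) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(request));
        String callId = in.readUTF();
        int a = in.readInt();
        int b = in.readInt();

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        IntBinaryOperator method = CALL_ID_MAP.get(callId);
        if (method == null) {
            out.writeBoolean(false); // unknown Call ID: report failure
        } else {
            out.writeBoolean(true);
            out.writeInt(method.applyAsInt(a, b));
        }
        return buffer.toByteArray();
    }

    // Client steps 1-5: pick the Call ID, serialize, send, wait, deserialize.
    static int call(Transport transport, String callId, int a, int b) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeUTF(callId);
        out.writeInt(a);
        out.writeInt(b);

        byte[] response = transport.send(buffer.toByteArray());

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(response));
        if (!in.readBoolean()) {
            throw new IOException("Server does not know Call ID: " + callId);
        }
        return in.readInt();
    }

    public static void main(String[] args) throws IOException {
        // The "network" here is just a direct method call into the server code.
        Transport inProcess = MiniRpcSketch::handleRequest;
        System.out.println(call(inProcess, "Multiply", 6, 7));
    }
}
```

Swapping `inProcess` for an implementation that writes the same bytes to a socket would turn this into a real remote call without touching the Call ID table or the serialization code.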
So to implement an RPC framework, you basically only need to follow the process above.
In particular:
- Call ID mapping can use the function name as a string, or an integer ID. The mapping table is generally just a hash table.
- Serialization and deserialization can be written by hand, or handled by a library such as Hessian.
- The network transport layer can be written directly on top of sockets, or built with a library such as Netty.
Of course, some details remain to be filled in, such as how to handle network errors, how to prevent attacks, and how to do flow control. But on top of the architecture above, these can be added incrementally.
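To make the "write your own socket" option and two of these details a little more concrete, here is a rough sketch of a transport built on plain `java.net` sockets. It assumes a simple length-prefixed framing, which is one common choice rather than the only one. The class name `SocketTransportSketch`, the `MAX_FRAME_BYTES` limit, the 2-second timeout, and the echo handler are all hypothetical values picked for illustration. The read timeout is a very basic form of network error handling, and the frame size check is a very basic guard against malformed or hostile input.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

class SocketTransportSketch {

    // Assumed upper bound so a bogus length prefix cannot force a huge allocation.
    static final int MAX_FRAME_BYTES = 1 << 20;

    // One frame = 4-byte big-endian length followed by the payload bytes.
    static void writeFrame(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
    }

    static byte[] readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > MAX_FRAME_BYTES) {
            throw new IOException("Rejected frame of length " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;
    }

    // Server side: accept one connection, answer one request, then stop.
    static void serveOnce(ServerSocket server, UnaryOperator<byte[]> handler) {
        Thread thread = new Thread(() -> {
            try (Socket socket = server.accept();
                 DataInputStream in = new DataInputStream(socket.getInputStream());
                 DataOutputStream out = new DataOutputStream(socket.getOutputStream())) {
                writeFrame(out, handler.apply(readFrame(in)));
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        thread.setDaemon(true);
        thread.start();
    }

    // Client side: send one request and wait for the reply, but not forever.
    static byte[] send(int port, byte[] request, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            socket.setSoTimeout(timeoutMillis); // a blocked read fails with SocketTimeoutException
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            DataInputStream in = new DataInputStream(socket.getInputStream());
            writeFrame(out, request);
            return readFrame(in);
        }
    }

    // Starts a throwaway server on a free loopback port and does one round trip.
    static String roundTrip(String message) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            serveOnce(server, bytes ->
                    ("echo: " + new String(bytes, StandardCharsets.UTF_8))
                            .getBytes(StandardCharsets.UTF_8));
            byte[] reply = send(server.getLocalPort(),
                    message.getBytes(StandardCharsets.UTF_8), 2000);
            return new String(reply, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("ping"));
    }
}
```

A production transport would also need connection reuse, retries, and real flow control. A framework such as Netty exists largely so that this layer does not have to be written by hand.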
Earlier posts on this blog cover TCP transport, serialization, and deserialization in detail, for those who are interested.
Microservice invocation
In a microservice architecture, many services need to be invoked to complete a single function. How services call each other therefore becomes a key issue in microservice architecture.
There are two ways to invoke a service: one is RPC; the other is event-driven, that is, sending messages.
The RPC approach can be further divided into synchronous and asynchronous RPC calls. The selection principle is as follows:
- If the upstream caller of an RPC interface does not care about the interface's return value, an asynchronous RPC call can be used. That is, the downstream does not need to process the request in real time: on receiving it, the downstream immediately returns a success response to the upstream and does the actual processing afterward.
- If the upstream depends on the return value of the RPC interface, that is, the downstream needs to process the request and return the result right away, a synchronous RPC call is used.
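The difference between the two modes can be illustrated with a small in-process sketch. It assumes a hypothetical downstream service, `CallModeSketch`, with one synchronous method and one asynchronous method. No real network is involved, and a single-threaded `ExecutorService` stands in for the downstream's background processing.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical downstream service, used only to contrast the two call modes.
class CallModeSketch {

    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final List<String> processed = new CopyOnWriteArrayList<>();

    // Synchronous mode: the upstream needs the result, so it waits for the
    // downstream to finish the work and return the value.
    int multiplySync(int a, int b) {
        return a * b;
    }

    // Asynchronous mode: the downstream only acknowledges that it accepted the
    // request and does the real work later on its own thread.
    boolean recordAsync(String event) {
        worker.execute(() -> processed.add(event));
        return true; // "accepted", not "finished"
    }

    // Helper for the demo: wait for background work to drain, then show it.
    List<String> shutdownAndGetProcessed() throws InterruptedException {
        worker.shutdown();
        worker.awaitTermination(5, TimeUnit.SECONDS);
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        CallModeSketch downstream = new CallModeSketch();
        System.out.println("sync result: " + downstream.multiplySync(6, 7));
        System.out.println("async accepted: " + downstream.recordAsync("order-created"));
        System.out.println("processed later: " + downstream.shutdownAndGetProcessed());
    }
}
```

Note that the `true` returned by `recordAsync` says nothing about whether processing eventually succeeds, which is why this mode only fits callers that do not depend on the return value.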
Both asynchronous RPC and asynchronous messaging invoke services asynchronously. What is the difference between the two?
In most cases, the two can be used interchangeably. However, asynchronous RPC calls focus on one-to-one, point-to-point communication, while asynchronous messaging is better suited to one-to-many broadcast calls.
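A toy in-memory sketch of this contrast is shown below. The `Topic` class is a hypothetical stand-in for a message broker, and the service names are invented. To keep the example short, delivery happens synchronously on the caller's thread, so the sketch only shows the one-to-one versus one-to-many shape, not the asynchrony itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class FanOutSketch {

    // Hypothetical in-memory topic standing in for a message broker.
    static class Topic {
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        void subscribe(Consumer<String> subscriber) {
            subscribers.add(subscriber);
        }

        // The publisher does not know who is listening; every subscriber gets a copy.
        void publish(String message) {
            for (Consumer<String> subscriber : subscribers) {
                subscriber.accept(message);
            }
        }
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();

        // RPC style, one-to-one: the caller addresses exactly one known receiver.
        Consumer<String> inventoryService = msg -> log.add("inventory got " + msg);
        inventoryService.accept("order-1");

        // Messaging style, one-to-many: one publish reaches all current subscribers.
        Topic orderEvents = new Topic();
        orderEvents.subscribe(msg -> log.add("billing got " + msg));
        orderEvents.subscribe(msg -> log.add("shipping got " + msg));
        orderEvents.publish("order-2");

        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Adding a third consumer of order events only requires another `subscribe` call, whereas in the RPC style the caller itself would have to be changed to call the new service.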