GitHub github.com/wangzhiwubi…

The implementation principle of RPC

As discussed in the previous two lectures, RPC is designed to solve two problems. First, it lets services in a distributed system call one another. Second, it makes a remote call as convenient as a local one, so that the caller does not need to be aware of the remote-call logic. Take Calculator as the example again: if the implementation class CalculatorImpl lives in the caller's own process, the caller simply invokes it directly:
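The listing for the local case is not present in this copy of the article. As a minimal sketch, assuming Calculator declares a single add method (the real interface probably has more operations) and using a hypothetical LocalCallDemo class, a local call could look like this:

```java
// Assumed shape of the interface; the article's own listing is not shown.
interface Calculator {
    int add(int a, int b);
}

// Local implementation that lives in the caller's own address space.
class CalculatorImpl implements Calculator {
    @Override
    public int add(int a, int b) {
        return a + b;
    }
}

// Hypothetical demo class, not part of the original article.
class LocalCallDemo {
    static int run() {
        Calculator calculator = new CalculatorImpl();
        // An ordinary in-process method call: no network, no serialization.
        return calculator.add(1, 2);
    }
}
```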

Now that the system is distributed, CalculatorImpl and the caller are no longer in the same address space, so a remote procedure call must be made:

So how is a remote procedure call (RPC) implemented? A complete RPC flow can be described by the following diagram:

The Client on the left corresponds to Service A, and the Server on the right corresponds to Service B. Let’s explain it step by step.

  • In the application-layer code of Service A, the add method of a Calculator implementation class is called to perform an addition.
  • This Calculator implementation class does not itself implement the addition, subtraction, multiplication and division logic. Instead, it obtains the result by remotely calling the RPC interface of Service B, which is why it is called a Stub.
  • How does the Stub establish remote communication with Service B? A run-time library provides the remote communication functions, for example the Java Socket library. You can also use HttpClient, which is based on the HTTP protocol, or any other communication utility class. RPC does not prescribe which protocol to use.
  • The Stub establishes communication with Service B by invoking methods of the communication utility, and then sends the request data to Service B. Because the underlying network communication is binary, the data the Stub passes to the communication class must also be binary. For a call such as Calculator.add(1, 2), the parameters 1 and 2 are put into a Request object (together with other information, such as which RPC interface of which service to call), which is then serialized to binary and passed to the communication utility class, as shown in the code implementation below.
  • The binary data arrives at Service B, which has its own communication utility for receiving binary requests.
  • Because the data is binary, Service B deserializes it into a request object, which is then handed to the Stub of Service B for processing.
  • Like the Stub in Service A, this is also a Stub. It parses the request object to find out which RPC interface the caller is invoking, and then passes the parameters to that RPC interface, that is, to the actual Calculator implementation class, for execution. In Java, this is naturally done with reflection.
  • After the RPC interface has executed, the result must travel back to Service A, so the roles are reversed: Service B acts as the Client and Service A as the Server. Service B serializes the execution result -> transfers it to Service A -> Service A deserializes the execution result -> returns the result to the Application.
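The serialize and deserialize steps in the list above can be sketched with Java's built-in object serialization. This is only an illustration: the fields of CalculateRpcRequest are an assumption inferred from the getMethod(), getA() and getB() calls in the server code later in this article, and RequestCodec is a hypothetical helper that does not appear in the original.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Assumed shape of the request: which method to call, plus its two arguments.
class CalculateRpcRequest implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String method;
    private final int a;
    private final int b;

    CalculateRpcRequest(String method, int a, int b) {
        this.method = method;
        this.a = a;
        this.b = b;
    }

    String getMethod() { return method; }
    int getA() { return a; }
    int getB() { return b; }
}

// Hypothetical helper showing the object <-> binary conversion on its own.
class RequestCodec {
    // Client side: turn the request object into bytes for the network.
    static byte[] serialize(Object request) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(request);
        }
        return bytes.toByteArray();
    }

    // Server side: turn the received bytes back into a request object.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Simulates the trip over the wire without opening a socket.
    static CalculateRpcRequest roundTrip(CalculateRpcRequest request) {
        try {
            return (CalculateRpcRequest) deserialize(serialize(request));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }
}
```

In the socket-based code of this article the byte array never appears explicitly, because ObjectOutputStream and ObjectInputStream are wrapped directly around the socket streams.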

The theory is done, it’s time to put it into practice.

Turn theory into practice

First, here is how the application layer on the Client side initiates an RPC, ComsumerApp:

public class ComsumerApp {
    public static void main(String[] args) {
        Calculator calculator = new CalculatorRemoteImpl();
        int result = calculator.add(1, 2);
    }
}

With CalculatorRemoteImpl, we encapsulate the RPC logic so that the client is spared the hassle of making a remote call. Now let's look at CalculatorRemoteImpl. There is a fair amount of code, but it is simply steps 2, 3 and 4 above expressed in code, CalculatorRemoteImpl:

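import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.Socket;
import java.util.List;

public class CalculatorRemoteImpl implements Calculator {
    // Assumed value: it must match the port the provider listens on (9090 in ProviderApp).
    public static final int PORT = 9090;

    public int add(int a, int b) {
        List<String> addressList = lookupProviders("Calculator.add");
        String address = chooseTarget(addressList);
        // try-with-resources closes the socket even when the call fails
        try (Socket socket = new Socket(address, PORT)) {
            // Serialize the request
            CalculateRpcRequest calculateRpcRequest = generateRequest(a, b);
            ObjectOutputStream objectOutputStream = new ObjectOutputStream(socket.getOutputStream());

            // Send the request to the provider
            objectOutputStream.writeObject(calculateRpcRequest);
            objectOutputStream.flush();

            // Deserialize the response
            ObjectInputStream objectInputStream = new ObjectInputStream(socket.getInputStream());
            Object response = objectInputStream.readObject();

            if (response instanceof Integer) {
                return (Integer) response;
            } else {
                throw new InternalError();
            }
        } catch (Exception e) {
            log.error("fail", e);
            throw new InternalError();
        }
    }

    // lookupProviders, chooseTarget, generateRequest and the log field are not shown in this listing.
}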

The first two lines of the add method, lookupProviders and chooseTarget, can be confusing.

In distributed applications, a service may have multiple instances. Service B, for example, might have instances at the IP addresses 198.168.1.11 and 198.168.1.13. lookupProviders looks up the list of instances of the service to invoke. A distributed application usually has a service registry that provides this ability to query the instance list.

Once the list of instances has been found, which instance should be called? That is the job of chooseTarget, which is essentially a load balancing policy.

Since we are only implementing a simple RPC here, we leave the service registry and load balancing out for now, so the code is hard-coded to return the IP address 127.0.0.1.
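The two helpers are not listed in this copy of the article. A minimal sketch of what such hard-coded versions could look like follows; the class name HardCodedDiscovery and the method bodies are assumptions, and only the method names and the 127.0.0.1 address come from the text.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical holder class; in the article these helpers belong to CalculatorRemoteImpl.
class HardCodedDiscovery {
    // Stand-in for a registry lookup: every service resolves to the local machine.
    static List<String> lookupProviders(String name) {
        return Collections.singletonList("127.0.0.1");
    }

    // Stand-in for load balancing: always pick the first (and only) address.
    static String chooseTarget(List<String> providers) {
        if (providers == null || providers.isEmpty()) {
            throw new IllegalArgumentException("no providers available");
        }
        return providers.get(0);
    }
}
```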

As the code continues, we use sockets for remote communication, and writeObject of ObjectOutputStream and readObject of ObjectInputStream for serialization and deserialization.

Finally, let’s look at the server-side implementation, which is very similar to the client-side implementation, ProviderApp:

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class ProviderApp {
    private Calculator calculator = new CalculatorImpl();

    public static void main(String[] args) throws IOException {
        new ProviderApp().run();
    }

    private void run() throws IOException {
        ServerSocket listener = new ServerSocket(9090);
        try {
            while (true) {
                Socket socket = listener.accept();
                try {
                    // Deserialize the request
                    ObjectInputStream objectInputStream = new ObjectInputStream(socket.getInputStream());
                    Object object = objectInputStream.readObject();

                    log.info("request is {}", object);

                    // Invoke the service
                    int result = 0;
                    if (object instanceof CalculateRpcRequest) {
                        CalculateRpcRequest calculateRpcRequest = (CalculateRpcRequest) object;
                        if ("add".equals(calculateRpcRequest.getMethod())) {
                            result = calculator.add(calculateRpcRequest.getA(), calculateRpcRequest.getB());
                        } else {
                            throw new UnsupportedOperationException();
                        }
                    }

                    // Serialize the result and send it back
                    ObjectOutputStream objectOutputStream = new ObjectOutputStream(socket.getOutputStream());
                    objectOutputStream.writeObject(Integer.valueOf(result));
                    objectOutputStream.flush();
                } catch (Exception e) {
                    log.error("fail", e);
                } finally {
                    socket.close();
                }
            }
        } finally {
            listener.close();
        }
    }

    // The log field is not shown in this listing.
}

The Server side uses the accept method of ServerSocket to receive requests from the Client side. It then deserializes the request -> executes it -> serializes the execution result, and finally sends the result back to the Client in binary form.

In this way we have implemented a crude and detailed RPC.

It’s crude because this implementation really sucks, and why it sucks will be explained in the next section. It is detailed because it demonstrates the execution process of an RPC step by step, which is convenient for everyone to understand the internal mechanism of RPC.

Why does this RPC implementation suck

This RPC implementation is just to show you how RPC works. If you want to use it in a production environment, it will never work.

1. Lack of generality. I wrote a CalculatorRemoteImpl for the Calculator interface in order to call Calculator remotely. If another interface needs remote calls next time, do I have to write another remote-call implementation class for it? That would be very inconvenient.

So how to solve it? Let’s take a look at how RPC calls are implemented using Dubbo:

@Reference
private Calculator calculator;
...
calculator.add(1, 2);
...

Dubbo integrates with Spring. When the Spring container initializes and scans an object annotated with @Reference, it generates a proxy object for that object, which takes care of the remote communication, and then puts the proxy object into the container. The calculator that the code uses at run time is therefore the proxy object. We can skip the Spring integration for now, which means doing without dependency injection, but how can we avoid writing proxy objects by hand, as Dubbo does? All remote calls have to follow one template: put the remote call information into an RpcRequest object and send it to the Server. After parsing it, the Server knows which RPC interface is being called, the types of the input parameters, and their values. This is just like Dubbo's RpcInvocation:

public class RpcInvocation implements Invocation, Serializable {

    private static final long serialVersionUID = -4355285085441097045L;

    private String methodName;

    private Class<?>[] parameterTypes;

    private Object[] arguments;

    private Map<String, String> attachments;

    private transient Invoker<?> invoker;

    // ...
}
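One way to avoid hand-written stubs, sketched here under assumptions, is a JDK dynamic proxy on the client and reflection on the server. This is not Dubbo's implementation. RpcRequest, RpcProxyFactory, ReflectiveDispatcher and ProxyDemo are hypothetical names, the Calculator interface is assumed to have a single add method, and the transport function stands in for "serialize, send over the network, read the response".

```java
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Function;

interface Calculator {
    int add(int a, int b);
}

class CalculatorImpl implements Calculator {
    @Override
    public int add(int a, int b) { return a + b; }
}

// Hypothetical generic request, loosely modeled on the fields of RpcInvocation.
class RpcRequest implements Serializable {
    private static final long serialVersionUID = 1L;
    final String interfaceName;
    final String methodName;
    final Class<?>[] parameterTypes;
    final Object[] arguments;

    RpcRequest(String interfaceName, String methodName, Class<?>[] parameterTypes, Object[] arguments) {
        this.interfaceName = interfaceName;
        this.methodName = methodName;
        this.parameterTypes = parameterTypes;
        this.arguments = arguments;
    }
}

class RpcProxyFactory {
    // Client side: one factory produces a stub for any interface.
    // Every call on the stub, including toString and hashCode, becomes an RpcRequest.
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> iface, Function<RpcRequest, Object> transport) {
        InvocationHandler handler = (proxy, method, args) -> transport.apply(
                new RpcRequest(iface.getName(), method.getName(), method.getParameterTypes(), args));
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}

class ReflectiveDispatcher {
    // Server side: find the method named in the request and invoke it reflectively.
    static Object dispatch(Object target, RpcRequest request) {
        try {
            Method method = target.getClass().getMethod(request.methodName, request.parameterTypes);
            return method.invoke(target, request.arguments);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("dispatch failed", e);
        }
    }
}

class ProxyDemo {
    static int run() {
        Calculator target = new CalculatorImpl();
        // The "network" here is a direct in-process hand-off to the dispatcher.
        Calculator stub = RpcProxyFactory.create(
                Calculator.class, request -> ReflectiveDispatcher.dispatch(target, request));
        return stub.add(1, 2);
    }
}
```

With this shape, supporting a second interface needs no new stub class: the caller only asks the factory for a proxy of that interface.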

2. Integrating with Spring. Once the proxy objects have been generalized, the next step is to consider integrating with Spring's IoC functionality so that the proxy objects are created through Spring, which requires some knowledge of Spring bean initialization.

3. Long-lived or short-lived connections. We cannot open a new Socket connection for every RPC call, can we? Could we keep several long-lived connections and, whenever an RPC request comes in, put the request on a task queue for a thread pool to consume? This is just a thought; you can look at how Dubbo implements it later.
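As a rough illustration of the connection-reuse part of this idea only (no task queue, no connection pool), the sketch below sends several requests over a single socket. PersistentConnection and PersistentConnectionDemo are hypothetical names, and the toy server that replies with request + 1 exists purely so the example can run on its own.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical client that opens one socket and reuses it for every call.
class PersistentConnection implements AutoCloseable {
    private final Socket socket;
    private final ObjectOutputStream out;
    private final ObjectInputStream in;

    PersistentConnection(String host, int port) throws IOException {
        socket = new Socket(host, port);
        out = new ObjectOutputStream(socket.getOutputStream());
        out.flush(); // send the stream header so the peer's ObjectInputStream can open
        in = new ObjectInputStream(socket.getInputStream());
    }

    // One request/response pair at a time per connection, hence synchronized.
    synchronized Object call(Object request) throws IOException, ClassNotFoundException {
        out.writeObject(request);
        out.reset(); // forget already-written objects so the stream state does not grow
        out.flush();
        return in.readObject();
    }

    @Override
    public void close() throws IOException {
        socket.close();
    }
}

class PersistentConnectionDemo {
    // Toy server for the demo: serves a single connection, replying request + 1.
    static void serveOneConnection(ServerSocket listener) {
        try (Socket s = listener.accept()) {
            ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream());
            out.flush();
            ObjectInputStream in = new ObjectInputStream(s.getInputStream());
            while (true) {
                int request = (Integer) in.readObject();
                out.writeObject(request + 1);
                out.flush();
            }
        } catch (Exception e) {
            // The loop ends when the client closes the connection.
        }
    }

    static int run() {
        try (ServerSocket listener = new ServerSocket(0)) { // port 0: any free port
            Thread server = new Thread(() -> serveOneConnection(listener));
            server.setDaemon(true);
            server.start();
            int sum = 0;
            try (PersistentConnection conn =
                         new PersistentConnection("127.0.0.1", listener.getLocalPort())) {
                for (int i = 0; i < 3; i++) {
                    sum += (Integer) conn.call(i); // three calls over the same socket
                }
            }
            return sum; // (0 + 1) + (1 + 1) + (2 + 1)
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A production design would also need request identifiers so that several calls can be in flight on one connection at once; the synchronized method here simply serializes them.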

4. Server-side thread pool. Our Server is currently single-threaded: it must finish processing one request before it can accept the connection of another socket, so performance is bound to be poor. Could we use a thread pool to process multiple RPC requests at the same time? Again, this is just an idea.
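A minimal sketch of that idea, assuming a fixed-size pool from java.util.concurrent: the accept loop only hands each socket to a worker, so a slow request no longer blocks new connections. PooledRpcServer and PooledServerDemo are hypothetical names, and the handler function stands in for the calculator dispatch logic.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.function.Function;

class PooledRpcServer implements AutoCloseable {
    private final ServerSocket listener;
    private final ExecutorService workers;
    private final Function<Object, Object> handler;

    PooledRpcServer(int port, int threads, Function<Object, Object> handler) throws IOException {
        this.listener = new ServerSocket(port);
        this.workers = Executors.newFixedThreadPool(threads);
        this.handler = handler;
    }

    int port() { return listener.getLocalPort(); }

    // Runs the accept loop on its own thread so the caller is not blocked.
    void start() {
        Thread acceptor = new Thread(() -> {
            while (!listener.isClosed()) {
                try {
                    Socket socket = listener.accept();
                    workers.submit(() -> handle(socket)); // processing happens on a pool thread
                } catch (IOException | RejectedExecutionException e) {
                    // Expected while the server is closing; the loop condition then ends the loop.
                }
            }
        });
        acceptor.setDaemon(true);
        acceptor.start();
    }

    private void handle(Socket socket) {
        try (Socket s = socket) {
            ObjectInputStream in = new ObjectInputStream(s.getInputStream());
            Object response = handler.apply(in.readObject());
            ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream());
            out.writeObject(response);
            out.flush();
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        }
    }

    @Override
    public void close() throws IOException {
        listener.close();
        workers.shutdown();
    }
}

class PooledServerDemo {
    static Object call(int port, Object request) throws Exception {
        try (Socket socket = new Socket("127.0.0.1", port)) {
            ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream());
            out.writeObject(request);
            out.flush();
            return new ObjectInputStream(socket.getInputStream()).readObject();
        }
    }

    static Object run() {
        // Port 0 asks the OS for any free port; the handler adds 1 to the request.
        try (PooledRpcServer server = new PooledRpcServer(0, 4, req -> ((Integer) req) + 1)) {
            server.start();
            return call(server.port(), 41);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```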

5. Service registry. As mentioned earlier, to invoke a service you first need a service registry that tells you which instances the other service has. Dubbo's service registry is configurable, and ZooKeeper is the officially recommended option. If you use ZooKeeper, how to register an instance and how to fetch the instance list both need to be implemented.
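To make the register and lookup operations concrete, here is an in-memory stand-in. It does not use ZooKeeper or any Dubbo API; InMemoryRegistry and its method names (apart from lookupProviders, which the article's client code calls) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical single-process registry: service name -> addresses of live instances.
class InMemoryRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();

    // Called by a provider when it starts.
    void register(String service, String address) {
        instances.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    // Called by a provider when it stops.
    void unregister(String service, String address) {
        List<String> addresses = instances.get(service);
        if (addresses != null) {
            addresses.remove(address);
        }
    }

    // Called by a consumer before each call; returns a snapshot copy.
    List<String> lookupProviders(String service) {
        return new ArrayList<>(instances.getOrDefault(service, Collections.emptyList()));
    }
}
```

A real registry runs as a separate service and also has to notice instances that die without unregistering, which this sketch does not attempt.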

6. Load balancing. How do you pick one instance to call out of several? That is where load balancing comes in. There is certainly more than one load balancing policy, so how do you make the policy configurable, and how do you implement each strategy? Dubbo's load balancing is again a useful reference.
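One common way to make the policy swappable is a small strategy interface. The sketch below shows two simple policies, random and round-robin; the LoadBalancer interface and both class names are hypothetical and are not Dubbo's API. Both implementations assume a non-empty provider list.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical strategy interface: swapping the implementation changes the policy.
interface LoadBalancer {
    String chooseTarget(List<String> providers);
}

// Picks a uniformly random instance on every call.
class RandomLoadBalancer implements LoadBalancer {
    @Override
    public String chooseTarget(List<String> providers) {
        return providers.get(ThreadLocalRandom.current().nextInt(providers.size()));
    }
}

// Cycles through the instances in order; the counter is shared across threads.
class RoundRobinLoadBalancer implements LoadBalancer {
    private final AtomicInteger next = new AtomicInteger();

    @Override
    public String chooseTarget(List<String> providers) {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), providers.size());
        return providers.get(index);
    }
}
```

Which implementation to instantiate could then be read from configuration, so the calling code depends only on the interface.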

7. Result caching. Does every call to a query interface really have to go to the Server? Should caching be supported?
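A minimal sketch of client-side result caching, assuming the cached calls are pure queries whose results never change: identical calls are answered from a local map instead of going to the Server. CachingInvoker and CachingDemo are hypothetical names, the remoteCall function stands in for the real network round trip, and there is no expiry or invalidation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;

class CachingInvoker {
    private final Map<List<Object>, Object> cache = new ConcurrentHashMap<>();
    private final BiFunction<String, Object[], Object> remoteCall;

    CachingInvoker(BiFunction<String, Object[], Object> remoteCall) {
        this.remoteCall = remoteCall;
    }

    Object invoke(String method, Object... args) {
        // Cache key: the method name followed by the argument values.
        List<Object> key = new ArrayList<>();
        key.add(method);
        key.addAll(Arrays.asList(args));
        // Only the first call with a given key reaches the remote side.
        return cache.computeIfAbsent(key, k -> remoteCall.apply(method, args));
    }
}

class CachingDemo {
    // Returns {first result, second result, number of remote calls made}.
    static int[] run() {
        AtomicInteger remoteCalls = new AtomicInteger();
        CachingInvoker invoker = new CachingInvoker((method, args) -> {
            remoteCalls.incrementAndGet(); // counts trips to the "server"
            return ((Integer) args[0]) + ((Integer) args[1]);
        });
        int first = (Integer) invoker.invoke("Calculator.add", 1, 2);
        int second = (Integer) invoker.invoke("Calculator.add", 1, 2); // served from the cache
        return new int[] {first, second, remoteCalls.get()};
    }
}
```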

8. Multi-version control. When a server interface is modified, what happens to the old interface?

9. Asynchronous calls. After calling an interface, the client may not want to wait for the server to return and may want to do something else in the meantime. Can that be supported?
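One possible shape for this, sketched with the JDK's CompletableFuture: the blocking RPC runs on a separate thread and the caller receives a future that it can wait on later. AsyncCaller and AsyncDemo are hypothetical names, and the lambda 1 + 2 stands in for a real remote call such as calculator.add(1, 2).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

class AsyncCaller implements AutoCloseable {
    // Threads that sit blocked on the network on behalf of the caller.
    private final ExecutorService ioThreads = Executors.newCachedThreadPool();

    // Starts the blocking call in the background and returns immediately.
    <T> CompletableFuture<T> callAsync(Supplier<T> blockingRpc) {
        return CompletableFuture.supplyAsync(blockingRpc, ioThreads);
    }

    @Override
    public void close() {
        ioThreads.shutdown();
    }
}

class AsyncDemo {
    static int run() {
        try (AsyncCaller caller = new AsyncCaller()) {
            CompletableFuture<Integer> pending = caller.callAsync(() -> 1 + 2);
            int localWork = 10 * 10;             // the caller keeps working in the meantime
            return localWork + pending.join();   // block only when the result is needed
        }
    }
}
```

This still occupies one thread per outstanding call; a framework built on non-blocking I/O can avoid that, which is beyond this sketch.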

10. Server failure. If the server goes down, how should requests that have not finished processing be handled?
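For a planned shutdown (as opposed to a crash), one option is to stop taking new work and give in-flight requests a bounded amount of time to finish. The sketch below assumes requests are processed on an ExecutorService; GracefulShutdown and GracefulShutdownDemo are hypothetical names, and the sleeping tasks stand in for RPC requests still being processed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GracefulShutdown {
    // Returns true if every in-flight task finished within the timeout.
    static boolean drain(ExecutorService workers, long timeout, TimeUnit unit) {
        workers.shutdown(); // reject new tasks, keep running the ones already submitted
        try {
            if (workers.awaitTermination(timeout, unit)) {
                return true;
            }
            workers.shutdownNow(); // out of time: interrupt whatever is still running
            return false;
        } catch (InterruptedException e) {
            workers.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

class GracefulShutdownDemo {
    // Returns the number of requests that completed, or -1 if draining timed out.
    static int run() {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < 3; i++) {
            workers.submit(() -> {
                try {
                    Thread.sleep(50); // simulated request processing
                } catch (InterruptedException e) {
                    return;
                }
                completed.incrementAndGet();
            });
        }
        boolean drained = GracefulShutdown.drain(workers, 5, TimeUnit.SECONDS);
        return drained ? completed.get() : -1;
    }
}
```

Such a drain could be triggered from a JVM shutdown hook registered with Runtime.getRuntime().addShutdownHook. A crash is a different problem: the client then needs timeouts and possibly retries.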

There are many more optimizations of this kind, which is why implementing a high-performance, highly available RPC framework is so difficult. Fortunately, there are now many excellent RPC frameworks to refer to, and we can learn from the wisdom of those who came before.

Original on GitHub: https://github.com/wangzhiwubigdata/God-Of-BigData