Remote Procedure Call (RPC) is a communication protocol that allows a program running on one computer to call a subroutine on another computer without the programmer having to code the details of the remote interaction explicitly. The primary goal of RPC is to make distributed applications easier to build: it provides powerful remote invocation while keeping the semantic simplicity of a local call.

A while ago, in my spare time during an internship, I implemented a lightweight distributed RPC framework called Buddha (https://github.com/tinylcy/buddha). The codebase is small but complete: as the saying goes, a sparrow may be small, but it has all its vital organs. This article explains Buddha's design by breaking the framework down into its components and walking through the factors to consider, step by step.

Serialization and deserialization

On a network, all data is transmitted as bytes, so at the code level an RPC framework must convert data in a given format to and from byte arrays. Java already provides a default serialization mechanism, but native Java serialization can become a performance bottleneck in high-concurrency scenarios. As a result, many efficient open-source serialization frameworks have emerged, such as Kryo, Fastjson, and Protobuf. Buddha currently supports both Kryo and Fastjson.
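To make the idea concrete, here is a minimal sketch of a pluggable serializer abstraction. The `Serializer` interface and `JdkSerializer` class are hypothetical names chosen for illustration, not Buddha's actual API, and the implementation uses Java's built-in serialization only so that the example has no external dependencies. A Kryo- or Fastjson-backed implementation would presumably sit behind the same kind of interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical abstraction: the framework only sees byte arrays,
// so the serialization library behind it can be swapped.
interface Serializer {
    byte[] serialize(Object obj);

    <T> T deserialize(byte[] bytes, Class<T> type);
}

// Illustration only: backed by native Java serialization, which is the
// slow default the article mentions, not by Kryo or Fastjson.
final class JdkSerializer implements Serializer {

    @Override
    public byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj); // obj must implement java.io.Serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    @Override
    public <T> T deserialize(byte[] bytes, Class<T> type) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return type.cast(in.readObject());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("unknown class in serialized data", e);
        }
    }
}
```

A caller would write something like `byte[] bytes = new JdkSerializer().serialize(request)` before sending and `deserialize(bytes, Request.class)` after receiving, where `Request` stands for whatever serializable request type the framework defines.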

TCP packet splitting and sticking

TCP deals only in byte streams and knows nothing about the message format used by the layer above it. If the client application sends a large message in one go, TCP may split it across several segments, so the server has to stitch the pieces back together into one complete message. (TCP already guarantees the order of the bytes; what gets lost is the message boundary.) If the client sends small messages, TCP may not transmit each one immediately. Instead, it can hold the data in its send buffer and transmit it once enough has accumulated, so several messages arrive stuck together and the server has to split them apart again.

The analysis above explains why TCP data gets split or stuck together. The key to solving the problem is to add boundary information to each message. There are three common methods.

  • The sender adds a header to each message that contains at least the message length. The receiver reads the length from the header and then knows how many bytes of payload belong to that message.

  • The sender pads each message to a fixed length (filling the unused space with zeros), and the receiver reads the agreed number of bytes for each message.

  • A special delimiter marks the end of each message, and the receiver splits the incoming stream on that delimiter.

Buddha takes the first approach to handle TCP packet splitting and sticking.
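The length-header approach can be sketched as follows. This is a minimal illustration, assuming a 4-byte big-endian length field followed by the payload; the class name `LengthPrefixFraming` and the exact header layout are assumptions made for the example, not Buddha's actual wire format.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: frames are [4-byte length][payload].
final class LengthPrefixFraming {
    static final int HEADER_BYTES = 4;

    // Sender side: prepend the payload length.
    static byte[] encode(byte[] payload) {
        return ByteBuffer.allocate(HEADER_BYTES + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Receiver side: extract every complete frame from a buffer that is in
    // read mode. Bytes of an incomplete trailing frame stay in the buffer,
    // so the caller can compact() it and append the next chunk.
    static List<byte[]> decode(ByteBuffer in) {
        List<byte[]> frames = new ArrayList<>();
        while (in.remaining() >= HEADER_BYTES) {
            in.mark();
            int length = in.getInt();
            if (length < 0) {
                throw new IllegalArgumentException("negative frame length: " + length);
            }
            if (in.remaining() < length) {
                in.reset(); // frame was split; wait for more bytes
                break;
            }
            byte[] payload = new byte[length];
            in.get(payload);
            frames.add(payload);
        }
        return frames;
    }
}
```

The receiver keeps one accumulation buffer per connection: append each chunk read from the socket, `flip()`, call `decode`, then `compact()`. A chunk that contains one and a half frames yields one frame now and the second one after the next chunk arrives, which covers both the stuck-together case and the split case.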

BIO and NIO

BIO (blocking I/O) is typically used in the classic thread-per-connection model. Multiple threads are needed because calls such as accept(), read(), and write() block synchronously: in a single-threaded application, a thread blocked on I/O stalls the whole application even though the CPU is actually idle. With multiple threads, the CPU can serve other threads while one is blocked, which improves CPU utilization. With a large number of active threads, however, the multithreaded model brings the following problems.

  • Threads are expensive to create and destroy. On Linux, a thread is essentially a lightweight process, so creating and destroying one is a heavyweight operation.

  • In the JVM, each thread reserves its own stack space, and the memory available to the JVM is limited, so too many threads consume too many resources by themselves.

  • Thread switching is expensive. Each switch involves saving and restoring context as well as switching between user mode and kernel mode. If there are too many threads, a large share of CPU time is spent on switching.

The first two problems can be mitigated with a thread pool, but the overhead of thread switching remains, so traditional BIO falls short in high-concurrency scenarios. The key feature of NIO (non-blocking I/O) is that the read, write, register, and accept calls do not block while waiting for a channel to become ready; they return immediately, which lets us make full use of the CPU without many threads. If a connection is not ready to read or write, the program records the event it is interested in and switches to another connection that is ready. Buddha uses Netty, which makes it possible to write NIO programs with a clearer structure.
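As a rough illustration of the readiness-based model, the sketch below runs an echo server on a single thread with the JDK's `java.nio` selector API. It deliberately uses raw `java.nio` rather than Netty to stay dependency-free, and the class and method names (`NioEchoSketch`, `handleReady`, `roundTrip`) are hypothetical. It also ignores write backpressure, which a real server would handle by registering interest in `OP_WRITE`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

final class NioEchoSketch {

    // One pass of the event loop: serve only the channels that are ready.
    // Returns the number of bytes echoed during this pass.
    static int handleReady(Selector selector) throws IOException {
        int echoed = 0;
        selector.select(1000); // wait up to 1 s for any channel to become ready
        Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
        while (keys.hasNext()) {
            SelectionKey key = keys.next();
            keys.remove();
            if (key.isAcceptable()) {
                SocketChannel conn = ((ServerSocketChannel) key.channel()).accept();
                if (conn != null) {
                    conn.configureBlocking(false);
                    conn.register(selector, SelectionKey.OP_READ); // record interest
                }
            } else if (key.isReadable()) {
                SocketChannel conn = (SocketChannel) key.channel();
                ByteBuffer buf = ByteBuffer.allocate(1024);
                int n = conn.read(buf); // returns immediately, never blocks
                if (n < 0) {
                    conn.close(); // peer closed the connection
                    continue;
                }
                buf.flip();
                while (buf.hasRemaining()) {
                    conn.write(buf); // sketch: no OP_WRITE backpressure handling
                }
                echoed += n;
            }
        }
        return echoed;
    }

    // Demo: sends a message through the echo server and returns the reply.
    // Client and server share the calling thread; no extra threads are created.
    static String roundTrip(String message) {
        byte[] out = message.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(out));
                int echoed = 0;
                while (echoed < out.length) {
                    echoed += handleReady(selector);
                }
                ByteBuffer in = ByteBuffer.allocate(out.length);
                while (in.hasRemaining() && client.read(in) >= 0) {
                    // keep reading until the whole reply has arrived
                }
                return new String(in.array(), StandardCharsets.UTF_8);
            } finally {
                for (SelectionKey key : selector.keys()) {
                    key.channel().close(); // release accepted connections
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `NioEchoSketch.roundTrip("ping")` exercises the accept and read paths once each. The point of the design is that `handleReady` never waits on any single connection, so one thread can multiplex many of them.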

Service registration and discovery

In practice, RPC service providers are usually deployed as a cluster to keep services stable and reliable, which calls for a service registry. Each service provider registers the address of its currently available service with the registry. When a client makes a remote call, it first asks the registry for the list of currently available providers, then picks a specific provider's address (load balancing can be applied at this stage), and finally calls the provider at that address. The client can cache the list of available services, and the registry notifies the client when the list changes. The registry also needs to be notified when a service provider becomes unavailable. Buddha uses ZooKeeper for service registration and discovery.
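The register, discover, and select steps can be sketched with an in-memory stand-in for the registry. This is not how Buddha talks to ZooKeeper: the `InMemoryRegistry` class, its method names, and the round-robin policy are all assumptions made for illustration, and change notifications to clients (which ZooKeeper provides through watches) are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory registry: service name -> provider addresses.
final class InMemoryRegistry {
    private final Map<String, List<String>> providers = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    // Provider side: announce an available address for a service.
    void register(String service, String address) {
        providers.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    // Called when a provider goes away (with ZooKeeper, an ephemeral
    // node disappearing would play this role).
    void unregister(String service, String address) {
        List<String> addresses = providers.get(service);
        if (addresses != null) {
            addresses.remove(address);
        }
    }

    // Client side: snapshot of the currently available providers.
    List<String> discover(String service) {
        return new ArrayList<>(providers.getOrDefault(service, Collections.<String>emptyList()));
    }

    // Client side: simple round-robin load balancing over the snapshot.
    String select(String service) {
        List<String> addresses = discover(service);
        if (addresses.isEmpty()) {
            throw new IllegalStateException("no provider available for " + service);
        }
        return addresses.get(Math.floorMod(next.getAndIncrement(), addresses.size()));
    }
}
```

With two providers registered for the same service, successive `select` calls alternate between them. After one provider is unregistered, all calls go to the remaining one.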

Code implementation

Buddha is a lightweight distributed RPC framework that grew out of my learning about RPC and putting it into practice. The code is available on GitHub (https://github.com/tinylcy/buddha).

Source: http://tinylcy.me/2017/07/04/ how to implement a distributed/RPC framework
