I wrote this article about a year ago, and it has been sitting in a corner of my computer ever since. The last time I remembered it and wanted to share it, I couldn’t find it. This time I found it, and it felt like a pity not to share it.

This article covers the requirements, design, implementation, and thought process behind developing an RPC framework.

When we decided to develop an RPC framework, it was partly for fun and partly because we found that RPC was a good fit for our services. I went through all the RPC frameworks in the Go ecosystem at the time and found that they were either too heavy or had APIs that were awkward to write against. So I developed this RPC framework.

The early stage of developing the framework was quite difficult. Because of my limited technical level at the time, among other reasons, the whole framework was rewritten several times before the first version went into production.

Right now our entire back end builds its services on this RPC framework. Admittedly, the framework still has a lot of issues, and the next version is expected to be released early next month.

Origin

In distributed computing, Remote Procedure Call (RPC) is a computer communication protocol. It allows a program running on one computer to call a subroutine in another address space (usually a computer on an open network) as if it were a local call, without the programmer having to explicitly code the details of the remote interaction. RPC follows the client/server model. The classic implementation exchanges information by sending a request and receiving a response. (from Wikipedia)

In everyday usage, I prefer to think of RPC as “remote function calls.” It is a standard client/server model: service providers register their services, the framework parses those registrations, and clients then invoke the services in a convenient way.

Why do we use RPC?

This is a question that confuses many people. Why use RPC when we already have HTTP as a convenient way to call other services? Isn’t that over-engineering?

We start with two requirements:

  • In the development of WeChat public accounts, the various automatic reply functions and menu clicks need different processing logic, and these pieces of logic are usually unrelated to each other.

  • As the business grows, we need to show more and more things in the personal information interface. These things usually do not belong to a single module, so we need a convenient way to pull personal information from other modules.

If we put all of our auto-reply logic into one service, as most businesses do, coupling will inevitably increase. As the business grows, every time we add, remove, or modify an auto-reply feature, we have to change the code of the whole service. This is where a simple, low-coupling solution comes in: RPC.

Let’s start with a simple demo:

Providing services:

package main

import (
    "github.com/MashiroC/begonia-sdk"
)

type MathService struct{}

type HelloService struct{}

// Sum is exposed as the function "Sum" of the "Math" service.
func (s *MathService) Sum(a, b int) (res int) {
    return a + b
}

// Hello is exposed as the function "Hello" of the "Hello" service.
func (h *HelloService) Hello(name string) (res string) {
    return "Hello " + name
}

func main() {
    // connect to the service center
    cli := begonia.Default(":8080")

    // register both services under their names
    cli.Sign("Hello", &HelloService{})
    cli.Sign("Math", &MathService{})

    // keep the connection to the service center open
    cli.KeepConnect()
}

Call a remote function:

package main

import (
    "fmt"

    "github.com/MashiroC/begonia-sdk"
)

func main() {

    // get a begonia client
    cli := begonia.Default(":8080")

    // get a service
    helloService := cli.Service("Hello")

    // get a sync function
    hello := helloService.FunSync("Hello")

    // call it!
    res, err := hello("MashiroC")
    fmt.Println(res, err)
    // Hello MashiroC <nil>

    // get an async function
    helloAsync := helloService.FunAsync("Hello")

    // call it too! The callback receives the result and the error.
    helloAsync(func(res interface{}, err error) {
        fmt.Println(res, err)
    }, "MashiroC")
}

The code above may look unfamiliar, since it does not resemble the way most RPC frameworks are used. The framework used here is begonia-RPC, a lightweight RPC framework that I wrote myself over a period of time. I’m going to start with how to use it and work all the way up to how to write an RPC framework yourself.

Discussion

Now, we will talk about the implementation of an RPC framework from the perspective of the user, designer, and developer.

As a user

As a user, I focus on three things:

  • Whether the API is simple in design and easy to call

  • Whether the configuration is simple and the feature set is powerful

  • Whether it is efficient enough

The vast majority of developers are exactly that: users.

Of course, there will also be developers who want to read your code and implementation so that they can understand the techniques you use.

Based on my own experience as a user, I designed the set of APIs shown above.

begonia.Default(addr string) connects to the begonia-RPC service center.

We use cli.Service(name string) to get a service, and then service.FunSync(name string) to get a synchronous remote function. What we get back is a remote function that we can call directly. The call looks no different from calling a function in the local namespace, so it really is a “remote function.”

When we call hello(), we get back two values, an interface{} and an error. The first value is the result of the remote call, and the second is the error that may have occurred during the call. The remote function itself can return values in several shapes:

func Add(a, b int) (res int) {
    return a + b
}

The simplest return type: a single result, which becomes the first return value of the local call.

func Divide(a, b int) (res int, err error) {
    if b == 0 {
        // errors.New needs the standard "errors" package
        err = errors.New("divisor cannot be zero")
        return
    }
    res = a / b
    return
}

The most common return type: a result together with a possible error. The result and the error of the remote function become the two return values of the local call; if no error occurs during the call, the error is nil.

func Mod(a, b int) (res int, m int) {
    res = a / b
    m = a % b
    return
}

When a function has more than one return value besides the error, all return values except the trailing error are handed back to the caller as a single []interface{} value.

For the Go SDK, begonia-RPC can pass integers, floating-point numbers, strings, arrays, slices, structs, and maps.
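The three return shapes described above can be collapsed into a single (result, error) pair with reflection. The following is only a minimal sketch of that idea, not begonia-RPC's actual code: the helper name callAndNormalize is hypothetical, and a real framework would also have to serialize the values.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// errorType is the reflect.Type of the built-in error interface.
var errorType = reflect.TypeOf((*error)(nil)).Elem()

// callAndNormalize (hypothetical helper) calls fn with args via reflection
// and collapses its results into one (result, error) pair.
func callAndNormalize(fn interface{}, args ...interface{}) (interface{}, error) {
	in := make([]reflect.Value, len(args))
	for i, a := range args {
		in[i] = reflect.ValueOf(a)
	}
	out := reflect.ValueOf(fn).Call(in)

	// Split off a trailing result whose declared type is error.
	if n := len(out); n > 0 && out[n-1].Type() == errorType {
		if !out[n-1].IsNil() {
			return nil, out[n-1].Interface().(error)
		}
		out = out[:n-1]
	}

	switch len(out) {
	case 0:
		return nil, nil
	case 1:
		return out[0].Interface(), nil
	default:
		// Several non-error results are packed into a []interface{}.
		res := make([]interface{}, len(out))
		for i, o := range out {
			res[i] = o.Interface()
		}
		return res, nil
	}
}

func Add(a, b int) int { return a + b }

func Divide(a, b int) (int, error) {
	if b == 0 {
		return 0, errors.New("divisor cannot be zero")
	}
	return a / b, nil
}

func Mod(a, b int) (int, int) { return a / b, a % b }

func main() {
	fmt.Println(callAndNormalize(Add, 1, 2))
	fmt.Println(callAndNormalize(Divide, 1, 0))
	fmt.Println(callAndNormalize(Mod, 7, 2))
}
```

Comparing the last result's declared type with error, instead of inspecting the value, keeps the rule predictable: only a function whose signature ends in error gets the second return value filled in.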

If the return value of a remote function is a struct, the local deserialization converts it to a map[string]interface{}. To facilitate struct support, there is an API that binds the return value of a function to a struct:

Suppose we have a struct called Person and an already configured remote function called TestPerson() that returns an instance of Person.

    var per Person
    if err := begonia.Result(TestPerson()).Bind(&per); err != nil {
        log.Fatal("rpc call error!", err.Error())
    }

In fact, this is the second set of APIs I have designed. The first set was so ugly and clumsy that I won’t spend any time on it. I’m also considering renaming the commonly used FunSync function to Fun. After that, I would consider using Google’s ProtoBuf and building a code generation tool to make it easy to build an RPC client.
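One simple way to turn a deserialized map[string]interface{} back into a struct is to round-trip it through JSON. This is a sketch under that assumption only; the article does not say how Bind is implemented, and bindMap and the fields of Person are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Person is an illustrative struct; the field names are assumptions.
type Person struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// bindMap (hypothetical helper) copies a decoded map into the struct that
// out points to by encoding the map as JSON and decoding it again.
func bindMap(m map[string]interface{}, out interface{}) error {
	raw, err := json.Marshal(m)
	if err != nil {
		return err
	}
	return json.Unmarshal(raw, out)
}

func main() {
	// What a struct might look like after local deserialization.
	decoded := map[string]interface{}{"name": "MashiroC", "age": 20}

	var per Person
	if err := bindMap(decoded, &per); err != nil {
		fmt.Println("bind failed:", err)
		return
	}
	fmt.Printf("%+v\n", per)
}
```

The round trip costs an extra encode and decode, but it reuses the standard library's field matching instead of hand-written reflection.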

As a designer

Now, as the designer of an RPC framework, we design the overall functionality and architecture.

Most RPC frameworks today pair a producer with a consumer: a service listens on a port, and the caller needs to know that port (gRPC works this way), so every service is built around its own port listener. I was tempted to write my own RPC framework because gRPC didn’t offer what I wanted and ProtoBuf was a hassle to write.

Our design objectives are:

  • Provide unified monitoring and scheduling of services.

  • Producers and consumers are no longer opposing roles: each service can be both a consumer and a producer, and each producer process can provide multiple services.

  • Provide a concise API for calling services quickly and easily.

Next, I designed an architecture built around a single service center:



We use a centralized service center to monitor and schedule all services. This service center acts as the unified gateway and scheduler for registered services.

Apart from the service center, all registered service nodes are peers, with no fixed division into clients and servers. We call each service connected to the service center a service node. This includes the default Center Service provided by the RPC center, which is itself a peer service node.

Since the service center is the unified gateway for services, all service invocation requests are sent to the service center first. The service center receives the function call request, forwards it to the network address of the corresponding service, and waits for a response. When the service center receives the response, it forwards it back to the client process. This is a standard request/response model.
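The forwarding role of the service center can be pictured with a small in-memory sketch. Everything here is an assumption made for illustration: Center, Handler, Register, and Call are hypothetical names, and a Handler stands in for the network connection to a service node.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Handler stands in for the connection to one service node.
type Handler func(fun string, params ...interface{}) (interface{}, error)

// Center is a hypothetical in-memory service center.
type Center struct {
	mu    sync.RWMutex
	nodes map[string]Handler
}

func NewCenter() *Center {
	return &Center{nodes: make(map[string]Handler)}
}

// Register records which node provides a service name.
func (c *Center) Register(service string, h Handler) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.nodes[service]; ok {
		return errors.New("service already registered: " + service)
	}
	c.nodes[service] = h
	return nil
}

// Call looks up the node for a service, forwards the request to it,
// and hands the node's response back to the caller.
func (c *Center) Call(service, fun string, params ...interface{}) (interface{}, error) {
	c.mu.RLock()
	h, ok := c.nodes[service]
	c.mu.RUnlock()
	if !ok {
		return nil, errors.New("unknown service: " + service)
	}
	return h(fun, params...)
}

func main() {
	center := NewCenter()
	center.Register("Hello", func(fun string, params ...interface{}) (interface{}, error) {
		return fmt.Sprintf("Hello %v", params[0]), nil
	})

	fmt.Println(center.Call("Hello", "Hello", "MashiroC"))
	fmt.Println(center.Call("Math", "Sum", 1, 2))
}
```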

Our process for sending a remote function call request is as follows:



Looking at the model above, it is easy to conclude that HTTP would be the most convenient and effective protocol. However, we do not use HTTP; we communicate over TCP directly, for the following reasons:

  • TCP’s full-duplex connection fits the whole model better

  • TCP is more efficient

But using TCP presents two problems:

  • TCP is a typical asynchronous communication model, whereas our service model is a synchronous model.

  • TCP Socket programming is almost uncharted territory for me.

The second problem is actually easy to solve: buy a book and spend a few months writing code, no problem (just kidding).

When we use TCP, the request flow for a remote function call changes a bit:



In the figure above, the purple part is the basic TCP connection, and all the lines across the swimlane represent the data transfer.

The blue portion from client to server is the request flow of the remote function, and the green portion in reverse order is the response flow of the remote function.

We used TCP and decided to format packets in binary rather than text format, so we called each packet a frame, which is divided into request frames, response frames, and so on. We’ll talk about that in a second.

Although all service nodes in begonia-RPC are peer clients of the service center, when a function call occurs we refer to the caller as the client and to the called party as the server.

We can see from the figure that the client waits after sending the request frame until it gets its response or times out.

Having completed the basic design of an RPC framework, we now need to add more functionality to it.

In fact, I haven’t finished most of the following features yet, but I will definitely have them written by the time I blog about them (probably).

Authentication

At present, the RPC system we have designed runs completely unprotected on the network, which is a big problem. We should not allow just anyone to call or register services without authentication; that is like leaving your front door open. There are two candidate authentication schemes, and I have not yet decided which one to use:

  • Set one token for calling services and another for registering services. If a connection has not presented the matching token after it is established, it is treated as unavailable for the corresponding operation.

  • Configure a token on each client before the connection is established. All data transmitted after the connection is established is encrypted at the packet level. If the server fails to decrypt the data, the connection is treated as unavailable.

Tags

Back to the original requirement: we wanted to decouple the various modules in the business and make it easy to pull personal information from each module. However, there is a problem: the RPC design above does not allow several services to register under the same name. Here we can use tags: tag a function of a service, then pull all functions carrying that tag from the Center Service and call them one by one. Of course, functions sharing a tag should follow an agreed convention for the data they return.
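The first scheme could look roughly like the sketch below, which keeps a set of permissions per connection and checks it before each operation. All names here (Authenticator, Conn, Handshake, Require) are hypothetical, and the constant-time comparison is an addition of this sketch rather than something the design above specifies.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
)

// Permission is a bit set of what a connection may do.
type Permission int

const (
	CanCall Permission = 1 << iota
	CanRegister
)

var ErrUnauthorized = errors.New("connection not authorized for this operation")

// Authenticator holds the two tokens from the first scheme.
type Authenticator struct {
	callToken     string
	registerToken string
}

// Conn is the per-connection state kept by the service center.
type Conn struct {
	perms Permission
}

// tokenEqual compares tokens in constant time to avoid timing leaks.
func tokenEqual(a, b string) bool {
	return subtle.ConstantTimeCompare([]byte(a), []byte(b)) == 1
}

// Handshake grants permissions for every token the client presents correctly.
func (a Authenticator) Handshake(c *Conn, token string) {
	if tokenEqual(token, a.callToken) {
		c.perms |= CanCall
	}
	if tokenEqual(token, a.registerToken) {
		c.perms |= CanRegister
	}
}

// Require is checked before handling a call or a registration.
func (c *Conn) Require(p Permission) error {
	if c.perms&p == 0 {
		return ErrUnauthorized
	}
	return nil
}

func main() {
	auth := Authenticator{callToken: "call-secret", registerToken: "register-secret"}

	conn := &Conn{}
	auth.Handshake(conn, "call-secret")

	fmt.Println("call:", conn.Require(CanCall))
	fmt.Println("register:", conn.Require(CanRegister))
}
```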
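The tag idea can be sketched as a registry that maps a tag to every function carrying it and then calls them one by one. This is only an illustration: TagRegistry, RemoteFunc, and the "profile" tag are made-up names, and a RemoteFunc stands in for a real remote function handle.

```go
package main

import (
	"fmt"
	"sync"
)

// RemoteFunc stands in for a handle to a remote function.
type RemoteFunc func(params ...interface{}) (interface{}, error)

type taggedFunc struct {
	service string
	fn      RemoteFunc
}

// Result is what one tagged function returned.
type Result struct {
	Service string
	Value   interface{}
	Err     error
}

// TagRegistry maps a tag to all functions that carry it.
type TagRegistry struct {
	mu    sync.RWMutex
	byTag map[string][]taggedFunc
}

func NewTagRegistry() *TagRegistry {
	return &TagRegistry{byTag: make(map[string][]taggedFunc)}
}

// Tag attaches a tag to a function of a service.
func (r *TagRegistry) Tag(tag, service string, fn RemoteFunc) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.byTag[tag] = append(r.byTag[tag], taggedFunc{service: service, fn: fn})
}

// CallAll calls every function with the tag, in registration order.
func (r *TagRegistry) CallAll(tag string, params ...interface{}) []Result {
	r.mu.RLock()
	funcs := append([]taggedFunc(nil), r.byTag[tag]...)
	r.mu.RUnlock()

	results := make([]Result, 0, len(funcs))
	for _, f := range funcs {
		v, err := f.fn(params...)
		results = append(results, Result{Service: f.service, Value: v, Err: err})
	}
	return results
}

func main() {
	reg := NewTagRegistry()
	// Both modules follow the same convention: return a map of profile fields.
	reg.Tag("profile", "Order", func(params ...interface{}) (interface{}, error) {
		return map[string]interface{}{"orders": 3}, nil
	})
	reg.Tag("profile", "Points", func(params ...interface{}) (interface{}, error) {
		return map[string]interface{}{"points": 120}, nil
	})

	for _, res := range reg.CallAll("profile", "user-1") {
		fmt.Println(res.Service, res.Value, res.Err)
	}
}
```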

Service monitoring

This is a feature that most frameworks provide and I won’t go into detail here.

Service governance

Same as above, no further elaboration.

Conclusion

We now have a basic RPC framework design. Of course, for now it is only a design, and a design still has to be implemented in code. The next few articles will explain how the design is implemented, from several different angles (and I really did implement it). There is a lot more to discuss from a developer’s perspective: completing features, performance optimization, error handling, disaster recovery, and more.