RustCon Asia | Distributed Actor System in Rust

About the author: Zimon Dai is a Rust development engineer at Aliyun City Brain.

This article is based on Zimon’s talk at the first RustCon Asia conference.

Hello, everyone. Today I want to share the Distributed Actor System our team is working on. First I want to say what this talk is not about, because the title may give some people the wrong idea.

First, we won't go into detail about how a complete actor system is implemented. Actor systems follow well-established designs, and mature libraries such as Java's Akka and Rust's Actix already exist, so there is not much point in covering that ground here. Second, we won't compare ourselves with, or compete against, other popular Rust actor systems. One reason many people take up Rust is probably that Rust servers rank near the top of the TechEmpower benchmarks, like Actix, written by a Microsoft engineer. We think Actix does really well, and we don't need to build our own replacement for it. Third, we won't talk about specific features, because the library is not open source yet, although we plan to open it up this year.

This talk focuses on the following topics (see Figure 2), which are common problems you run into when building an actor system or anything designed along similar lines.

I'll start with a brief introduction to compilation-stable TypeIds and proc macros, and then cover a Rust feature called specialization, which is not yet stable. Finally, I'll show how to build a tick-based actor system. If you work in game development or have a front-end background, the concept of a tick will be familiar. A game, for example, has a frame rate: at 60 frames per second each frame lasts about 16 milliseconds, and each frame is a tick. Likewise, a front-end interval timer fires at a fixed period, say every 5 milliseconds, and each firing is a tick.

1. The TypeId Problem

Let's start with TypeId. In Figure 3 we have two actors, which may live on different nodes of a distributed system and so have to communicate over the network. A very simple approach comes to mind: Actor A sends a message through Broker A on its own machine. The message travels over the network to Broker B, which turns the buffer back into a message and delivers it to the target, Actor B. This is ordinary network communication.

But there's a problem. When we send something over the network, we serialize it into a buffer that carries no type information, a plain byte vector (Vec<u8>). The message itself is typed (Rust is a strongly typed language, and every value in Rust has a type). How do we erase this type information and then restore the type when the buffer reaches the target actor? That's the problem we're going to solve today with TypeId.

1.1 Common Solutions

A common solution to this problem is to add the type description of each message to the header of the message. Here is some pseudocode I wrote:

The most important field is the first one, type_uid, which identifies the type of the payload carried by this message. If we give each message type in the actor system a unique TypeId, we can tell what the payload of a message is from its TypeId. The second field is the receiver, the address of the target actor. The third is a buffer holding the serialized payload.
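The pseudocode figure is not reproduced in this text, so here is a minimal sketch of such a header. The field types (a u64 id, a u64 receiver address) and the little-endian framing in to_bytes and from_bytes are assumptions made for illustration, not the talk's actual wire format.

```rust
/// Wire-level envelope: what a receiving broker needs to route and decode.
/// Field types are assumptions made for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    /// Unique id of the payload's Rust type.
    type_uid: u64,
    /// Address of the target actor (hypothetical: a plain integer).
    receiver: u64,
    /// Serialized payload bytes.
    payload: Vec<u8>,
}

impl Message {
    /// Frame layout: 8 bytes type_uid | 8 bytes receiver | payload,
    /// with both integers in little-endian order.
    fn to_bytes(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(16 + self.payload.len());
        buf.extend_from_slice(&self.type_uid.to_le_bytes());
        buf.extend_from_slice(&self.receiver.to_le_bytes());
        buf.extend_from_slice(&self.payload);
        buf
    }

    /// Parse a frame produced by `to_bytes`; `None` if the header is truncated.
    fn from_bytes(buf: &[u8]) -> Option<Message> {
        if buf.len() < 16 {
            return None;
        }
        let mut id = [0u8; 8];
        let mut to = [0u8; 8];
        id.copy_from_slice(&buf[0..8]);
        to.copy_from_slice(&buf[8..16]);
        Some(Message {
            type_uid: u64::from_le_bytes(id),
            receiver: u64::from_le_bytes(to),
            payload: buf[16..].to_vec(),
        })
    }
}
```

Note that the payload stays opaque at this level: the broker can route on receiver without knowing the payload's type, and only the target actor needs type_uid to decode it.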

Now let's focus on a smaller, more specific problem: how do we give each message type a unique TypeId? Rust happens to have something that does this: std::any::Any (Figure 6).

Every 'static type in Rust implements the Any trait, whose core method is get_type_id (stabilized just last week, under the name type_id). Calling this method on a value gives you a unique TypeId that wraps a 64-bit integer.
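As a small illustration of the standard-library mechanism, the sketch below compares the id of a type-erased value against the id of a known type. PayloadA, PayloadB and is_type are names invented for this example.

```rust
use std::any::{Any, TypeId};

// Two hypothetical message types.
struct PayloadA;
struct PayloadB;

/// Returns true if the type-erased value is really a `T`.
fn is_type<T: Any>(value: &dyn Any) -> bool {
    // `type_id` dispatches dynamically to the concrete type behind `dyn Any`.
    value.type_id() == TypeId::of::<T>()
}
```

Within a single compiled binary these ids are unique per type, which is all that is_type relies on. Whether they stay the same across separate builds is a different question.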

Now that we have TypeId, what requirements should it meet? Here are the most important ones:

First, a TypeId must be consistent across all nodes. Suppose a message type has TypeId 1 on one node, but on another node the integer 1 stands for a different message type. Decoding the message as that other type will produce errors. So we want TypeIds to be stable throughout the network. This means we cannot use the TypeId provided by std. Unfortunately, std's TypeId is tied to the compilation: it is not guaranteed to stay the same from one build to the next, so if the software deployed across the network comes from two different Rust compilations, the TypeIds may not match.

This leads to a problem: updating even one small component could require recompiling every node in the network, which is excessive. So we solve this problem by using a proc macro to obtain a stable TypeId.

1.2 Proc Macro

In fact, this has been a long-standing question in the community, going back to around 2015, especially among game programmers, because identifying things in a game requires a fixed TypeId.

How do we solve it? The approach is simple, even crude: if we know the name of every message type, we can assign each name a fixed integer id and store the name-to-id pairs in a file. Every compilation reads this file, so the generated code always contains the same integer for the same type, and the TypeId stays fixed.

How do we read a file at compile time? For this we use a proc macro. First, we define a TypeId of our own (Figure 9):

The UniqueTypeId trait has only one method, which returns a type_uid; it is the counterpart of std's Any. The TypeId struct has a single field, an integer t; it is the counterpart of std's TypeId.

The top half of Figure 10 shows a message called StartTaskRequest, the kind of message we actually use, with a custom derive attached to it. The bottom half of Figure 10 is the proc macro behind that derive. You can see that it implements the UniqueTypeId trait using quote. The TypeId returned by the generated type_uid method is hard-coded: the value of t is #id, a variable that the custom derive reads from the file while it runs.
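Figures 9 and 10 are not reproduced here. The sketch below shows what the trait, the struct, and the code generated by the derive might look like once expanded. The u64 field type, the static method signature, and the id value 1 are assumptions. The impl is written by hand to stand in for the output of the custom derive.

```rust
/// Our own stable type id: a plain integer, analogous to `std::any::TypeId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TypeId {
    t: u64,
}

/// Counterpart of `std::any::Any`, but the id is fixed in the source code.
trait UniqueTypeId {
    fn type_uid() -> TypeId;
}

/// A message type. In the system described in the talk it would carry a
/// custom derive instead of the hand-written impl below.
struct StartTaskRequest;

// Hand-written stand-in for the derive's expansion: the id (1 is a made-up
// value) is what `#id` would be replaced with inside `quote!`, so it ends up
// hard-coded in the generated code.
impl UniqueTypeId for StartTaskRequest {
    fn type_uid() -> TypeId {
        TypeId { t: 1 }
    }
}
```

Because the integer is baked into the source of the impl, two nodes built by different compiler runs still agree on the id, as long as they were built against the same id file.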

This way the generated code is always the same, with the type's integer written into it on every build. Many custom derives exist only to cut down on boilerplate, but fixing the TypeId is something you really can't do without a proc macro and a custom derive.

The ids themselves live in a fixed local file, such as a .toml file (bottom right corner of Figure 10), which lists a fixed TypeId for each message type.
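The lookup the derive performs against that file can be sketched as follows. This is not a TOML parser: it only handles flat "Name = integer" lines and # comments, and the registry contents, the function name lookup_type_id, and the ids are all made up. In the real setup the equivalent logic would run inside the proc macro at compile time, presumably with a proper TOML library.

```rust
/// Look up the fixed id for `type_name` in a registry of `Name = id` lines.
/// Only flat `key = integer` entries and `#` comments are understood.
fn lookup_type_id(registry: &str, type_name: &str) -> Option<u64> {
    for line in registry.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            if key.trim() == type_name {
                // A malformed id yields `None` rather than a panic.
                return value.trim().parse().ok();
            }
        }
    }
    None
}

/// Hypothetical registry contents; names and ids are invented.
const REGISTRY: &str = "\
# message type = fixed id
StartTaskRequest = 1
StopTaskRequest = 2
";
```

A derive built this way would most likely want to fail the build when a type is missing from the file, so that no message type ever ships without a fixed id.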

Once you have a fixed TypeId, you can use it to erase types in Rust. Serialization can be done with serde or Protocol Buffers: the sender serializes a typed value into a buffer tagged with its TypeId, and the receiver uses the TypeId to deserialize the buffer back into the specific type.

The TypeId in the buffer's header tells the receiver which type the buffer holds. Overall this feels a lot like reflection in Java, which determines the concrete type dynamically. You might write code that checks the message's TypeId by hand (see Figure 12): first whether it is the TypeId of PayloadA, and if not, whether it is the TypeId of PayloadB, and so on down the list. But that is a lot of code to write, and you have to match every type. How do we solve this? Again, with a proc macro.
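To make the cost of the hand-written version concrete, here is a sketch of such a chain for two message types. The ids, the payload layouts, and the Decoded enum are assumptions for illustration, and the byte-level decoding stands in for a real codec.

```rust
// Made-up fixed ids for the two message types.
const PAYLOAD_A_ID: u64 = 1;
const PAYLOAD_B_ID: u64 = 2;

#[derive(Debug, PartialEq)]
struct PayloadA(u8);
#[derive(Debug, PartialEq)]
struct PayloadB(u16);

/// Result of restoring the type from an untyped buffer.
#[derive(Debug, PartialEq)]
enum Decoded {
    A(PayloadA),
    B(PayloadB),
}

/// Hand-written dispatch: one branch per registered message type.
fn decode(type_uid: u64, buf: &[u8]) -> Option<Decoded> {
    if type_uid == PAYLOAD_A_ID {
        buf.first().map(|b| Decoded::A(PayloadA(*b)))
    } else if type_uid == PAYLOAD_B_ID {
        if buf.len() < 2 {
            return None; // truncated payload
        }
        Some(Decoded::B(PayloadB(u16::from_le_bytes([buf[0], buf[1]]))))
    } else {
        None // unknown type id
    }
}
```

Every new message type adds another branch here and another enum variant, which is exactly the repetition a macro can generate.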

In Figure 13, we define handle_message inside the actor, which is actually a macro. This macro expands into the chain of if/else checks, one for each message type you registered when writing the actor.

We end up with a very simple actor architecture (Figure 14). Take SampleActor as an example. First, you put the custom derive Actor on it, which implements the Actor trait for you. #[Message(PayloadA, PayloadB)] declares that SampleActor receives PayloadA and PayloadB. When implementing the Actor trait, the custom derive writes out the whole if/else type matching, so you only need to implement a Handler for each message type to define how it is processed. The resulting program structure is very clear.

In general, proc macros give us a very clean, self-explanatory actor design that completely separates the declaration of an actor from its actual message processing. Most importantly, we can hide all the unsafe type casting behind the macros and give users a safe interface. The runtime cost is also very low, because the dispatch is just integer comparison.
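The shape of this design can be sketched without a proc macro crate by using a declarative macro as a stand-in. Everything here is an assumption made for illustration: the impl_actor macro, the Decode trait (a placeholder for a real codec), the trait signatures, and the ids. The talk's version uses a custom derive plus the #[Message(..)] attribute instead.

```rust
/// Stable, hand-assigned type id (the values used below are made up).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TypeId {
    t: u64,
}

trait UniqueTypeId {
    fn type_uid() -> TypeId;
}

/// Stand-in for a real codec such as serde + bincode.
trait Decode: Sized {
    fn decode(buf: &[u8]) -> Option<Self>;
}

/// User-written logic: one impl per message type the actor accepts.
trait Handler<M> {
    fn handle(&mut self, msg: M);
}

/// Entry point a broker would call with an untyped buffer.
trait Actor {
    /// Returns false if the id is not registered or decoding fails.
    fn handle_message(&mut self, type_uid: TypeId, buf: &[u8]) -> bool;
}

/// Declarative stand-in for the custom derive: one `if` per message type.
macro_rules! impl_actor {
    ($actor:ty, $($msg:ty),+ $(,)?) => {
        impl Actor for $actor {
            fn handle_message(&mut self, type_uid: TypeId, buf: &[u8]) -> bool {
                $(
                    if type_uid == <$msg as UniqueTypeId>::type_uid() {
                        return match <$msg as Decode>::decode(buf) {
                            Some(msg) => {
                                <Self as Handler<$msg>>::handle(self, msg);
                                true
                            }
                            None => false,
                        };
                    }
                )+
                false
            }
        }
    };
}

struct PayloadA(u8);
struct PayloadB(u8);

impl UniqueTypeId for PayloadA {
    fn type_uid() -> TypeId { TypeId { t: 1 } }
}
impl UniqueTypeId for PayloadB {
    fn type_uid() -> TypeId { TypeId { t: 2 } }
}
impl Decode for PayloadA {
    fn decode(buf: &[u8]) -> Option<Self> { buf.first().map(|b| PayloadA(*b)) }
}
impl Decode for PayloadB {
    fn decode(buf: &[u8]) -> Option<Self> { buf.first().map(|b| PayloadB(*b)) }
}

#[derive(Default)]
struct SampleActor {
    log: Vec<String>,
}

// Plays the role of the Actor derive plus #[Message(PayloadA, PayloadB)].
impl_actor!(SampleActor, PayloadA, PayloadB);

impl Handler<PayloadA> for SampleActor {
    fn handle(&mut self, msg: PayloadA) { self.log.push(format!("A:{}", msg.0)); }
}
impl Handler<PayloadB> for SampleActor {
    fn handle(&mut self, msg: PayloadB) { self.log.push(format!("B:{}", msg.0)); }
}
```

The user-facing part is only the last few items: the actor struct, the list of accepted messages, and one Handler impl per message. If a Handler impl is missing for a listed message, the expansion fails to compile, so the declaration and the handlers cannot drift apart.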

2. Specialization

My second topic is specialization. This is a Rust feature that has not yet reached stable and that many people may not know about. It is an important feature for the trait system.

Figure 16 shows the problem. A message may have several encoding schemes: bincode, which is popular with serde (it encodes a struct into a buffer), and Protocol Buffers, which many people use. If messages can arrive in different encodings, how do you decode them all through the same API?

RFC 1210, called Specialization, provides two main features: first, it allows trait implementations to overlap, with a more specific implementation overriding a more general one, and second, it allows a trait implementation to be marked as a default that can be overridden.

For example, we first define a Payload trait (see Figure 18). A Payload must support serde serialization and deserialization. Its methods are the usual ones, serialize and deserialize. Most importantly, by default, if a message only supports serde encoding and decoding, we use bincode.

We can then write the implementations (Figure 19). We start with a default implementation, which is used for any struct that implements these traits. If a struct implements an additional trait, the more specific implementation for that trait is used instead. By adding more specific bounds in this way, you can keep supporting more codecs.

Specialization is currently available only on nightly; to use it, you just need to enable #![feature(specialization)].
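Because specialization needs a nightly compiler, the sketch below is a stable-Rust approximation and not specialization itself. It shows the same API shape, one encode call regardless of codec, using a trait default method that individual types override by hand. With the nightly feature, the per-type impls would be replaced by two blanket impls, the general one marked with default fn, and the compiler would pick the more specific one from the trait bounds. BincodeLike, ProtoLike, Ping, Pong, and the tag bytes are invented stand-ins for the real codecs.

```rust
/// Stand-ins for the two codecs; the bytes are a fake tag plus the value.
trait BincodeLike {
    fn to_bincode(&self) -> Vec<u8>;
}
trait ProtoLike {
    fn to_proto(&self) -> Vec<u8>;
}

/// One API for callers, whatever the wire format.
trait Payload: BincodeLike {
    /// Default codec: used unless a type overrides this method.
    fn encode(&self) -> Vec<u8> {
        self.to_bincode()
    }
}

struct Ping(u8); // supports only the default codec
struct Pong(u8); // additionally supports the second codec

impl BincodeLike for Ping {
    fn to_bincode(&self) -> Vec<u8> { vec![0xB1, self.0] }
}
impl BincodeLike for Pong {
    fn to_bincode(&self) -> Vec<u8> { vec![0xB1, self.0] }
}
impl ProtoLike for Pong {
    fn to_proto(&self) -> Vec<u8> { vec![0xC0, self.0] }
}

// Falls back to the default method.
impl Payload for Ping {}

// Manual override; under specialization this choice would follow
// automatically from `Pong: ProtoLike`.
impl Payload for Pong {
    fn encode(&self) -> Vec<u8> { self.to_proto() }
}
```

The difference matters at scale: here every type has to opt in to the override explicitly, whereas specialization makes the selection a consequence of which traits the type implements.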

3. Tick-based actor system

Let's look at the tick-based actor system, that is, how we implement ticks on top of a Tokio-based actor system. We all know that Tokio is an asynchronous framework, but we want to make it tick-based.

What are the benefits of ticks? First of all, ticks show up in many areas, including game design, dataflow, stream computation, and JavaScript APIs, all of which have something of a tick feel to them. If the whole logic is tick-based, the logic and the waiting mechanism become much simpler, and event hooks are easy to add.

It's actually quite simple. We design a new struct, such as WaitForOnce in Figure 21, which declares a deadline, meaning the number of ticks within which the message must be received, and carries the signature of the message it is waiting for. Since we use Tokio for network IO, we can create a stream and advance the tick by 1 each time the stream yields. We only need to maintain a concurrent SkipMap and register the waits for every tick in it. When a tick is reached and all the waits registered for it have been satisfied, the future can be released and resolved.

In addition, ticks let us do some things that are not part of the usual actor system spec.

For example, as shown in Figure 22, the first point is that actor systems rarely allow one actor to wait on another, but a tick-based architecture does. Setting the deadline to 1, for instance, means the message must be received before the next tick executes. In effect, you get a setup where actors depend on each other's messages. Second, we can do prefetching. Say we want to fetch some resources ahead of time and store them, without using them immediately, so that they are quickly available when we actually need them. We can require, for example, that something be fetched within 1000 ticks; such a fetch has a relatively generous time tolerance.
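The bookkeeping behind this can be sketched in a single-threaded form. The sketch below uses a BTreeMap from the standard library in place of the concurrent SkipMap and leaves out Tokio entirely. The field types, the TickWaits struct, and its methods are assumptions for illustration. In the described system, tick would be driven by the Tokio stream and an empty result would be the point where the waiting future resolves.

```rust
use std::collections::BTreeMap;

/// A pending wait: the message with `signature` must arrive within
/// `deadline` ticks of registration. Field types are assumptions.
struct WaitForOnce {
    deadline: u64,
    signature: u64,
}

/// Single-threaded stand-in for the concurrent skip map: waits are keyed
/// by the absolute tick at which they expire.
#[derive(Default)]
struct TickWaits {
    current_tick: u64,
    waits: BTreeMap<u64, Vec<u64>>, // expiry tick -> pending signatures
}

impl TickWaits {
    /// Register a wait relative to the current tick.
    fn register(&mut self, wait: WaitForOnce) {
        let expiry = self.current_tick + wait.deadline;
        self.waits.entry(expiry).or_default().push(wait.signature);
    }

    /// Mark a message as received; returns true if a wait was pending for it.
    fn deliver(&mut self, signature: u64) -> bool {
        for pending in self.waits.values_mut() {
            if let Some(pos) = pending.iter().position(|s| *s == signature) {
                pending.swap_remove(pos);
                return true;
            }
        }
        false
    }

    /// Advance one tick and return the signatures whose deadline passed
    /// without the message arriving. An empty result means every wait due
    /// at this tick was satisfied.
    fn tick(&mut self) -> Vec<u64> {
        self.current_tick += 1;
        let next = self.current_tick + 1;
        // `split_off` returns the entries with key >= `next` (still waiting)
        // and leaves the due entries behind in `self.waits`.
        let still_waiting = self.waits.split_off(&next);
        let due = std::mem::replace(&mut self.waits, still_waiting);
        due.into_values().flatten().collect()
    }
}
```

Keying the map by expiry tick keeps the per-tick work proportional to the waits that are actually due, which is presumably also why an ordered map such as a skip map is a natural fit.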

4. To summarize

To summarize the features of our Distributed Actor System: it is tick-based, and it supports many different codecs through specialization. It also achieves a reflection-like effect through TypeId. Finally, we plan to open source this actor system sometime in 2019. In fact, many of our systems and online businesses are built on Rust, and we will gradually make them public. Starting this year, we hope to interact more with the community and have more to share with everyone.

On April 23, 2019, the first RustCon Asia, hosted by Cryptape and PingCAP, concluded successfully in Beijing. More than 300 Rust enthusiasts from China, the United States, Canada, Germany, Russia, India, Australia, and other countries and regions attended. As Asia's first large-scale gathering of Rust users, the conference brought together more than 20 leading Rust developers from China and abroad as speakers, with one and a half days of fast-paced talks and two days of hands-on workshops. The content covered Rust's cross-industry and cross-domain application in distributed data storage, security, search engines, embedded IoT, image processing, and more.

Conference Talk video collection

www.youtube.com/playlist?li…
