Apologies for the long gap between posts; I have been busy with things around graduation. This first article since coming back is a translation of "What problems does the actor model solve?" It explains briefly and clearly the problems currently faced in concurrent programming and why the Actor model can solve them. The translation is based on my own understanding, so please point out any shortcomings.

What problems does the Actor model solve?

Akka uses the Actor model to overcome the limitations of traditional object-oriented programming models and to address the challenges posed by highly concurrent distributed systems. To understand the Actor model well, we first need to recognize where traditional programming approaches fall short in concurrent and distributed computing.

Disadvantages of encapsulation

Object-oriented programming (OOP) is a widely used and familiar programming model. One of its core concepts is encapsulation: the internal data of an object cannot be accessed directly from outside and can only be modified through a curated set of methods, such as the familiar getX and setX methods of a JavaBean. The object is responsible for exposing safe operations on its encapsulated data.

For example, an ordered binary tree must preserve the ordering rule that governs how data is distributed across its nodes. If you query the tree for data, you rely on that constraint holding.

When analyzing the behavior of object-oriented programming at run time, we might draw a message sequence diagram to show the interaction during method calls, as shown below:

seq chart

However, the diagram above does not accurately represent the lifeline of the instance during execution. In fact, one thread performs all of these calls, and the operation on the variable is performed on the same thread that called the method. Adding threads of execution to the sequence diagram looks something like this:

seq chart thread

However, once multiple threads are involved, the previous diagram becomes confusing and unclear. Now let’s simulate multiple threads accessing the same instance:

seq chart multi thread

In this case, two threads enter the same method at the same time. Unfortunately, the object’s encapsulation model guarantees nothing about what happens to the encapsulated data in that section: the instructions of the two calls can interleave in arbitrary ways, so without some coordination between the threads there is no way to guarantee the consistency of shared variables. Now imagine how much worse this becomes with more threads.
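The interleaving problem can be reproduced in a few lines of plain Java. This is only a sketch: `UnsafeCounter` and `RaceDemo` are hypothetical names invented for this illustration, and how many updates are lost depends on the machine and the run.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical class for illustration: the field is private, yet increment() is not thread-safe.
class UnsafeCounter {
    private int count = 0;

    void increment() {
        count++; // read, add, write: three steps that two threads can interleave
    }

    int get() {
        return count;
    }
}

class RaceDemo {
    // Starts `threads` threads that each call increment() `perThread` times.
    static int run(int threads, int perThread) {
        UnsafeCounter counter = new UnsafeCounter();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads together to increase contention
                } catch (InterruptedException e) {
                    return;
                }
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter.get(); // may be lower than threads * perThread
    }
}
```

Calling `RaceDemo.run(4, 100000)` will often return less than 400000, because increments from different threads overwrite each other even though `count` is never touched from outside the class.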

The most common way to solve this problem is to place a lock around the method. Locking ensures that only one thread can enter the method at a time, but it is a very expensive strategy:

  • Locking severely limits concurrency and is costly on current CPU architectures, requiring the operating system to pause and restart threads.

  • The caller’s thread is blocked, so it cannot do any other meaningful work. This is unacceptable in a desktop application, where we want the UI to stay responsive while a long job runs in the background. On the backend, blocking a thread is simply wasteful. One could argue that this can be compensated for by starting new threads, but threads are also a very expensive resource.

  • Using locks leads to a new problem: deadlocks.
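The locking approach described above can be sketched with Java's built-in `synchronized` keyword. The class names `LockedCounter` and `LockDemo` are hypothetical. The result is now always correct, but every caller has to wait its turn on the object's monitor.

```java
// Hypothetical counter guarded by its own monitor lock.
class LockedCounter {
    private int count = 0;

    synchronized void increment() {
        count++; // only one thread at a time can be inside a synchronized method
    }

    synchronized int get() {
        return count;
    }
}

class LockDemo {
    // Starts `threads` threads that each call increment() `perThread` times.
    static int run(int threads, int perThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment(); // blocks while another thread holds the lock
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

The price of correctness here is that the threads no longer run the method in parallel: they queue up behind one another, which is exactly the cost listed in the points above.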

These realities leave us with two unattractive options:

  • Do not use locks, and risk corrupted state.

  • Use many locks, and accept degraded performance and a high risk of deadlocks.

In addition, locks only really work well locally. When an application is deployed across several machines, the only alternative is a distributed lock. Unfortunately, a distributed lock can be several orders of magnitude less efficient than a local lock, and it severely restricts how far the system can scale out afterwards: distributed lock protocols require multiple machines to communicate with each other over the network, so latency can become very high.

In object-oriented languages, we rarely think about threads or their execution paths. We usually picture the system as a network of object instances that react to method calls, modify their internal state, and then call methods on each other, driving the whole program forward:

object graph

However, in a multithreaded distributed environment, what actually happens is that threads traverse this network of object instances by following method calls. It is therefore the threads that really drive execution:

object graph snakes

Conclusion:

  • Objects can only guarantee that their data is correctly encapsulated under single-threaded access; in a multi-threaded environment, internal state can become corrupted. Two threads competing in the same code segment can leave variables inconsistent.

  • Using locks may seem like a good way to ensure the correctness of encapsulated data in a multithreaded environment, but it is actually inefficient and can easily lead to deadlocks when the program is actually running.

  • Locks may work fine on a single machine, but they perform poorly and scale poorly in distributed environments.

Sharing has drawbacks in modern computer architecture

In the programming models of the 1980s and 1990s, writing to a variable meant writing directly to a memory location. On modern computer architectures this has changed: a write goes to a cache rather than directly to memory. Most of these caches are local to a CPU core, so a write made by one core is not visible to the others. For a change in a local cache to become visible to another CPU core, and hence to another thread, the caches must exchange data.

On the JVM, we must explicitly mark the memory locations that are shared across threads, using the volatile keyword or the Atomic wrapper classes. Otherwise, we can only access them safely inside a locked section. So why don’t we make all variables volatile? Because exchanging information between caches is an expensive operation: it implicitly stalls the CPU cores involved so they cannot do other work, and it creates a bottleneck in the cache coherence protocol that CPUs use to transfer cache lines between main memory and other CPUs.
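As a small sketch of what this marking looks like in Java, the hypothetical `SharedState` class below uses a `volatile` flag for visibility and an `AtomicLong` for a counter updated from several threads. The class and method names are invented for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder of state that is deliberately shared across threads.
class SharedState {
    private volatile boolean done = false;            // writes become visible to other threads
    private final AtomicLong hits = new AtomicLong(); // atomic read-modify-write

    void record() { hits.incrementAndGet(); }
    void finish() { done = true; }
    boolean isDone() { return done; }
    long hits() { return hits.get(); }
}

class VisibilityDemo {
    static long run(int threads, int perThread) {
        SharedState state = new SharedState();
        // The watcher only terminates if it observes the write to `done`.
        Thread watcher = new Thread(() -> {
            while (!state.isDone()) {
                Thread.yield();
            }
        });
        watcher.start();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    state.record();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
            state.finish();   // volatile write
            watcher.join();   // returns once the watcher has seen the write
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return state.hits();
    }
}
```

Without `volatile`, the Java memory model would not guarantee that the watcher ever sees `done` change. Each `incrementAndGet` and each volatile access is also where the cache traffic described above is paid for.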

Even if developers are aware of these issues, working out which memory locations need to be volatile or wrapped in Atomic classes is difficult, and mistakes may only show up much later while the program is running.

Conclusion:

  • There is no true shared memory any more. CPU cores, like computers on a network, explicitly pass blocks of data (cache lines) to each other, so communication between CPUs has more in common with network communication than it seems. Message passing is now the norm, whether between CPU cores or between networked machines.

  • Rather than hiding this message passing behind variables marked as shared or behind Atomic data structures, a more principled approach is to keep state local to a concurrent entity and to pass events and data between concurrent entities explicitly as messages.

Disadvantages of the call stack

Today we take the call stack for granted as the way tasks are executed, but it was invented in an era when concurrency was not so important because multi-core CPU systems were not common. A call stack cannot cross threads, so it cannot express asynchronous call chains.

The problem arises when a thread wants to delegate a task to the "background". In practice, this means handing the task to another thread. It cannot be a simple method call, because a method call is executed locally on the calling thread. What usually happens is that the caller thread puts the task into a memory location shared with a worker thread, and the worker thread picks it up and executes it. This way the caller thread does not have to block and can move on to other work.

But this raises a few questions. The first is: how does the caller get notified that the task has completed? A more important question is: when the task fails with an exception, who handles it? The exception is handled by the worker thread executing the task, regardless of which caller submitted it:

exception pro

This is a serious problem. How should the worker thread handle this situation? It usually cannot fix the problem itself, because it does not know the real purpose of the task it was executing. The caller should be told what happened, but there is no call stack to unwind with an exception, so no built-in structure solves this. If no explicit notification is in place, the caller thread never receives any error message and the task is simply lost. This is much like a network, where a request can fail or a message can be lost without any notification.
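A minimal Java sketch of this situation is shown below. `LostFailureDemo` and the thread name `worker-1` are made up for the illustration. The task throws on the worker thread, the failure reaches that thread's uncaught-exception handler, and the caller's own `try`/`catch` never sees it.

```java
import java.util.concurrent.atomic.AtomicReference;

class LostFailureDemo {
    // Returns "<thread that saw the failure>:<did the caller catch anything>".
    static String run() {
        AtomicReference<String> seenOn = new AtomicReference<>("nobody");
        Runnable task = () -> {
            throw new IllegalStateException("task failed");
        };

        Thread worker = new Thread(task, "worker-1");
        // The exception surfaces here, on the worker, not in the caller's stack.
        worker.setUncaughtExceptionHandler((thread, error) -> seenOn.set(thread.getName()));

        boolean callerSawException = false;
        try {
            worker.start();  // the "call" returns immediately
            worker.join();   // wait only so the demo can inspect the outcome
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } catch (RuntimeException e) {
            callerSawException = true; // never reached for the task's failure
        }
        return seenOn.get() + ":" + callerSawException;
    }
}
```

The only reason the caller learns anything here is the side channel (`seenOn`) that the demo sets up by hand; without it the failure would be invisible to the caller.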

In some cases the problem gets worse, when the worker thread hits an error it cannot recover from. For example, an internal exception caused by a bug can propagate to the top of the thread and shut it down. This raises the question: who should restart the thread, and how should its previous state be restored? On the surface this looks solvable, but there is a new complication: the task the worker thread was executing is no longer in the shared task queue. In fact, once the exception has propagated all the way up and unwound the call stack, the state of that task is lost entirely. So messages can be lost even when we communicate purely locally.

Conclusion:

  • To build any system with high concurrency and high performance, threads must delegate tasks efficiently to other threads so that they do not block. With this style of task-delegating concurrency, and even more so in distributed environments, explicit mechanisms for error handling and failure notification have to be introduced. Failure becomes part of the domain model.

  • Concurrent systems that delegate work must deal with service failures, which also means they need a way to restore the service after a failure. In practice, tasks may be lost while a service restarts. Even when nothing is lost, the caller’s response may be delayed by queueing, garbage collection, and so on. In a real environment, therefore, we need to set timeouts on request responses, just as in a networked or distributed system.
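The timeout in the last point can be sketched with the standard `Future.get(timeout, unit)` call. `TimeoutDemo` is a hypothetical name, and the latch merely stands in for whatever delays the worker (a long queue, a garbage collection pause, a restart).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutDemo {
    // Delegates a task and waits at most `timeoutMillis` for its response.
    static String ask(long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        CountDownLatch stuck = new CountDownLatch(1); // simulated delay, released only at the end
        try {
            Future<String> reply = worker.submit(() -> {
                stuck.await();
                return "done";
            });
            try {
                return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                return "timed out"; // the caller decides what a missing response means
            } catch (InterruptedException | ExecutionException e) {
                return "failed: " + e;
            }
        } finally {
            stuck.countDown(); // let the worker finish so the pool can shut down
            worker.shutdown();
        }
    }
}
```

Here `TimeoutDemo.ask(50)` gives up after 50 milliseconds instead of waiting forever, which is the same defensive stance a client takes towards a remote service.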

Why is the Actor model needed in highly concurrent, distributed systems?

To sum up, the usual programming model is not suitable for modern, highly concurrent distributed systems. Fortunately, we don’t have to throw away what we know: the Actor model overcomes these problems in a principled way, allowing us to build our systems on a better model.

We focus on the following aspects:

  • Use encapsulation, but not locks.

  • Model the program as cooperating entities that react to messages, change their state, and send messages to each other to drive the whole program forward.

  • Stop worrying about an execution mechanism that does not match the way we think about the program.

The Actor model can help us achieve these goals, as described below.

Use message passing to avoid locking and blocking

Instead of method calls, actors interact by sending messages. Sending a message does not transfer the sender’s thread of execution to the receiver. An Actor can send a message and continue without blocking, so it can get more work done in the same amount of time.

In object-oriented programming, a method releases control of the executing thread when it returns. In this respect, actors behave much like objects: they react to a message and hand back execution once the message has been processed. In this way, actors achieve the execution pattern we originally imagined for objects:

actor graph

But the biggest difference between message passing and method calls is that a message has no return value. By sending a message, an Actor delegates a task to another Actor. As we saw with the call stack, if the sender expected a return value, it would have to block or run on the same thread as the Actor performing the task. Instead, the receiving Actor delivers the result in a reply message.
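The "reply as a message" idea can be sketched without any actor library. In the hypothetical `ReplyDemo` below, each single-threaded executor stands in for one actor's mailbox and processing loop (an assumption of this sketch, not how Akka is implemented), and the request message carries a `replyTo` callback that enqueues the result on the sender's own executor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class ReplyDemo {
    // Hypothetical request message: the data plus where to send the answer.
    static final class Square {
        final int n;
        final Consumer<Integer> replyTo;

        Square(int n, Consumer<Integer> replyTo) {
            this.n = n;
            this.replyTo = replyTo;
        }
    }

    static int run(int n) {
        ExecutorService workerActor = Executors.newSingleThreadExecutor();
        ExecutorService senderActor = Executors.newSingleThreadExecutor();
        // Used only so the demo can observe the reply from the outside.
        CompletableFuture<Integer> observed = new CompletableFuture<>();
        try {
            // The reply is just another message, handled on the sender's own thread.
            Consumer<Integer> replyTo =
                result -> senderActor.execute(() -> observed.complete(result));

            Square request = new Square(n, replyTo);
            // "Sending" only enqueues the request; this call returns immediately.
            workerActor.execute(() -> request.replyTo.accept(request.n * request.n));

            return observed.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            workerActor.shutdown();
            senderActor.shutdown();
        }
    }
}
```

Neither side calls a method on the other and waits for a return value: the request and the reply are both one-way messages.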

The second key change is that encapsulation is preserved. Actors react to messages just as objects react to method calls, but the difference is that under multi-threading, actors guarantee that their internal state and variables will not be corrupted: an Actor executes independently of the Actors that send it messages, and the same Actor processes only one message at a time. Each Actor processes its received messages sequentially, while different actors in an Actor system process their own messages concurrently, making full use of multi-core CPUs. Because an Actor processes at most one message at a time, it needs no synchronization mechanism to keep its variables consistent, and therefore no locks:

serialized timeline invariants

To summarize, this is what happens when an Actor receives a message:

  1. The Actor adds the message to the end of its message queue.
  2. If the Actor is not already scheduled for execution, it is marked as ready to execute.
  3. An (invisible) scheduler picks up the Actor and starts executing it.
  4. The Actor takes a message from the head of the message queue for processing.
  5. The Actor modifies its state while processing and sends messages to other actors.
  6. Actor

To implement these behaviors, actors must have the following features:

  • Mailbox (as a message queue)
  • Behavior (the internal state of the Actor and its message-processing logic)
  • Message (data sent to an Actor, comparable to the arguments of a method call)
  • Execution environment (such as thread pool, scheduler, message distribution mechanism, etc.)
  • Address (location information, covered later)

Messages are added to the mailbox of the Actor, and the Actor’s behavior can be viewed as how the Actor responds to the message (such as sending more messages or modifying its state). The execution environment provides a set of thread pools that perform these behavioral operations of actors.
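The pieces listed above fit into a very small amount of plain Java. The sketch below is not Akka's implementation; `MiniActor` and `MiniActorDemo` are hypothetical names, and clearing the `scheduled` flag after the mailbox is drained is this sketch's own choice for how an execution ends.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical minimal actor: a mailbox, a behavior, and an execution environment.
class MiniActor<M> implements Runnable {
    private final ConcurrentLinkedQueue<M> mailbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final Executor executor;
    private final Consumer<M> behavior;

    MiniActor(Executor executor, Consumer<M> behavior) {
        this.executor = executor;
        this.behavior = behavior;
    }

    void tell(M message) {
        mailbox.add(message); // enqueue at the tail of the mailbox
        trySchedule();        // never blocks the sender
    }

    private void trySchedule() {
        // At most one execution of this actor is in flight at any time.
        if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
            executor.execute(this);
        }
    }

    @Override
    public void run() {
        try {
            M message;
            while ((message = mailbox.poll()) != null) {
                behavior.accept(message); // one message at a time, taken from the head
            }
        } finally {
            scheduled.set(false); // assumption of this sketch: mark the actor idle again
            trySchedule();        // a message may have arrived after the last poll
        }
    }
}

class MiniActorDemo {
    static int run(int senders, int perSender) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        int[] count = {0}; // actor state: only touched inside the behavior, no lock
        CountDownLatch done = new CountDownLatch(senders * perSender);
        MiniActor<Integer> counter = new MiniActor<>(pool, delta -> {
            count[0] += delta;
            done.countDown();
        });
        for (int i = 0; i < senders; i++) {
            new Thread(() -> {
                for (int j = 0; j < perSender; j++) {
                    counter.tell(1);
                }
            }).start();
        }
        try {
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("messages were not processed in time");
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return count[0];
    }
}
```

Many threads call `tell` concurrently, yet the plain `int` state is never corrupted, because the `scheduled` flag ensures the behavior runs on only one pool thread at a time.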

Actor is a very simple model and it solves the problems mentioned earlier:

  • Encapsulation is preserved by decoupling execution from signaling: a method call transfers the thread of execution, but sending a message does not.

  • No locks are needed. An Actor’s internal state can only be modified through messages, and an Actor processes messages serially, which guarantees the correctness of its internal state and variables.

  • Because locks are never used anywhere, the sender is never blocked, and thousands of actors can be properly distributed across dozens of threads, taking full advantage of the potential of modern cpus. The task delegate pattern works well with actors.

  • The state of the Actor is local and not shareable, and changes and data can only be passed by message.

Actors handle errors gracefully

Actors no longer share a call stack, so they have to handle errors differently. There are two kinds of errors to consider:

  • The first is when a delegated task fails on the target Actor because of an error in the task itself (typically a validation error, such as a nonexistent user ID). In this case, the service Actor is working correctly; only the task is wrong. The service Actor should reply to the sender with a message describing the error. There’s nothing special here: errors are part of the domain and are treated as ordinary messages.

  • The second case is when the service itself encounters an internal failure. Akka requires all actors to be organized into a tree-like hierarchy, meaning that an Actor that creates another Actor becomes the parent of that new Actor. This is very similar to how an operating system organizes processes into a tree. Just like processes, when an Actor fails, its parent Actor is notified and reacts to the failure. In addition, if the parent Actor stops, all of its children are recursively stopped. This mechanism is called supervision, and it’s at the heart of Akka:

actor tree supervision

The supervisor can apply different strategies depending on the type of failure of the supervised child Actor, such as restarting it, or stopping it and letting another Actor perform the task instead. An Actor never dies silently (unless it enters an infinite loop or something similar): it either fails, in which case the failure is passed to its supervisor, which applies a fault handling strategy, or it is stopped, in which case interested parties are notified by message. An Actor always has a supervisor: its parent Actor. Restarts are not visible from the outside: collaborating actors can keep sending messages while the target Actor restarts.
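The restart-or-stop decision can be illustrated with a deliberately simplified, single-threaded Java sketch. Nothing here is Akka's API: `SupervisionDemo`, `Directive`, `Child` and `Supervisor` are hypothetical names, and a real actor runtime would deliver the failure to the parent as a message rather than through a `try`/`catch`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class SupervisionDemo {
    enum Directive { RESTART, STOP }

    // Hypothetical child: sums its inputs and fails on a negative one.
    static final class Child {
        int sum = 0;

        void receive(int n) {
            if (n < 0) {
                throw new IllegalArgumentException("negative input: " + n);
            }
            sum += n;
        }
    }

    // Hypothetical parent that owns the child and decides what to do when it fails.
    static final class Supervisor {
        private final Supplier<Child> factory;
        private final Directive onFailure;
        private Child child; // null once the child has been stopped
        final List<String> log = new ArrayList<>();

        Supervisor(Supplier<Child> factory, Directive onFailure) {
            this.factory = factory;
            this.onFailure = onFailure;
            this.child = factory.get();
        }

        void deliver(int n) {
            if (child == null) {
                log.add("dropped " + n); // the child was stopped earlier
                return;
            }
            try {
                child.receive(n);
            } catch (RuntimeException failure) {
                log.add("child failed: " + failure.getMessage());
                // RESTART replaces the child with a fresh instance (clean state).
                child = (onFailure == Directive.RESTART) ? factory.get() : null;
            }
        }

        int childSum() {
            return child == null ? -1 : child.sum; // -1 marks a stopped child
        }
    }

    static int run(Directive directive) {
        Supervisor supervisor = new Supervisor(Child::new, directive);
        supervisor.deliver(5);
        supervisor.deliver(-1); // triggers the failure
        supervisor.deliver(7);  // senders keep sending as if nothing happened
        return supervisor.childSum();
    }
}
```

With `Directive.RESTART` the child comes back with clean state and keeps processing later messages; with `Directive.STOP` later messages are dropped. In both cases the sender's code is identical, which reflects the point that restarts are not visible from the outside.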