Project introduction

Martian-Cloud is the official distributed component of Martian. It is built on a contagion mechanism and does not require a registry.

  1. It discards the registry entirely and does not rely on one; service discovery and governance are handled by the contagion mechanism

  2. Calls between services use the REST style

  3. Intrusion into Martian itself is minimal

So what is the contagion mechanism?

1. The conventional distributed model is [producer -> registry -> consumer]. The producer registers its interfaces with the registry, and the consumer discovers other services from the registry and invokes them.

2. The contagion mechanism discards the registry. Think of interfaces as viruses and of services as people.

How is it done?

Suppose there are three services: A, B, and C.

Before publishing these three services, we first plan how they link together, that is, we configure which service connects to which.

The connection can look like this [Figure 1]

It can also be like this [Figure 2]

It can also be like this [Figure 3]

As long as no service is left isolated, any layout works. You can even link them all into one tangle (though this is not recommended).
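The "no service left alone" rule can be checked before publishing. The following is a minimal sketch, not part of Martian-Cloud: the function name `isolated_services` and the shape of the `links` mapping are assumptions made for illustration. It treats each configured link as usable in both directions, since a service pulls from and broadcasts to the service it is linked to.

```python
from collections import deque

def isolated_services(services, links):
    """Return the services that cannot be reached from the first one.

    `links` maps a service name to the services it is configured to
    connect to (hypothetical format for this sketch).
    """
    neighbours = {name: set() for name in services}
    for source, targets in links.items():
        for target in targets:
            # Interfaces flow both ways over a link, so store it undirected.
            neighbours.setdefault(source, set()).add(target)
            neighbours.setdefault(target, set()).add(source)

    if not services:
        return set()
    seen = {services[0]}
    queue = deque([services[0]])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return set(services) - seen

# A chain like [Figure 1]: A links to B, and B links to C.
print(isolated_services(["A", "B", "C"], {"A": ["B"], "B": ["C"]}))  # set()
# C is never linked, so it would never be infected.
print(isolated_services(["A", "B", "C"], {"A": ["B"]}))  # {'C'}
```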

Once the links are planned, it is time to publish. So what happens between these services at that point?

What happens when you publish?

Let’s take [Figure 1] as an example

1. First, we start service A. Since no other service has been started, A cannot connect to B, so A's local interface cache is empty at this point, as shown in the following figure

2. In case this sequence seems too ideal, let us start C next instead of B

After C is started, B is still not running, so C cannot be discovered. C is isolated for now, and the local interface caches remain as follows:

3. The next step is to start B. As soon as B is started, A discovers it and obtains B's interfaces. The local caches are now as follows:

After obtaining the interface, A will do another thing, that is, broadcast. The process is as follows:

1. The local cache stores interfaces, and many interfaces come from the same service. Therefore, the IP addresses and port numbers of the services behind them are extracted from the local cache

2. Step 1 yields a set of IP addresses and port numbers (in this example, B's IP and port). A then broadcasts all of its own interfaces (its own interfaces, not the ones in its local cache) to every address in that set (in this example, A broadcasts its own interfaces to B)
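The two broadcast steps above can be sketched as follows. This is an illustration under assumptions, not Martian-Cloud's actual code: the `Interface` record, the function names, and the `send` callback (standing in for a REST call) are all hypothetical.

```python
from typing import NamedTuple

class Interface(NamedTuple):
    name: str   # e.g. a REST path
    host: str   # IP address of the service that owns the interface
    port: int

def broadcast_targets(local_cache):
    """Step 1: collapse cached interfaces into unique (host, port) pairs."""
    return {(item.host, item.port) for item in local_cache}

def broadcast(own_interfaces, local_cache, send):
    """Step 2: send only our OWN interfaces to every cached address."""
    for host, port in sorted(broadcast_targets(local_cache)):
        send(host, port, list(own_interfaces))

# A owns one interface and has cached two interfaces, both served by B.
a_own = [Interface("/a/hello", "10.0.0.1", 8080)]
a_cache = [
    Interface("/b/list", "10.0.0.2", 8081),
    Interface("/b/save", "10.0.0.2", 8081),
]
sent = []
broadcast(a_own, a_cache, lambda host, port, items: sent.append((host, port, items)))
print(len(sent))  # 1
```

Two cached interfaces from the same service produce a single broadcast, which is the reason for extracting addresses first.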

After the broadcast, the local interface caches look like this:

This is how A discovers B, so how does C’s interface infect others?

We have been using [Figure 1] as the example, and in [Figure 1] B is connected to C. Therefore, when B is started, it is not only discovered by A (which triggers the steps described above) but also discovers C itself. After discovering C, B obtains C's interfaces, so the local caches are as follows:

After B obtains the interfaces, it broadcasts just as A did, after which the local caches look like this:

Now, what’s interesting is, how do A and C get infected?

In a nutshell, let’s review the service startup process:

1. Obtain interfaces from the configured connected service [if this service has already completed startup, it instead picks a random service from its local interface cache and obtains the interfaces cached on that service]

2. Broadcast to the services in the local cache [skipping services that have already been broadcast to]

This process is not one-time; it runs as a polling loop. So when A's turn comes around again, A obtains C's interfaces from B and broadcasts its own interfaces to C, and the caches now look like this:

In this way, every service discovers every other service.
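The whole walkthrough can be replayed with a small in-memory simulation. This is a sketch under assumptions rather than the real implementation: the `Service` class, its field names, and the in-process "network" (a dict of running services) are hypothetical, and direct method calls stand in for REST requests. Services are started in the order A, C, B, as in the text.

```python
import random

class Service:
    def __init__(self, name, interfaces, seeds=()):
        self.name = name
        self.own = set(interfaces)      # interfaces this service exposes
        self.seeds = list(seeds)        # configured links, used only at startup
        self.cache = {}                 # owner name -> that owner's interfaces
        self.broadcasted = set()        # services already broadcast to
        self.bootstrapped = not self.seeds

    def snapshot(self):
        """Own interfaces plus everything in the local cache."""
        known = {owner: set(items) for owner, items in self.cache.items()}
        known[self.name] = set(self.own)
        return known

    def learn(self, known):
        """Merge interfaces into the local cache, skipping our own."""
        for owner, items in known.items():
            if owner != self.name:
                self.cache.setdefault(owner, set()).update(items)

    def poll(self, running, rng):
        # Step 1: pull from a configured link until startup succeeds,
        # afterwards from a random service found in the local cache.
        source = self.cache if self.bootstrapped else self.seeds
        candidates = [name for name in source if name in running]
        if candidates:
            self.learn(running[rng.choice(candidates)].snapshot())
            self.bootstrapped = True
        # Step 2: broadcast our own interfaces, skipping earlier recipients.
        for owner in list(self.cache):
            if owner in running and owner not in self.broadcasted:
                running[owner].learn({self.name: self.own})
                self.broadcasted.add(owner)

rng = random.Random(0)
a = Service("A", {"/a/hello"}, seeds=["B"])
b = Service("B", {"/b/list"}, seeds=["C"])
c = Service("C", {"/c/ping"})
running = {}

def start(service, rounds=1):
    """Bring a service up, then run polling rounds over all running services."""
    running[service.name] = service
    for _ in range(rounds):
        for node in list(running.values()):
            node.poll(running, rng)

start(a)            # B is not up yet, so A's cache stays empty
start(c)            # C has no configured link and stays isolated
start(b, rounds=3)  # B arrives; a few polling rounds spread everything

for node in (a, b, c):
    print(node.name, sorted(node.cache))
# A ['B', 'C']
# B ['A', 'C']
# C ['A', 'B']
```

In this run A learns about C only in the second round after B starts, by pulling B's cache, which matches the sequence described above.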

What if the service is down?

1. The selfish mechanism

The so-called selfish mechanism means that each service only looks after itself. If a service finds that an interface in its local cache can no longer be connected, it drops that interface from its own cache. What the other services do is not its concern.

2. Voting mechanism

This is an internal vote: each service keeps its own tally and shares it with no one. If a service finds that an interface in its local cache cannot be connected, it casts one vote to take the service behind that interface offline locally. Once a call goes through again, the count is reset to 0. When the votes accumulate to a certain threshold, all interfaces of that service are cleared from the current service. In this way each service maintains its own local interface cache.
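A minimal sketch of such a local vote counter is shown below. The class name `VotingCache`, its methods, and the threshold of 3 are assumptions for illustration; the text only says the votes must accumulate "to a certain extent".

```python
class VotingCache:
    def __init__(self, threshold=3):
        self.threshold = threshold  # assumed value, not taken from the project
        self.interfaces = {}        # service -> set of cached interface names
        self.votes = {}             # service -> consecutive failure votes

    def call_failed(self, service):
        """Cast one vote; evict the service once the threshold is reached."""
        self.votes[service] = self.votes.get(service, 0) + 1
        if self.votes[service] >= self.threshold:
            self.interfaces.pop(service, None)
            self.votes.pop(service, None)
            return True
        return False

    def call_succeeded(self, service):
        """A call that goes through resets the tally to 0."""
        self.votes[service] = 0

cache = VotingCache()
cache.interfaces["B"] = {"/b/list", "/b/save"}

cache.call_failed("B")
cache.call_failed("B")
cache.call_succeeded("B")              # tally back to 0
cache.call_failed("B")
still_cached = "B" in cache.interfaces # only 1 vote so far
cache.call_failed("B")
evicted = cache.call_failed("B")       # third consecutive vote
print(still_cached, evicted, "B" in cache.interfaces)  # True True False
```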

3. What if it is a misjudgment?

There is a compensation mechanism: whenever a service takes another service offline, it sends that service a notification asking to be removed from its already-broadcast list. (For example, when A's votes against B's interfaces reach the threshold, A clears all of B's interfaces and then sends B a notification so that B removes A from its broadcast list. If B is in fact not down, B will re-broadcast its interfaces to A in the next poll.)

If service B is clearly not down, but service A still cannot connect to it again and again, and even the offline notification cannot reach service B, then all I can say is that service B deserves it. A misjudgment is better than keeping a faulty entry that hurts performance.
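The compensation step can be sketched like this. It is an illustration only: the `Node` class and its method names are hypothetical, and a direct method call stands in for the offline notification, which in a real deployment is a network request that may itself fail.

```python
class Node:
    def __init__(self, name, interfaces):
        self.name = name
        self.own = set(interfaces)
        self.cache = {}           # owner name -> cached interfaces
        self.broadcasted = set()  # services we have already broadcast to

    def broadcast_to(self, peer):
        """Send our own interfaces once; later polls skip known recipients."""
        if peer.name not in self.broadcasted:
            peer.cache[self.name] = set(self.own)
            self.broadcasted.add(peer.name)

    def evict(self, peer):
        """Clear all of the peer's interfaces, then tell it to forget us."""
        self.cache.pop(peer.name, None)
        peer.on_offline_notice(self.name)

    def on_offline_notice(self, sender):
        """Remove the sender from the broadcast list so it gets re-infected."""
        self.broadcasted.discard(sender)

a = Node("A", {"/a/hello"})
b = Node("B", {"/b/list"})

b.broadcast_to(a)          # A now caches B's interfaces
a.evict(b)                 # A misjudges B as offline and notifies it
after_evict = "B" in a.cache
b.broadcast_to(a)          # B's next poll broadcasts to A again
print(after_evict, "B" in a.cache)  # False True
```

Without the notification, B would keep skipping A as "already broadcast", and A would never get B's interfaces back.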

4. A call can fail for many reasons, and the service is not necessarily down. So which situations cast a vote to take the service offline?

Quite simply, a vote is cast when any of the following three exceptions occurs while calling an interface

ConnectException: the connection failed. This is not a 404; the IP:port could not be connected to at all

UnknownHostException: the address cannot be resolved, that is, the provided IP:port cannot be resolved

SocketTimeoutException: the connection timed out. This means a connect timeout, not a read timeout
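The three exceptions named above are Java exceptions. To keep the examples in one language, the sketch below uses rough Python stand-ins for them; this mapping and the `counts_as_vote` helper are assumptions made for illustration, not something taken from the project.

```python
import socket

# Approximate Python counterparts of the three Java exceptions.
CONNECT_FAILURES = (
    ConnectionRefusedError,  # roughly ConnectException
    socket.gaierror,         # roughly UnknownHostException
    socket.timeout,          # roughly SocketTimeoutException
)

def counts_as_vote(error, phase):
    """Vote only for failures raised while connecting, never while reading.

    `phase` is "connect" or "read" (hypothetical labels); a read timeout
    means the service was reachable, so it must not cast a vote.
    """
    return phase == "connect" and isinstance(error, CONNECT_FAILURES)

print(counts_as_vote(ConnectionRefusedError(), "connect"))  # True
print(counts_as_vote(socket.timeout(), "read"))             # False
print(counts_as_vote(ValueError("HTTP 404"), "connect"))    # False
```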

5. Then there is the garbage collection mechanism

Garbage collection is simple: it periodically scans the local cache for interfaces that belong to offline services and deletes them.
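One pass of such a scan might look like the sketch below. The cache layout (interface name mapped to the owning service's address) and the function name `collect_garbage` are assumptions; in practice the function would be called on a timer.

```python
def collect_garbage(cache, offline_addresses):
    """Delete cached interfaces whose owning service is marked offline.

    `cache` maps an interface name to the (host, port) of its service.
    Returns the names that were removed.
    """
    stale = [name for name, address in cache.items()
             if address in offline_addresses]
    for name in stale:
        del cache[name]
    return stale

cache = {
    "/b/list": ("10.0.0.2", 8081),
    "/b/save": ("10.0.0.2", 8081),
    "/c/ping": ("10.0.0.3", 8082),
}
removed = collect_garbage(cache, {("10.0.0.2", 8081)})
print(sorted(removed))  # ['/b/list', '/b/save']
print(sorted(cache))    # ['/c/ping']
```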

Together, these mechanisms ensure that when a service goes down, its interfaces are automatically removed from the other services.

How does contagion continue when the linked service is down?

If B dies and the chain is broken, is contagion affected?

No, it is not, because the configured chain is only used at startup and plays no role afterward. Take A as an example: only at startup does A go to B to obtain interfaces. In every later poll, it randomly selects a service from its local interface cache and obtains interfaces from that service, so the chain is not broken.

Broadcasting works the same way: A broadcasts to the services in its local cache, not to the configured service.

So outages do not affect interface contagion.

What about adding a new service?

It is as simple as connecting it to any one of the running services. Soon it will be covered with viruses (interfaces).