Starting with this chapter, we’ll talk about microservices. We’ll begin with a business scenario and expand the explanation step by step, so that you can quickly grasp how some microservice components are implemented and, in the end, understand the essence of a microservice architecture.

I. Business Scenario (8)

At present the company has more than 50 services that call one another, and they are written in several languages, such as Java, Go, and Node.js.

Because we are cross-language, while the popular Spring Cloud and Dubbo both target the Java language, we did not adopt either of these microservice frameworks.

So how do we configure the invocation relationships between services? Let’s walk back through the configuration process.

Since all 50+ services sit behind load balancing, we first need to configure the service addresses and load balancing in Nginx, like this:

upstream user-servers {
  server 192.168.51.50:80;
  server 192.168.51.51:80;
}

upstream order-servers {
  server 192.168.51.53:80;
  server 192.168.51.52:80;
}

...

server {
  listen 80;
  server_name user-servers;

  location / {
    proxy_pass http://user-servers;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }
}

server {
  listen 80;
  server_name order-servers;

  location / {
    proxy_pass http://order-servers;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }
}

The calling relationship between services is mainly configured through the local configuration file, as shown in the following code:

user.api.host=https://user-servers/
order.api.host=https://order-servers/

Configuration process description: the call-relationship architecture is shown in the figure below. We first obtain the host address of the target service from the local configuration file, append the URI in code to assemble the full URL, and then route every inter-service call through the Nginx proxy.
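
In code, a caller looks roughly like the following. This is only a sketch: the property names match the file above, but the application.properties path, the use of java.net.http.HttpClient, and the /users/1001 URI are assumptions for illustration.

import java.io.FileInputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Properties;

public class UserApiCaller {

    public static void main(String[] args) throws IOException, InterruptedException {
        // Read the host of the target service from the local configuration file.
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream("application.properties")) {
            props.load(in);
        }
        String userHost = props.getProperty("user.api.host"); // e.g. https://user-servers/

        // Assemble the full URL in code: host from config + URI chosen by the caller.
        String url = userHost + "users/1001"; // the /users/{id} URI is hypothetical

        // The call itself goes to Nginx, which proxies it to one of the service's nodes.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}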

So, what are the problems we run into in this architecture?

II. Problems with the Old Architecture

1. Complicated configuration, error-prone releases

This problem often shows up during deployments: Nginx must be reconfigured by hand every time a service is added or machines are added or removed, and every environment is different, which makes mistakes easy.

Therefore, whenever a server migrates or the network changes, we have to go through these configurations again and run multiple rounds of tests to make sure everything is fine. If we don’t check carefully, a load-balancing error on some node may go unnoticed.

2. Adding machines is burdensome

Once the company’s traffic grew, monitoring showed that some services needed more machines, and that is exactly when the system’s resilience is tested the most. Because the process is manual, a careless slip can break it, such as fat-fingering an extra character or entering the wrong IP.

If something does go wrong, we need to restart Nginx. Imagine you are the operator: would you dare to restart it at that moment? If the restart fails, it’s all over. So we must get the configuration right within a very short window, because adding machines is usually urgent and leaves little time for double-checking.

3. Nginx single point

Because every service call goes through one Nginx proxy, Nginx easily becomes a bottleneck, and if its configuration goes wrong, all services become unavailable at once, which is a big risk. We could give each service its own Nginx instead of having all backend services share one. That works, but it is also painful: with more configuration to manage, the probability of an operations mistake grows as well.

4. Management difficulties

In practice, compliance requirements often force us to upgrade a library used across the whole call chain. To make sure no service is missed, we must have a list of all backend services.

Since that backend service list was maintained by hand, we had to tidy it up regularly, which was a real chore. We tried a number of solutions to this problem, and I’d like to share three of the more feasible ones.

**(1)** Push the full list of backend services, together with each service’s server nodes, to every backend service; each backend service then decides for itself which node of which service to call. This is what Spring Cloud and Dubbo do.

**(2)** Deploy all the services in containers, and use Kubernetes’ Service and Pod features for service registration and discovery.

Specifically: attach a label such as “user-app” to the Pods running the user service, and let Kubernetes start several such Pods; then define a Service, say “User Service”, whose selector matches the “user-app” label. Requests from a client first reach the User Service and are then load balanced automatically to one of the user Pods. (This is only a brief introduction for convenience; if you are interested in Kubernetes you can read more on your own.)

**(3)** Have every service automatically register its name and IP with a coordination service (such as ZooKeeper), then build a tool that pulls each backend service’s machine list from ZooKeeper, regenerates the Nginx configuration from that list, and finally restarts Nginx.

Finally, we adopted the first solution.

We did not choose the second solution because we were not familiar with containers at the time; a few years ago container production environments were not yet mature, so migrating every service into containers would have been too costly and too risky for us.

We did not choose the third solution because it solves neither the Nginx single-point bottleneck nor the need to restart Nginx after machines are added.

Therefore, our final solution is shown in the figure below:

Through this architecture diagram, we can see that the whole solution process is divided into several steps:

  1. Each backend service automatically registers its service type and IP with the central storage;
  2. The central storage pushes the full service list to every backend service;
  3. Each backend service performs load balancing locally and accesses the nodes of a given service in turn (see the sketch after this list).
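
For step 3, here is a minimal sketch of client-side round-robin over a locally cached node list. The class and method names are my own, for illustration only; Spring Cloud’s Ribbon provides the same idea out of the box.

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal client-side round-robin over the locally cached node list of one service.
public class RoundRobinBalancer {

    // Node list pushed from the central storage; replaced wholesale on every update.
    private volatile List<String> nodes = new CopyOnWriteArrayList<>();
    private final AtomicInteger counter = new AtomicInteger();

    public void updateNodes(List<String> latest) {
        this.nodes = new CopyOnWriteArrayList<>(latest);
    }

    // Pick the next node in turn, e.g. "192.168.51.50:80".
    public String next() {
        List<String> snapshot = nodes;
        if (snapshot.isEmpty()) {
            throw new IllegalStateException("no available nodes");
        }
        int index = Math.floorMod(counter.getAndIncrement(), snapshot.size());
        return snapshot.get(index);
    }
}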

With that settled, let’s look at a few points worth noting. Here are four considerations that I hope will help you.

III. Points to Note

1. What technology should the central storage use?

From the discussion so far, it might seem that a single Redis could solve this problem, but we also need to consider the following two requirements:

  • Service changes must be pushed to all backend services in real time. For example, when we add a server node, that node automatically connects to the central storage when it starts; once the backend list is updated, how do the other backend services learn about the update in real time?
  • The status of all backend services must be monitored at all times, so that if a service goes down, the other services are notified promptly.

A distributed coordination service happens to meet both of these requirements, so we finally chose a distributed coordination service to store the server list.

2. Which distributed coordination service to use?

As for which distributed coordination service to use, there are very detailed technology comparison tables online that you can refer to.

So is the choice obvious now? In practice, technology selection must consider not only the technology itself but also the organization’s background. For example, our company was already using ZooKeeper at the time, and the operations team would not normally maintain two coordination middleware at once, so we did not consider any coordination service other than ZooKeeper.

3. What functions do we need to implement on top of ZooKeeper?

The main things we need to implement are:

  • When a service starts, register its information with ZooKeeper.
  • Pull the information of all backend services from ZooKeeper.
  • Watch ZooKeeper events and update the local list whenever backend service information changes.
  • Apply a load-balancing policy when calling other services; simple round-robin (as Ribbon does) is enough.

Taken together, none of these points is complicated to implement; a rough sketch of what they might look like in code follows.
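
Here is a minimal sketch of those points, assuming Apache Curator (curator-framework plus curator-recipes) as the ZooKeeper client; the znode layout /services/{serviceName}/{host:port} and the class names are choices made for illustration, not our production code. It reuses the RoundRobinBalancer sketch from earlier instead of Ribbon.

import java.nio.charset.StandardCharsets;
import java.util.List;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.PathChildrenCache;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.CreateMode;

public class ZkServiceRegistry {

    private static final String ROOT = "/services";

    private final CuratorFramework client;

    public ZkServiceRegistry(String zkAddress) {
        // Connect to the ZooKeeper ensemble with a simple retry policy.
        this.client = CuratorFrameworkFactory.newClient(zkAddress, new ExponentialBackoffRetry(1000, 3));
        this.client.start();
    }

    // Point 1: on startup, register this instance as an EPHEMERAL znode so that it
    // disappears automatically when the process dies or its session expires.
    public void register(String serviceName, String hostAndPort) throws Exception {
        client.create()
              .creatingParentsIfNeeded()
              .withMode(CreateMode.EPHEMERAL)
              .forPath(ROOT + "/" + serviceName + "/" + hostAndPort,
                       hostAndPort.getBytes(StandardCharsets.UTF_8));
    }

    // Point 2: pull the current node list of one backend service.
    public List<String> listNodes(String serviceName) throws Exception {
        return client.getChildren().forPath(ROOT + "/" + serviceName);
    }

    // Point 3: watch the service's children and refresh the local list on any change,
    // so the balancer's view stays up to date without restarting anything.
    public void watch(String serviceName, RoundRobinBalancer balancer) throws Exception {
        String path = ROOT + "/" + serviceName;
        PathChildrenCache cache = new PathChildrenCache(client, path, true);
        cache.getListenable().addListener(
                (c, event) -> balancer.updateNodes(c.getChildren().forPath(path)));
        cache.start();
    }
}

In this sketch, a service calls register() with its own type and address at startup and watch() for every service it depends on; point 4, the load-balancing policy itself, is the RoundRobinBalancer shown earlier.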

4. What if ZooKeeper goes down?

Since each backend service is deployed on multiple nodes, when one node goes down we only need the other nodes of the same service to keep working, so the real focus is keeping the ZooKeeper cluster itself highly available. (ZooKeeper has its own clustering capabilities, which we won’t go into here.)

ZooKeeper’s design trades availability for consistency. Its nodes take the roles of Leader, Follower, and Observer. If the Leader fails, or the cluster loses more than half of its voting nodes, ZooKeeper enters a recovery (leader-election) phase. During this period it does not accept any client requests, which causes the following problems.

  • If a backend service already has the full backend service list cached locally, we are in relatively good shape; we just need the backend servers not to change during the recovery window.
  • If servers do change during that window, some calls may fail.
  • If a backend service starts up while ZooKeeper is recovering, it cannot connect to ZooKeeper and cannot obtain the backend service list at all, which is the worst case.

That sounds like a serious pitfall. So what did we do about it?

Our approach at the time was to have a dedicated service synchronize the full service list into the configuration center; when a newly started backend service could not get the list from ZooKeeper, it would fetch it from the configuration center instead. This does not solve 100% of the problem, but it is a cost-effective mitigation, and so far none of the problems described above have actually occurred in our system. (Some luck is involved, too.)
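
The shape of that fallback is roughly the following sketch; ZkServiceRegistry is the earlier sketch, and ConfigCenterClient is a hypothetical stand-in for whatever configuration-center SDK you actually use.

import java.util.List;

// Hypothetical interface for the configuration center; in practice this would be
// whatever client your configuration center provides.
interface ConfigCenterClient {
    List<String> getServiceNodes(String serviceName);
}

public class ServiceListLoader {

    private final ZkServiceRegistry registry;      // from the earlier sketch
    private final ConfigCenterClient configCenter; // hypothetical fallback source

    public ServiceListLoader(ZkServiceRegistry registry, ConfigCenterClient configCenter) {
        this.registry = registry;
        this.configCenter = configCenter;
    }

    public List<String> loadNodes(String serviceName) {
        try {
            // Normal path: read the live node list from ZooKeeper.
            return registry.listNodes(serviceName);
        } catch (Exception e) {
            // ZooKeeper unreachable or recovering: fall back to the snapshot that a
            // dedicated service keeps synced into the configuration center.
            return configCenter.getServiceNodes(serviceName);
        }
    }
}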

IV. Summary

In fact, this architecture amounts to building your own wheel, because the registration mechanism is exactly what Spring Cloud and Dubbo already provide. The point of this article, however, is to help you understand the implementation of service registration and discovery in microservices from a different perspective.

If you have better ideas or solutions, feel free to leave a comment and discuss.

Follow my personal WeChat official account: server technology selection.