Preface

The overall architecture of version 2.0 of our trading system is shown in the figure above. It is divided into four services: the market service, the client service, the matching service and the management service. The market service mainly provides a WebSocket API that pushes market data to external consumers. The matching service is an in-memory matching engine. Its input is an ordered queue of orders, and its output contains trade records and various other events, including successful cancellations, failed cancellations, orders entering the order book, and so on. If the matching service is restarted, all unfilled orders are queried from the MySQL database and the order book is rebuilt. The core function of the client service is to receive and process the HTTP requests coming from clients, while the management service gives system administrators a unified view of, and control over, the users, orders, assets and configuration of the entire system.

The system is split into four services, but I do not consider it a microservice architecture. It is a distributed architecture, and “distributed” and “microservices” are two different things: microservices are distributed, but a distributed system does not necessarily use microservices. In real projects, the move from a monolithic application to microservices does not happen overnight; it is a gradual evolution. Version 2.0 is just the first stage of that evolution.

Nowadays many small teams and projects start out with microservices, and many of them adopt microservices for their own sake, which is not appropriate. In essence, the purpose of architecture is to “reduce costs and increase efficiency”, a mantra I learned from Sister Xuan (real name Sun Xuan). Adopting microservices at the very beginning of a project generally fails to achieve this, because a microservice architecture costs much more to implement and maintain than a monolithic application, unless the application is large-scale from the start.

Microservices become a better fit once both the business and the development team have grown to a considerable size. At that point they are used to solve two problems: fast iteration and high concurrency. While the business and the team are still relatively small, building the whole system as one or a few monolithic applications generally allows faster iteration. At a certain point, however, one or more pain points start to appear, and from then on iteration slows down.

As for high concurrency, as long as a monolithic application is stateless, running multiple instances of it can also absorb a certain amount of concurrent load. However, once a monolith grows large and carries many business functions, scaling the whole application horizontally wastes resources, because not every function needs to scale. For example, order placement is prone to high concurrency and needs to scale out, while registration does not. Binding all functions to the same unit of scaling consumes a huge amount of resources. In addition, when one function is hit by more concurrency than the server can bear, every function of the monolith is affected. Splitting into microservices solves these high-concurrency problems.

So, next, let’s talk about how the architecture of our trading system evolved toward microservices.

Iterating business requirements

After version 2.0, it is time to focus on iterating on business requirements, and a large number of them need to be refined and added. The major business segments include:

  • OTC trading: also known as fiat trading, over-the-counter trading, C2C trading, and so on. The three major exchanges (Huobi/Binance/OK) now call it “buy coins”. Acting as a third party, the trading platform provides a safe and reliable trading environment in which users can buy and sell digital assets with fiat currency. A user can be an individual or an enterprise.
  • Leveraged trading: lets users borrow coins to trade. By borrowing coins from the platform, users can multiply their returns with a small amount of capital. Platforms typically offer up to 10 times leverage.
  • Contract trading: includes delivery contracts, perpetual contracts and option contracts. The maximum leverage of a perpetual contract can reach 125 times.
  • Open API: every trading business needs an open API, including market data interfaces, trading interfaces, account interfaces, and so on. All of them need HTTP interfaces, and pushed updates of market data and some account information also need WebSocket interfaces.

In contrast to these, the trading business that already exists is generally referred to as “coin-to-coin trading”. Both coin-to-coin trading and leveraged trading are spot trading, while contract trading falls into the category of financial derivatives. These segments are now standard on virtually every trading platform. If the business keeps expanding, there are also interest-bearing savings, lending, mining, DEX (decentralized exchange), and all kinds of DeFi (decentralized finance). The business lines of each of the three major exchanges, for example, are already very extensive. But let’s not worry about those businesses for now.

Besides these major segments, some small and medium-sized features also need to be added, including: mobile phone number registration, human verification, referral commissions, online customer service, system announcements, promotional campaigns, internationalization, asset consolidation, hot and cold wallet separation, and so on. Android and iOS clients will also be added.

Such a large set of requirements cannot be delivered overnight, of course. It has to be completed gradually, iteration after iteration, according to priority. The small and medium-sized features come first, followed in order by: the coin-to-coin trading open API, over-the-counter trading, leveraged trading, the leveraged trading open API, delivery contracts, perpetual contracts, option contracts and the contract open API.

Because the iteration cycle of each version is relatively short and the goal is to implement features and launch them quickly, we add the functionality of each business segment directly to the existing services. The open HTTP API is not separated from the internal API either; the two share the same endpoints and are distinguished only by their parameters. Internal API requests carry a Token and are authenticated with JWT; open API requests carry a Sign parameter and use the API Key signature verification mechanism.
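To make the two authentication paths concrete, here is a minimal Python sketch of how a shared endpoint might tell the two kinds of request apart and verify an open API signature. The HMAC-SHA256 scheme, the parameter names `api_key` and `sign`, the `Authorization` header and the in-memory `API_SECRETS` store are all assumptions made for illustration, not the actual implementation; JWT verification on the internal path is left out.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Hypothetical API key store (api_key -> secret); a real system would use a database.
API_SECRETS = {"demo-key": "demo-secret"}


def sign_params(params: dict, secret: str) -> str:
    """Sign the sorted parameters (excluding 'sign') with HMAC-SHA256 (assumed scheme)."""
    payload = urlencode(sorted((k, v) for k, v in params.items() if k != "sign"))
    return hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()


def verify_open_api(params: dict) -> bool:
    """Open API path: look up the secret for the API key and check the signature."""
    secret = API_SECRETS.get(params.get("api_key", ""))
    if secret is None or "sign" not in params:
        return False
    expected = sign_params(params, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), str(params["sign"]).encode())


def classify_request(params: dict, headers: dict) -> str:
    """Decide which authentication path a request on the shared endpoint takes."""
    if "sign" in params:
        return "open"       # verified with verify_open_api
    if "Authorization" in headers:
        return "internal"   # would be verified as a JWT (not shown)
    return "rejected"
```

Because both paths live in the same service here, a flood of open API requests competes with internal requests for the same resources, which is exactly the coupling discussed later.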

After these business segments go live, the architecture of our entire trading system looks roughly as follows:

Business splitting

After working overtime to get these business segments launched, a retrospective reveals several serious problems:

  • The client backend service has become bloated and much of its business logic has become very complex. Code conflicts on submission are more and more frequent, seriously slowing down iteration.
  • The open API and the internal API are tightly coupled, so heavy traffic on the open API degrades access to the internal API. Client users complain from time to time that the app is slow and unresponsive, or even times out.
  • When one business segment receives a large number of concurrent trading requests that the server cannot handle, all services become unavailable.

The timing of a service split is driven by pain points. The problems above are exactly such pain points, and the way to resolve them comes down to one word: “split”. The next question is: how do we split?

The essence of splitting into microservices is to decompose business complexity: the whole system is divided into multiple independent microservices, each of which can be owned by a different small team and taken by that team all the way from product design, development and testing to deployment and launch. Multiple small teams can then develop multiple lines of business in parallel, which enables rapid iteration of the whole system.

Therefore, the first dimension to consider when splitting services is the independent business domain. For our trading system, the business domains that can obviously be split are spot trading, over-the-counter trading and contract trading. Spot trading includes coin-to-coin trading and leveraged trading, which need not be split because they share the same matching mechanism: their orders are matched in the same order pool, and they share the same market data. Contract trading does have subdomains, and although each subdomain operates almost independently, many of the business rules are much the same, so there is no need to split further at this point.

The second dimension is the business process: analyze it, and wherever there is an asynchronous step, that step can be split out. For the trading system, look at the core process of a matched trade. The most common simplified flow is as follows:

Order --> Order queue --> Match --> Output queue --> Clear

The order queue sits before matching and the output queue sits after it, so order placement and matching are asynchronous, and so are matching and clearing. Order placement, matching and clearing can therefore be split into separate, independent services: order placement (including order cancellation, and so on) becomes the trading service, matching becomes the matching service, and the clearing logic is moved into a clearing service.
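The asynchronous flow described above can be sketched with two in-process queues. This is only a toy model under loud assumptions: orders carry no quantity and are matched whole, the order book is a pair of plain lists, and every name (`matching_stage`, `clearing_stage`, `run_pipeline`) is hypothetical. In a real deployment the queues would presumably be external message queues and each stage a separate service.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def crosses(taker, maker):
    """A buy crosses a resting sell at or below its price, and vice versa."""
    if taker["side"] == "buy":
        return taker["price"] >= maker["price"]
    return taker["price"] <= maker["price"]


def matching_stage(order_q, output_q):
    """Consume orders strictly in queue order and emit events to the output queue."""
    resting = {"buy": [], "sell": []}  # toy order book
    while True:
        order = order_q.get()
        if order is STOP:
            output_q.put(STOP)
            return
        opposite = "sell" if order["side"] == "buy" else "buy"
        maker = next((o for o in resting[opposite] if crosses(order, o)), None)
        if maker is None:
            resting[order["side"]].append(order)
            output_q.put({"type": "booked", "order_id": order["id"]})
        else:
            resting[opposite].remove(maker)
            output_q.put({"type": "trade", "taker": order["id"],
                          "maker": maker["id"], "price": maker["price"]})


def clearing_stage(output_q, ledger):
    """Consume matching events; only trades need clearing in this sketch."""
    while True:
        event = output_q.get()
        if event is STOP:
            return
        if event["type"] == "trade":
            ledger.append((event["taker"], event["maker"], event["price"]))


def run_pipeline(orders):
    order_q, output_q, ledger = queue.Queue(), queue.Queue(), []
    stages = [
        threading.Thread(target=matching_stage, args=(order_q, output_q)),
        threading.Thread(target=clearing_stage, args=(output_q, ledger)),
    ]
    for stage in stages:
        stage.start()
    for order in orders:      # the trading service only enqueues and returns
        order_q.put(order)
    order_q.put(STOP)
    for stage in stages:
        stage.join()
    return ledger
```

Each stage only knows about the queue in front of it and the queue behind it, which is what makes the three stages separable into independent services.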

The market data module is also relatively independent, which is why we had already split it out as a standalone market service.

In addition, leveraged trading and all kinds of contract trading have a margin system, which requires monitoring users’ assets and calculating their risk rate in real time. When a risk threshold is reached, the corresponding strategy is executed automatically, such as forced liquidation, automatic liquidation, and so on. These functions are also best implemented as a separate service, which we can call the risk control service. This service also requires fully in-memory, high-speed computation, and we’ll talk about how to design it later.

Apart from over-the-counter trading, which is not matched trading, both spot and contract trading can be split further along the dimension above:

Service                 Spot                          Contract
Trading service         Spot trading service          Contract trading service
Matching service        Spot matching service         Contract matching service
Clearing service        Spot clearing service         Contract clearing service
Market service          Spot market service           Contract market service
Risk control service    Spot risk control service     Contract risk control service

There are also some common functions, such as user registration, login, announcement content, banners, online customer service, and so on, which can be grouped into a single public service and managed in one place.

As for the management backend service, it is just CRUD and has no pain points, so it need not be split for now.

In the end, at the business layer, we split the system into these business services: management backend service, public service, over-the-counter service, spot trading service, spot matching service, spot clearing service, spot market service, spot risk control service, contract trading service, contract matching service, contract clearing service, contract market service and contract risk control service.

Database splitting

Once the business services are split, a single monolithic database can easily become a performance bottleneck and carries the risk of a single point of failure. Moreover, with only one database that all services depend on, any change to it can affect everything. So we are going to split the database as well.

Under a microservice architecture, the independence of a microservice component means more than independent code, development and deployment at the business layer; it also means that the component owns its data layer autonomously and is decoupled from everyone else’s. The ideal design is therefore for each microservice business component to have its own separate database. Other services cannot access that database directly; they can only reach another service’s data through service calls.

So how the database is split basically follows the business components. In our trading system, the matching service and the risk control service compute entirely in memory and have no database of their own, while every other service has its own independent database or cache, as shown in the diagram below:

But once the database is split and becomes distributed, some new problems inevitably appear. There are three main ones:

  • Distributed transactions
  • Statistical analysis of data
  • Cross-database queries

In a single database, the ACID properties of transactions are easy to achieve. In a distributed environment, ACID is harder to satisfy and some properties have to be traded off against others. Distributed systems are subject to the CAP theorem: consistency, availability and partition tolerance cannot all be satisfied at the same time, and at most two of them can be. P is mandatory, so the choice is usually between C (consistency) and A (availability). Choosing C means requiring strong consistency. Transactions that guarantee strong consistency are also called rigid transactions; the main schemes for them are 2PC and 3PC, which ensure strong consistency but perform poorly. Distributed transactions in most scenarios do not need such strict consistency, and it is enough to reach eventual consistency within a certain period of time. Transactions that guarantee eventual consistency are called flexible transactions, and their design is based on BASE theory. Flexible transaction solutions mainly include TCC compensation, asynchronous assurance and best-effort notification. Which solution to use depends on an analysis of the specific business scenario. Distributed transactions will be covered in more detail separately.
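As a rough illustration of the TCC (Try-Confirm-Cancel) idea mentioned above, the following Python sketch coordinates two in-memory participants. Everything here is an assumption made for the example: the `AccountParticipant` and `OrderParticipant` classes, the `run_tcc` coordinator and the rule that an order with a non-positive quantity is rejected. A production implementation would also need persistence, retries and idempotency handling, none of which is shown.

```python
class AccountParticipant:
    """Toy account service; in practice it would own its own database."""

    def __init__(self, balances):
        self.available = dict(balances)
        self.frozen = {}  # tx_id -> (user, amount)

    def try_debit(self, tx_id, user, amount):
        # Try: reserve the funds without spending them yet.
        if self.available.get(user, 0) < amount:
            return False
        self.available[user] -= amount
        self.frozen[tx_id] = (user, amount)
        return True

    def confirm(self, tx_id):
        # Confirm: the reserved funds are now spent.
        self.frozen.pop(tx_id, None)

    def cancel(self, tx_id):
        # Cancel: give the reserved funds back.
        user, amount = self.frozen.pop(tx_id, (None, 0))
        if user is not None:
            self.available[user] += amount


class OrderParticipant:
    """Toy order service holding pending and confirmed orders."""

    def __init__(self):
        self.pending = {}
        self.confirmed = {}

    def try_create(self, tx_id, order):
        if order["qty"] <= 0:  # hypothetical validation rule
            return False
        self.pending[tx_id] = order
        return True

    def confirm(self, tx_id):
        order = self.pending.pop(tx_id, None)
        if order is not None:
            self.confirmed[tx_id] = order

    def cancel(self, tx_id):
        self.pending.pop(tx_id, None)


def run_tcc(tx_id, steps):
    """steps: list of (participant, try_callable).

    Confirm everything if every Try succeeds; otherwise cancel the
    participants whose Try had already succeeded.
    """
    tried = []
    for participant, try_fn in steps:
        if try_fn():
            tried.append(participant)
        else:
            for done in tried:
                done.cancel(tx_id)
            return False
    for done in tried:
        done.confirm(tx_id)
    return True
```

Between Try and Confirm the data is temporarily inconsistent across the two participants, which is the eventual-consistency trade-off that flexible transactions accept.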

Statistical analysis of data is mostly a requirement of the management backend, which needs to provide operations staff with statistical reports, data analysis and similar functions. This actually falls into the category of OLAP. The recommended approach is therefore to consolidate the data from each database into a NewSQL database for processing.

Cross-database queries are in fact the most common scenario: business component A needs to query the data of business component B, or of even more components. There are several common solutions. The first is to add redundant fields, but there should not be too many of them, and keeping the redundant fields synchronized has to be solved. The second is to add an aggregation service, a new service that wraps the data of different services, aggregates it and exposes a unified query API. Another option is to query each service individually and then assemble the data, either on the client side or on the server side.
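The aggregation-service option can be sketched as follows. This is a minimal Python illustration, assuming two hypothetical services (`UserService` and `OrderQueryService`) with in-memory data standing in for their separate databases; in a real system the aggregator would call them over the network.

```python
class UserService:
    """Owns user data (stand-in for the user database)."""

    def __init__(self):
        self._users = {7: {"id": 7, "nickname": "alice"}}

    def get_user(self, user_id):
        return self._users[user_id]


class OrderQueryService:
    """Owns order data (stand-in for the order database)."""

    def __init__(self):
        self._orders = [
            {"id": 1, "user_id": 7, "symbol": "BTC/USDT"},
            {"id": 2, "user_id": 8, "symbol": "ETH/USDT"},
        ]

    def list_orders(self, user_id):
        return [o for o in self._orders if o["user_id"] == user_id]


class UserOrderAggregator:
    """Aggregation service: one query API on top of several owning services."""

    def __init__(self, users, orders):
        self.users = users
        self.orders = orders

    def user_orders(self, user_id):
        user = self.users.get_user(user_id)
        orders = self.orders.list_orders(user_id)
        return {
            "user_id": user["id"],
            "nickname": user["nickname"],
            "orders": [{"id": o["id"], "symbol": o["symbol"]} for o in orders],
        }
```

Callers get the joined view from a single endpoint, and neither owning service ever reads the other's database.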

Horizontal layering

The last step of the move to microservices is to adopt a horizontally layered architecture. The simplest option is a three-tier architecture that divides all microservices into a gateway layer, a business logic layer and a data access layer.

Adding a gateway layer is easy to justify; it is standard in microservice systems. The gateway layer is the single backend entry point of the entire system. Internally it serves the official clients and the management terminal, and externally it serves third-party applications through the open APIs. It is responsible for request authentication, rate limiting, and routing and forwarding, and it contains no specific business logic.

That a gateway layer should be added is beyond doubt. What does need consideration is how many gateways are appropriate. A single unified gateway cannot solve the tight coupling between the open API and the internal API mentioned earlier, so multiple gateways are definitely needed. The open API and the internal API should be separated: they use different authentication methods and rate limiting policies and, more importantly, they need to be isolated so that they do not affect each other. The API of the management terminal and the API of the client should also be separated: the two have different users and permissions, and administrators of the management terminal are allowed to access and operate on more data. The gateway layer can therefore be divided into at least three gateways: Open API Gateway, Client API Gateway and Admin API Gateway. Their access relationship with each terminal is shown below:

Subdividing further, the WebSocket API and the HTTP API could also be split, but that need not be done for the time being.
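To show why separate gateways give isolation, here is a small Python sketch in which each gateway has its own authentication label and its own rate limiter. The fixed-window limiter, the limits and the `auth` labels (in particular the one for the admin gateway) are hypothetical values chosen for the example; only the three gateway names come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    name: str
    auth: str                 # label of the authentication scheme (assumed)
    limit_per_window: int     # hypothetical request limit per window
    window_seconds: float = 1.0
    _window_start: float = field(default=0.0, init=False, repr=False)
    _count: int = field(default=0, init=False, repr=False)

    def allow(self, now: float) -> bool:
        """Fixed-window rate limiter; 'now' is passed in to keep it testable."""
        if now - self._window_start >= self.window_seconds:
            self._window_start, self._count = now, 0
        if self._count >= self.limit_per_window:
            return False
        self._count += 1
        return True


GATEWAYS = {
    "open": Gateway("Open API Gateway", "api-key-signature", limit_per_window=2),
    "client": Gateway("Client API Gateway", "jwt", limit_per_window=5),
    "admin": Gateway("Admin API Gateway", "jwt-admin", limit_per_window=5),
}
```

Since every gateway keeps its own counter, a burst that exhausts the open API limit leaves the client and admin gateways untouched.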

So much for the gateway layer; next come the business logic layer and the data access layer. As far as I know, many microservice projects do not split the data access layer out into independent services, and neither did the projects I worked on before. No article or book I had read suggested this kind of split. I learned the idea from Sister Xuan, and it was from him that I found out that many projects at big companies are split this way.

Without this split, each service internally contains both a business logic layer and a data access layer. When service A needs to query the data of service B, it directly calls an interface provided by service B. The biggest drawback of this approach is that the data flow easily becomes disordered, because these are horizontal calls between services. As the number of services grows, the call relationships between them become more and more complex and turn into a tangled web. Unexpected results become likely, circular calls may occur, and locating problems becomes difficult.

After the split, when A needs to query B’s data, a service in A’s business logic layer calls a service in B’s data access layer. This is a vertical service call, and the data flow is clear.
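A minimal Python sketch of such a vertical call, with all class names and the "active user" rule invented for the example: the order domain's business logic service depends on data access services only, including the one that belongs to the user domain, and never on another business logic service.

```python
class UserDataService:
    """Data access layer of the user domain (in-memory stand-in)."""

    def __init__(self):
        self._rows = {1: {"id": 1, "status": "active"}}

    def get_user(self, user_id):
        return self._rows.get(user_id)


class OrderDataService:
    """Data access layer of the order domain (in-memory stand-in)."""

    def __init__(self):
        self._orders = []

    def insert_order(self, order):
        self._orders.append(order)
        return len(self._orders)  # toy order id


class OrderLogicService:
    """Business logic layer of the order domain."""

    def __init__(self, order_data, user_data):
        self.order_data = order_data
        self.user_data = user_data

    def place_order(self, user_id, symbol, qty):
        # Vertical call: order logic -> user data access layer.
        user = self.user_data.get_user(user_id)
        if user is None or user["status"] != "active":
            raise PermissionError("user cannot trade")
        return self.order_data.insert_order(
            {"user_id": user_id, "symbol": symbol, "qty": qty}
        )
```

Calls only ever go downward from logic to data, so circular calls between business services cannot arise from data lookups like this one.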

At this point, the microservice split of the whole system is basically complete, and the final overall architecture looks roughly as follows:

The last piece

In fact, one last piece of the microservice puzzle remains: the registry, which is a fundamental component of a microservice architecture.

The registry mainly solves the following problems:

  • How services are discovered after they register
  • How a service is deregistered in a timely manner
  • How requests are routed once a service is discovered
  • How a service is degraded when it behaves abnormally
  • How services scale horizontally in an effective way

Simply put, a registry is the place where services register themselves and where they are discovered. There are many registries to choose from today, including Zookeeper, Eureka, Consul, Etcd, CoreDNS, Nacos, and so on, and you can also develop your own. With so many options, which one should you choose? The essential question is whether the registry should follow the CP model or the AP model.

For service discovery, it is not catastrophic if different registry nodes hold different provider information for the same service. For service consumers, however, it can be disastrous for the system if calls fail because of an exception in the registry. Therefore, for service discovery, the registry should follow the AP model.

Now consider service registration. If a network partition occurs in the registry, under CP new nodes cannot register, so newly deployed service nodes cannot serve traffic. From a business perspective this is not what we want, because new service nodes should be made known to as many service consumers as possible. All new nodes should not be made ineffective just because the registry wants to guarantee data consistency. So AP also works better than CP for service registration.

In summary, a registry should choose the AP model and favor high availability over consistency. Among the registries listed above, that leaves only Eureka, Nacos or a self-developed one. In real projects, self-developed registries are relatively rare, and more and more projects now choose Nacos because it is the most feature-rich: Nacos is not just a registry but also a configuration center. So unless you are developing your own, you can simply choose Nacos.
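The availability-first behavior argued for above can be illustrated with a toy in-memory registry. This Python sketch is not how Eureka or Nacos are implemented; the `Registry` and `DiscoveryClient` classes, the heartbeat TTL and the use of `ConnectionError` to model an unreachable registry are all assumptions made for the example.

```python
class Registry:
    """Toy registry: instances stay listed while their heartbeat is fresh."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self._instances = {}  # (service, address) -> time of last heartbeat

    def register(self, service, address, now):
        self._instances[(service, address)] = now

    heartbeat = register  # a heartbeat simply refreshes the timestamp

    def deregister(self, service, address):
        self._instances.pop((service, address), None)

    def discover(self, service, now):
        return sorted(
            addr
            for (svc, addr), seen in self._instances.items()
            if svc == service and now - seen <= self.ttl
        )


class DiscoveryClient:
    """Consumer-side cache that keeps answering when the registry is down."""

    def __init__(self, registry):
        self.registry = registry
        self._cache = {}

    def lookup(self, service, now):
        try:
            self._cache[service] = self.registry.discover(service, now)
        except ConnectionError:
            pass  # registry unreachable: serve the last known (possibly stale) list
        return self._cache.get(service, [])
```

A stale address list may contain an instance that has gone away, but the consumer can still reach the remaining instances, which is the trade-off an AP registry makes.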

Conclusion

Putting microservices into practice is far from as simple as many online tutorials make it seem. Only by going through it yourself do you learn which practices actually work. Moreover, after the move to microservices there are many more complex problems to solve one by one, including service governance (service degradation, circuit breaking, load balancing, and so on), as well as service mesh and even serverless, all of which need to be implemented step by step.

