Author | Wu Zhenhe
This article uses Boshi Fund's financial scenarios as an example to illustrate RocketMQ's role in improving the efficiency of customer companionship and in broadening the range of financial scenarios that can be supported.
Industry background
The core business of a fund company falls into two parts. The first is the investment and research line, that is, investment management and industry research, which reflects the company's core competitiveness. The second is the market line, in which the company uses its own channels and market capabilities to sell funds and serve customers.
Boshi Fund Management is one of the first five fund management companies established in Mainland China. As of June 30, 2021, it had 276 public funds under management, with total assets under management of more than 1,548.2 billion yuan and cumulative dividends of more than 146.5 billion yuan.
With the development of Internet technology, fund sales channels have become more diversified, and online has become an important sales channel. Compared with traditional fund customers, customers from online channels form a large base with widely varying levels of experience. For customers who are not yet experienced investors, we need to provide good companionship so that they understand both risk and investing.
Application of RocketMQ in the companionship system
1. Overview of companionship scenarios
Boshi Fund has established a comprehensive, multi-level companionship system that gives users a warm companionship experience before, during and after investment, at the user level, the market level and the product level.
Implementing each companionship scenario requires cooperation among teams from multiple departments of the company, and depends on upstream and downstream systems such as investment research, compliance, operations and big data. These systems may use different technical architectures and implementations. If they collaborated through synchronous calls, the coupling would be too tight, which would hinder future expansion.
2. RocketMQ decouples heterogeneous systems
RocketMQ provides efficient, reliable messaging and a publish-subscribe mechanism that is well suited to decoupling heterogeneous upstream and downstream systems. We moved collaboration that used to rely on files and emails online and turned it into standardized, formal processes, which greatly improved the efficiency of producing companionship content. Collaboration that involves many systems requires messages to be categorized properly for filtering and indexing. RocketMQ provides Topics and Tags for exactly this purpose.
3. Topic and Tag best practices
Topics and Tags are identifiers used for business classification: a Topic is the first-level classification and a Tag is the second-level classification. Such hierarchical identifiers resemble an enterprise organizational structure and can be combined to filter messages. Take the Topic of the companionship system as an example. The operations system subscribes to operations messages, which we label TagA; the customer service system subscribes to customer service messages, labeled TagB; and the companionship arrangement system subscribes to arrangement messages, labeled TagC. The compliance system must review operations and arrangement messages, so it subscribes to both TagA and TagC. Finally, the data center processes every message, so it listens to all Tags.
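The subscription layout described above can be sketched as a small in-memory model. RocketMQ tag subscriptions are written as expressions such as `TagA || TagC`, with `*` matching every tag; everything else here (the system names, the `matches` and `receivers` helpers) is a hypothetical illustration and not the RocketMQ client API.

```python
# Illustrative in-memory model of Tag filtering within one Topic.
# This is NOT the RocketMQ client API; system names are assumptions.

SUBSCRIPTIONS = {
    "operations": "TagA",
    "customer_service": "TagB",
    "arrangement": "TagC",
    "compliance": "TagA || TagC",   # reviews operations and arrangement messages
    "data_center": "*",             # processes every message
}


def matches(expression: str, tag: str) -> bool:
    """Return True if a message tag satisfies a subscription expression."""
    if expression.strip() == "*":
        return True
    wanted = {part.strip() for part in expression.split("||")}
    return tag in wanted


def receivers(tag: str) -> list:
    """List the systems that would receive a message carrying this tag."""
    return sorted(name for name, expr in SUBSCRIPTIONS.items() if matches(expr, tag))


if __name__ == "__main__":
    for tag in ("TagA", "TagB", "TagC"):
        print(tag, "->", receivers(tag))
```

Keeping one Topic per business domain and distinguishing message kinds by Tag means that adding a new subscriber only requires a new expression, not a new Topic.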
Financial application scenarios for RocketMQ transaction messages
1. Overview of financial scenarios
Next, let's look at a typical financial scenario: preferential purchase. The Boshi Fund app states that fund purchases can enjoy discounted rates as low as 0. How is this implemented? There are two ways. The first is to top up the Boshi wallet, which at the bottom layer buys a money market fund for the customer, and then to use the Boshi wallet to buy the target fund. This requires the user to perform two operations, which is tedious and tends to lose customer orders. The other approach is preferential purchase, which wraps the two-step fund purchase into a single transaction. For investors who have enabled the preferential purchase service, there is one step fewer and investing is simpler.
2. Theoretical model of domain events
A domain event is a step in a business process that leads to further business operations, such as a login event or a fund purchase event. In the domain model, transactions around domain events adopt eventual consistency, a form of weak consistency, as opposed to strong consistency. When domain models are mapped onto a microservice architecture, data across microservices need not be strongly consistent, so domain events can decouple microservices. Depending on whether an event crosses microservices, there are two scenarios:
The first scenario: domain events occur within the same microservice. Because most such events occur within the same process, the transaction itself is easy to control. However, if an event needs to update multiple aggregates at the same time, then to follow the DDD principle of updating only one aggregate at a time you need to introduce an event bus, that is, the event bus pattern.
The second scenario: domain events cross microservices. Domain events occur more often between microservices, and the event-handling mechanism is more complex. Cross-microservice events can drive business processes or data flows across different subdomains or microservices, and therefore need a coordinator to advance the global transaction. A cross-microservice event mechanism has to consider event construction, publishing and subscription, event data persistence, message middleware, the distributed transaction mechanism, and so on. Among these, message middleware with transaction message support is the core component of the solution.
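As a rough illustration of the event bus pattern inside a single process, the sketch below lets one domain event update two aggregates, each through its own handler. The `EventBus`, `Account` and `Holding` classes and the `FundPurchased` event name are hypothetical and chosen only for this example.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process event bus: handlers run synchronously in registration order."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in list(self._handlers[event_type]):
            handler(payload)


class Account:
    """Hypothetical aggregate holding the cash balance."""

    def __init__(self):
        self.balance = 100

    def on_purchase(self, event):
        self.balance -= event["amount"]


class Holding:
    """Hypothetical aggregate holding the purchased amount."""

    def __init__(self):
        self.amount = 0

    def on_purchase(self, event):
        self.amount += event["amount"]


bus = EventBus()
account, holding = Account(), Holding()
bus.subscribe("FundPurchased", account.on_purchase)
bus.subscribe("FundPurchased", holding.on_purchase)

# One event, two aggregates, each updated by its own handler.
bus.publish("FundPurchased", {"amount": 30})
```

Each handler touches only its own aggregate, so the publisher does not need to know which aggregates react to the event.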
3. Comparison of distributed transaction schemes
In Boshi Fund's business scenario, the problem to solve is the tension between transaction consistency and service decoupling. Our goal is to decouple the main transaction from the slave transaction so that the core logic stays stable, without sacrificing eventual consistency for the sake of decoupling. Several solutions were considered:
- The first option: ordinary messages plus asynchronous reconciliation, the most common approach. The problem is that execution of the main transaction and successful enqueueing cannot be guaranteed together, so reconciliation and compensation with poor timeliness are needed to repair inconsistencies, although the consistency is high.
- The second option: a local message table. Unlike the first option, the business writes to the message table inside the main transaction, which makes the main transaction and the enqueue a single atomic operation; the business then reads the enqueue records and delivers them to the slave transaction. The drawback is that the main transaction and the message table are coupled at the storage level, so the two are not decoupled.
- The third option: XA transactions, a two-phase commit protocol, which is harder to implement. It also has two problems. First, it is a synchronous blocking protocol, so concurrency is limited because locks are held. Second, if the coordinator fails after a participant has voted, that participant does not know whether to commit or abort and can only wait for the coordinator to recover, during which service may be interrupted.
- The fourth option: TCC, which is dedicated to distributed transactions. It focuses only on consistency and offers no decoupling, so it is not feasible either.
- The fifth option: transaction messages, which balance decoupling and consistency and are the best fit.
Ultimately we chose RocketMQ transaction messages as the solution for distributed transactions.
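For contrast, the local message table option listed above can be sketched with an in-memory SQLite database. This is a minimal illustration under assumed table names (`orders`, `outbox`) and helper names (`place_order`, `relay`); it shows why the main transaction and the message table end up sharing the same storage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount INTEGER)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT, sent INTEGER DEFAULT 0)"
)


def place_order(order_id, amount):
    """Main transaction: the business row and the message row commit or roll back together."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, amount))
        conn.execute("INSERT INTO outbox (body) VALUES (?)", ("purchase:" + order_id,))


def relay(send):
    """Read unsent rows and hand them to the slave transaction through `send`."""
    rows = conn.execute(
        "SELECT id, body FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, body in rows:
        send(body)
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because `orders` and `outbox` must live in the same database to share one transaction, the business storage and the messaging storage cannot evolve independently, which is the coupling that transaction messages avoid.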
4. RocketMQ transaction message core process
We built a transaction center on RocketMQ transaction messages to coordinate the progress and rollback of distributed transactions. Taking preferential purchase as an example, the core process is as follows:
- Phase one: prepare. The business system sends a RocketMQ half message to the transaction center. The transaction center does not deliver the message and waits for the second confirmation. At this stage the half message is invisible to consumers.
- Phase two: the business system executes the main transaction, that is, the purchase of the money market fund.
- Phase three: after the main transaction succeeds, the business system sends COMMIT to the transaction center, which then delivers the message to the slave transaction. If the main transaction fails, it sends ROLLBACK to the transaction center. A two-phase commit is needed here because an ordinary enqueue cannot guarantee eventual consistency whether it is placed before or after the main transaction. If the main transaction runs before the enqueue, the service may crash before the message is enqueued, and there is no further chance to enqueue it. If the message is enqueued before the main transaction runs, the main transaction may fail while the slave transaction succeeds, which corrupts the business logic.
The second confirmation of a transaction message may be lost because of network jitter. Some mechanism is then needed to restore the context of the whole distributed transaction, and RocketMQ provides a check-back mechanism designed to handle such timeouts in distributed transactions. The check-back mechanism of our transaction center first checks the internal state of the transaction center and then queries the execution result of the local transaction through the check-back interface. Once the transaction context is restored, the subsequent process continues as normal.
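The three phases above can be sketched with a toy in-memory coordinator. This is only a model of the flow, not the RocketMQ client API or the actual transaction center; the `TransactionCenter` class, the `preferential_purchase` function and the `buy_money_fund` callback are hypothetical names.

```python
import uuid


class TransactionCenter:
    """Toy coordinator: holds half messages until the sender confirms the outcome."""

    def __init__(self, deliver):
        self._deliver = deliver  # callback standing in for the slave-transaction consumer
        self._half = {}          # txn_id -> message body, invisible to consumers

    def prepare(self, body):
        txn_id = uuid.uuid4().hex
        self._half[txn_id] = body
        return txn_id

    def commit(self, txn_id):
        # Second confirmation: release the half message to the consumer.
        self._deliver(self._half.pop(txn_id))

    def rollback(self, txn_id):
        # Second confirmation: discard the half message.
        self._half.pop(txn_id, None)


def preferential_purchase(center, buy_money_fund, body):
    txn_id = center.prepare(body)   # phase one: send the half message
    try:
        buy_money_fund()            # phase two: main transaction
    except Exception:
        center.rollback(txn_id)     # phase three, failure path
        return False
    center.commit(txn_id)           # phase three, success path
    return True
```

The sketch deliberately ignores the case where the COMMIT or ROLLBACK call itself is lost; a real deployment needs a recovery path for half messages that never receive a second confirmation.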
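A minimal sketch of the check-back order described above, assuming the pending half messages and the transaction center's own records are plain dictionaries and the business check-back interface is a callback. All names here (`State`, `check_back`, `query_business`) are hypothetical.

```python
from enum import Enum


class State(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"
    UNKNOWN = "unknown"


def check_back(half_messages, center_state, query_business, deliver):
    """Resolve half messages whose second confirmation never arrived.

    half_messages:  dict txn_id -> message body still waiting for confirmation
    center_state:   dict txn_id -> State recorded inside the transaction center
    query_business: callback asking the business system for the local result
    deliver:        callback that hands a committed message to the consumer
    """
    for txn_id in list(half_messages):
        # Check the transaction center's own state first.
        state = center_state.get(txn_id, State.UNKNOWN)
        if state is State.UNKNOWN:
            # Fall back to the business system's check-back interface.
            state = query_business(txn_id)
        if state is State.COMMIT:
            deliver(half_messages.pop(txn_id))
        elif state is State.ROLLBACK:
            half_messages.pop(txn_id)
        # Still UNKNOWN: keep the half message for the next check-back round.
```

Leaving unresolved messages in place lets a later round retry them, so a temporarily unreachable business system does not force a wrong decision.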
5. How RocketMQ ensures that transaction messages are properly consumed on the consumer side
After a consumer fails to consume a message, the MQ server retries delivery a certain number of times, so we need to set a reasonable retry policy. Because consumption is retried, the consumer interface must be idempotent. If a message still fails after multiple attempts, it is moved to the dead-letter queue (DLQ). RocketMQ provides a dead-letter queue function, and an alarm is generated for messages that enter the dead-letter queue.
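The retry and dead-letter behaviour can be modelled in a few lines. This sketch is an assumption-laden simplification: the `consume_with_retry` function, its parameters and the `alarm` callback are hypothetical, and the delay between retries that a real broker applies is omitted.

```python
def consume_with_retry(message, handler, max_attempts, dead_letter_queue, alarm):
    """Deliver a message up to max_attempts times; move it to the DLQ if every attempt fails.

    Returns the attempt number that succeeded, or None if the message was dead-lettered.
    The handler must be idempotent, because it may see the same message more than once.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return attempt
        except Exception:
            continue  # a real broker would wait before redelivering
    dead_letter_queue.append(message)
    alarm(message)  # notify operators that manual handling is needed
    return None
```

A message that fails transiently is absorbed by the retries, while a message that can never succeed ends up in the DLQ with an alarm instead of blocking the queue.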
6. Application scenarios of transaction messages
The first scenario: domain events that must be executed synchronously. For example, the business logic of the domain event has a high probability of failure and the service needs to return a result code to the client promptly, so it cannot be placed in an asynchronous process. Anyone who has worked on a payment system knows that the balance must be checked before a deduction is made; if the balance is insufficient, the deduction will fail no matter how many times it is retried asynchronously.
The second scenario: non-reentrant transactions. If a business system does not fix a unique transaction ID when it sends a message, the idempotency of the subsequent business logic cannot be guaranteed. Suppose one of the transactions creates an order. A transaction message is then used to mark the start of the distributed transaction and to generate a unique transaction ID, so that subsequent steps can use this ID to guarantee idempotency.
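To illustrate the second scenario, the sketch below allocates a transaction ID at the start of the distributed transaction and uses it to make order creation idempotent. The `OrderService` class, the `begin_transaction` function and the fund code `F001` are hypothetical.

```python
import uuid


def begin_transaction():
    """Allocate the unique ID when the transaction starts, before any business step runs."""
    return uuid.uuid4().hex


class OrderService:
    """Hypothetical downstream service that creates at most one order per transaction ID."""

    def __init__(self):
        self.orders = {}  # txn_id -> order

    def create_order(self, txn_id, fund_code, amount):
        # A redelivery with the same txn_id returns the existing order
        # instead of creating a second one.
        if txn_id not in self.orders:
            self.orders[txn_id] = {"fund": fund_code, "amount": amount}
        return self.orders[txn_id]


service = OrderService()
txn_id = begin_transaction()
first = service.create_order(txn_id, "F001", 100)
retry = service.create_order(txn_id, "F001", 100)  # simulated redelivery
```

Without an ID fixed up front, the two calls would be indistinguishable from two separate purchases.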
Future plans
Currently, we use RocketMQ to decouple the upstream and downstream services of our customer companionship system, which has improved the efficiency of operations and companionship. We have also built a service coordination platform that supports distributed transactions on top of RocketMQ transaction messages, namely our transaction center, which has greatly enhanced our ability to package financial scenarios as products. In the future, we will extend the transaction center to more financial application scenarios and create greater business value around it.