Statement: This article was published in the 12th issue of Financial Electronic Magazine in 2016. It is the author’s original work. Unauthorized reprinting by individuals, media, public accounts or websites is declined.

As new infrastructure, mobile Internet, big data and cloud computing have given birth to a new Internet economy and are promoting the upgrading of all industries. In the past decade or so, financial services have developed rapidly. Mobile payment has supported the online and offline transformation of the retail industry. Credit services based on big data have supported the entrepreneurship and innovation of numerous small and micro enterprises. The new finance, driven by data and technology and based on a new credit system, has become the cornerstone of the new economy.

With the exploration of Ant Financial in the new financial field, the technical team of Ant Financial is also constantly expanding in the field of financial technology and architecture. From one transaction per second in 2005 to 85,900 transactions per second in 2015, from single payment to cover micro loans, financial management, insurance, credit, banking, etc., through more than ten years of exploration and practice, We have formed a complete set of architecture and technology system including financial level distributed transaction, distributed big data analysis and decision making, etc.

In this article, we will exchange with you the practice and experience related to financial level distributed transaction.

Key objectives of the financial level system

If building a system is like building a building, there are four pillars to build a conventional system: high availability, safety, performance, and cost. But to build a financial edifice in the mobile Internet era, in addition to the above four pillars need to be more solid, two more pillars need to be added: capital security and data quality. These six pillars are our primary goal when we build every system in Ant Financial.

Specifically, we have the following key objectives for a financial level system:

High availability: More than 99.99% high availability. The system can tolerate the failure of various hardware and software facilities, upgrade without interruption of service, guarantee the promised quality of service in harsh application scenarios, and tolerate various human errors. For critical systems, the remote Dr Capability is required.

Security: Capable of detecting, sensing, and defending against various security attacks at multiple levels. The system is capable of analyzing system behaviors and data flows in real time to detect anomalies, and can quickly mobilize resources to block large-scale and organized attacks if necessary.

Performance: For real-time transaction services, extremely fast response time and extremely high concurrency capabilities are required. For batch services, great throughput is required. In particular, the system must have strong scalability and flexibility, and can quickly mobilize resources to deal with unexpected traffic when needed.

Cost: In the context of high availability, security, and performance, cost is an important constraint. We take the average processing cost of a single transaction (total monthly transactions/monthly cost) and the processing cost of peak transactions (additional cost per 1000 transaction TPS increase) as two key indicators to continue optimization. In addition to the extreme optimization of basic hardware, software and system key links, flexible resource scheduling and on-demand scalability are the key to cost optimization.

Security of capital: This is a key difference between the financial system and the conventional system. In order to make absolutely no mistakes in capital processing, it is necessary to have strong consistency between transaction and data, not to lose good data in any failure scenario, to have quasi-real-time transaction capital checking ability, and to have fine circuit breaker and quick recovery ability in abnormal scenarios.

Data quality: Data quality is the foundation of financial service quality. Data collection, generation, transfer, storage, calculation and use need to go through many links. To ensure that data is still accurate, complete and timely after so many links, the system needs to have the ability of data quality control and governance of the whole link.

Can the financial transaction system be distributed? How to achieve the above 6 key goals based on distributed ideas and technologies? Next, we will share our views on this issue based on ant Financial’s practice.

Distributed financial transaction architecture and technology

Strong consistent microservices: microtransaction architecture

Microservices is a widely used distributed architecture. By breaking the system down into single-responsibility, highly cohesive, loosely coupled, independently deployed, autonomously operated “micro” services, the flexibility and scalability of the system can be greatly enhanced. However, as each microservice is a self-contained data and computing unit, when a transaction with strict consistency requirements is executed on many nodes, how to ensure the strong consistency of data and service processing at financial level becomes a difficult problem. Although database or data middleware supporting distributed transactions can be used to ensure the consistency of data distribution, it cannot solve the consistency problem when services are distributed. Because distributed transactions lock resources for a long time and have large granularity, it also restricts the scalability and high availability of the system.

To solve this problem, we propose a microtransaction architecture that enables microservices to have strong consistency. In this architecture, microservices involved in transaction operations have transactional properties. A microtransaction provides three try-confirm-cancel (TCC) operations. The Try operation checks services and reserves resources, the Confirm operation performs actual operations, and the Cancel operation releases reserved resources. A complete transaction consists of a series of micro-transaction Try operations. If all the Try operations are successful, the micro-transaction framework will unify Confirm and Cancel at last, thus realizing strong consistency similar to classical two-stage commit protocol (2PC). But unlike 2PC, the microtransaction architecture strives to be efficient and scalable. The three TCC operations are all short transactions based on local transactions. The Try operation only reserves necessary business resources. For example, if a transaction involves 10 yuan, only 10 yuan is reserved in the account instead of locking the entire account.

Since its launch at the beginning of 2008, the microtransaction architecture has been applied to various financial business scenarios of Ant Financial and has gone through the tests of various promotion peaks, proving the feasibility of this architecture and technology.

2 Financial level distributed database: OceanBase

At present, major commercial databases are stand-alone systems in nature, whose capacity, performance and reliability all depend on the combination of a single or a small number of high-performance servers and highly reliable storage, which is costly and difficult to expand. Although the problem of horizontal scalability can be solved by using a microtransaction architecture to split the pressure on data operations across multiple databases, the performance, cost, and reliability of the database itself remains a challenge. Therefore, Alibaba and Ant Financial started to develop OceanBase, a special financial distributed database, from 2010.

OceanBase makes a breakthrough in the traditional database architecture in the following aspects:

High performance: A notable feature of a database is its large total volume, but the daily changes (additions, deletions, changes) are only a small fraction of the total volume. So OceanBase divides the data into baseline data and change increments. The baseline data is a snapshot of the database at a certain point in time and is stored in the disks of each OceanBase server. The change increment is the data added, deleted, and modified after the snapshot point. It is relatively small and usually stored in the memory of each OceanBase server. In this way, the add, delete and modify operations are basically carried out in memory, so that the transaction processing performance is close to the memory database.

Strong consistency: it is difficult to combine high availability and strong consistency with the classic master + standby database. To address this issue, OceanBase uses a multiple data copy (>=3) voting protocol. For each write transaction, OceanBase does not answer the customer until the redo log reaches more than half of the servers. In this way, when a few servers (such as 1 in 3 or 2 in 5) are abnormal, at least one of the remaining servers has transaction logs, which ensures that the database will not lose data due to the failure of a few servers.

High availability: The database for key services must be 99.999% available. The database cannot be unavailable due to server faults, equipment room faults, or network faults. OceanBase is usually composed of a cluster of machines distributed in multiple machine rooms (3 or more). Each cluster has complete data. One cluster serves as the primary database to provide read and write services externally, while the other clusters serve as the standby database to receive transaction logs and playback logs from the primary database. When the primary database fails, the remaining clusters will immediately automatically vote to elect a new primary database. The new primary database will obtain the latest transaction logs that may exist from other clusters and play them back, and then provide external services after completion.

At present, OceanBase has stably supported the core transaction, payment and accounting of Alipay and the core system of Online Merchant Bank. It has experienced the tests of “Double 11” for many times and formed a cross-room and cross-regional high-availability architecture, and played an important role in daily operation, emergency drill and disaster recovery switchover.

3 Remote Live and Dr: Unitary architecture

Geo-redundant deployment is widely used in the financial system for cross-data center expansion and cross-region DISASTER recovery (Dr) deployment. However, it has some problems: In terms of scalability, the cross-region backup center does not carry core services, which cannot solve the problem of cross-region expansion of core services. In terms of cost, the Dr System is used only for Dr, which has low resource utilization and high cost. In terms of Dr Capability, the Dr System has low availability and high switchover risk due to cold standby waiting.

Therefore, Ant Financial does not choose the deployment mode of “two places and three centers”, but implements the remote multi-activity and disaster recovery mode. The basis of the remote multi-live and Dr Architecture is system unitary. Each unit can be thought of as a scaled-down, full-featured system that includes everything from access gateways to application services to data storage. Each cell is responsible for a certain percentage of data and user access. The unit has the following key features:

Self-inclusion: For example, all calculations and data involved in a user’s account top-up transaction are completed in one unit;

Loose coupling: Only service calls can be made across cells, not direct access to databases or other storage. For some transactions that must be processed across cells, such as transfer transactions between users belonging to two different cells, the number of service invocations across cells should be as small as possible and be processed asynchronously as business and user experience allow. In this way, cross-cell access delay can be tolerated even if two cells are thousands of kilometers apart.

Failure independence: a failure in one unit does not spread to other units;

Disaster recovery: Units back up each other to ensure that each unit has units in the same city and in remote locations that can take over during a failure. In terms of data backup between units, we mainly use the multi-place multi-center strong consistency scheme provided by OceanBase.

Using the unitary architecture, a large-scale system can be divided into many independent small-scale systems. Each unit system can be deployed to any data center in any region, thus realizing the flexible deployment mode of remote MULTI-data center. The primary scaling mode of the system becomes the increase and decrease of cells, but the internal size and complexity of a cell remain unchanged, reducing the complexity of the system. Fault isolation between units reduces the impact of hardware and software failures. The “live” unit and quick inter-unit switchover make the same-city and remote Dr Processing easier and more efficient.

At present, ant Financial’s core system has been distributed in multiple data centers in Shanghai, Shenzhen, Hangzhou and other cities. The core transaction flow is distributed in each data center, and can be scheduled and switched. Through remote multi-activity, the system can be arbitrarily expanded in the whole country, the server resources are fully utilized, and the system’s ability to deal with regional disasters is improved.

4 On-demand scaling: Elastic hybrid cloud architecture

Every year, the Alipay system has to deal with the extremely high transaction volume of events such as Singles’ Day and Chinese New Year red envelopes. While the unitized architecture allows us to cope with peaks, to reduce peak resource investment, the system also needs to be able to scale on demand.

Our solution to this problem is to quickly apply for resources on the cloud computing platform, build new units, and deploy applications and databases before the event. Then, traffic and data are “ejected” to a new unit to rapidly increase system capacity. When the activity is over, the traffic and data are “bounced back” to release resources on the cloud computing platform. In this way, resource procurement and operation costs can be greatly reduced.

Elastic operation requires coordinated operation among traffic, data, and resources. Elastic operation of stateful data is the most difficult task, and requires that services are not interrupted and data consistency is ensured. These operations would be very complex and inefficient if performed manually by o&M personnel, requiring the support of architecture, middleware and control system.

There are some key points in practice for resilient hybrid cloud architectures and technologies:

1. Flexibly apply for and allocate computing, storage, and network resources, create units, and rapidly deploy databases, middleware, and applications through unified resource scheduling;

2. The middleware decouples the application from the infrastructure, and the application system does not need to change no matter how traffic, data and resources are distributed;

3. Ensure that all requests, services, data and messages have globally unique ids and consistent ID coding rules through distributed architecture and data specifications as well as middleware support. According to ID, service request and data access can be correctly routed from access gateway, service middleware, message middleware, data middleware, etc.

4. Through the unified control platform, the flexible operation of the high-level is translated into the deployment and configuration instructions of each component, and the unified scheduling and execution makes the operation coordinated, accurate and efficient.

Based on the elastic hybrid cloud architecture, 10% of alipay’s payment flow ran on Aliyun computing platform on November 11, 2015. On November 11, 2016, we plan to run 50% of the peak payment flow on Ali cloud computing platform, which will bring great cost optimization.

Future prospects and expectations

The practice of Ant Financial proves that with the support of finance-level middleware, database and cloud computing platform, distributed architecture can be fully competent for complex and demanding finance-level transactions, and provides a technical architecture and implementation route for reference.

In the future, Ant Financial will continue to explore and pioneer in financial distributed architecture and technology. In this area, we present ourselves with two major new propositions:

1. How to handle 100 million transactions per second: In the Internet of Everything era, the ubiquity of trading terminals and countless new trading scenarios will continue to drive exponential growth in financial volumes. What kind of architecture and technology can handle the massive transaction in the era of the Internet of everything needs to be prepared for breakthroughs and breakthroughs.

2. Turn the finance-level distributed architecture and technology into “inclusive” cloud computing services, serving millions of financial service institutions. To achieve this goal, Ant Financial and Ali Cloud jointly put forward the “Mafengyun Plan” to jointly build a new generation of financial cloud platform to serve 50,000 financial institutions around the world in the future and create globalized inclusive finance.

China has been leading the world in financial service innovation. New finance needs new technology as chassis and engine, which is the challenge and opportunity of China’s financial technology. Ant Financial is looking forward to working with experts in China’s financial industry to jointly create new financial technologies in the future and make more Chinese contributions to world technology.

Financial Class Distributed Architecture (Antfin_SOFA)