In recent years, thanks to a wide range of usage scenarios and convenient services, both the number of mobile payment users and the frequency of payments have continued to grow rapidly, and mobile payment has become an everyday habit. Convenience, however, brings its own hidden concerns. According to the 2017 Mobile Payment User Survey Report, lack of merchant support and security risks are the problems that worry mobile payment users most, followed by payment failures and other issues.

Tencent Cloud Payment is a mobile billing SaaS service launched jointly by Tencent Cloud and WeChat Pay, built on technical capabilities that TEG has accumulated over many years. It aims to give merchants a secure, stable, efficient, easy-to-use, and low-cost way to integrate WeChat Pay, and to help the mobile payment industry develop quickly and healthily.

– the introduction

First, what is cloud payment

1.1 Project Background

Problems faced by WeChat Pay:

Uneven ISV quality: The systems that ISVs develop for merchants to integrate WeChat Pay vary widely in quality. Their stability and security are often low, so ordinary users have a poor experience when paying with WeChat Pay, which reduces their confidence in it.

Problems faced by ordinary service providers:

High technical threshold: Most service providers lack the ability to develop a payment collection system that connects to WeChat Pay.

High system cost: The few high-quality systems on the market are expensive and hard for service providers to afford.

1.2 Project Positioning

Cloud payment aims to provide an end-to-end (from users to WeChat Pay and other third-party payment channels) commercial payment solution that is secure, stable, efficient, easy to use, and low in cost, improving the last mile between users and payment channels.

For WeChat Pay: improve the security and stability of the payment process, strengthen user confidence, and reduce user complaints.

For service providers: provide a secure, easy-to-use, low-cost, feature-rich commercial payment solution so that they can focus on promoting WeChat Pay.

For users: improve the user experience and increase users' willingness and confidence to use WeChat Pay.

1.3 Position of Cloud Payment in the Payment Chain

Second, cloud payment fund security

For a payment system, security means fund security, which can be divided into two levels: data security and consistency. The following three sections cover common payment scenarios and business models, data security safeguards, and consistency strategies, respectively.

2.1 Common payment scenarios and business models

ⅰ Card payment

ⅱ Official Account payment (one code for payment)

ⅲ Scan-code payment

2.2 Data Security

Data security challenges and corresponding solutions can be analyzed from three perspectives: data transmission, data storage, and data manipulation.

Data transmission

Eavesdropping: encrypted transmission (HTTPS)

Tampering: signatures (RSA2)

Man-in-the-middle attacks: certificates

Spoofed servers: signatures (RSA2) and certificates

Data storage

Database dump theft: encrypted storage

Tampering: signatures

Data loss: database master/slave replication, snapshots plus operation logs, and data redundancy

Data manipulation

Unauthorized access and unauthorized modification: permission control and data integrity checks

Logic errors: request limiting and traffic scrubbing

The above can be roughly summarized into four types of challenges: data leakage, data tampering, data loss, and illegal operations.
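
As an illustration of how a signature defends against tampering in transit, the following is a minimal sketch, not the scheme cloud payment actually uses. RSA2 (RSA with SHA-256) needs a third-party crypto library, so this sketch swaps in HMAC-SHA256 from the Python standard library; the canonical string format, the field names, and the key are all assumptions made for the example.

```python
import hashlib
import hmac


def canonical_string(params: dict) -> str:
    """Join non-empty parameters as key=value pairs sorted by key."""
    items = sorted(
        (k, str(v))
        for k, v in params.items()
        if k != "sign" and v not in ("", None)
    )
    return "&".join(f"{k}={v}" for k, v in items)


def sign(params: dict, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the canonical string."""
    message = canonical_string(params).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(params: dict, secret: bytes) -> bool:
    """Check the 'sign' field in constant time; any changed field fails."""
    expected = sign(params, secret)
    return hmac.compare_digest(expected, str(params.get("sign", "")))


SECRET = b"demo-secret"  # hypothetical key, for illustration only
request = {"out_trade_no": "T1001", "total_fee": 100, "nonce": "abc"}
request["sign"] = sign(request, SECRET)

# An attacker who changes the amount cannot produce a matching signature.
tampered = dict(request, total_fee=1)
```

Here `verify(request, SECRET)` accepts the original request and `verify(tampered, SECRET)` rejects the altered one. With an asymmetric scheme such as RSA2, the verifier would hold only the public key, which a shared-secret HMAC cannot offer.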

2.3 Consistency Challenges

Data consistency

Internal exceptions: The execution flow is interrupted.

Payment channel exceptions: The execution flow is interrupted and the order status is unknown.

Network exceptions: Payment channels are generally accessed over the public network, where network failures are common.

Message contention: The payment logic chain is long, and on a poor network the retry logic causes messages to compete with one another.

Out-of-order messages: Payment involves many logical flows, and an operation is generally split into multiple steps, so out-of-order messages are unavoidable. Even when CMQ is used for reliable delivery, messages can arrive out of order if multiple processes send at the same time.

Non-reentrant payment channel interfaces: Exception recovery logic becomes complex and error-prone, and recovery takes a long time.

Data consistency scheme selection:

Any discussion of data consistency has to start from the CAP theorem.

The cloud payment system has its own particular constraints:

1. The upstream/downstream relationship between the cloud payment system and the payment channels creates a natural partition, so P (partition tolerance) must be satisfied;

2. Payment systems have strict requirements for data consistency, so C (consistency) must be satisfied;

3. Cloud payment requires 99.99% availability, so A (availability) should also be satisfied as far as possible.

Strong consistency sacrifices availability, and weak consistency sacrifices consistency, so neither is acceptable. The only remaining choice is eventual consistency. According to BASE theory, eventual consistency is a compromise among the three aspects of CAP: provided the system is available and its data is consistent most of the time, an intermediate state (a small amount of inconsistent data for a short time) is tolerated, as long as that intermediate state is not visible to external systems (a consistent external view). At the same time, fast recovery should bring inconsistent data back to a consistent state as quickly as possible (eventual consistency).

How BASE theory is applied in the cloud payment system:

Serialization: Distributed locks (an extension of the Tencent MySQL (TXSQL) lock system) serialize external requests, solving the problem of out-of-order messages.

Ordering: The payment process follows a strictly ordered finite state machine (FSM), which can be used to put messages in order and so solve the out-of-order problem.

Transactionalization: Under the eventual consistency scheme, each complete step in the payment execution flow appears from outside the system as an independent, complete transaction.

Reentrancy: With a non-reentrant interface, the position and context of the current execution flow must be recorded, and fault recovery must resume from the exact point of failure, which makes the logic extremely complex. Because all cloud payment interfaces are designed and implemented to be reentrant, a failed flow can simply be re-executed from the beginning until it reaches its normal end, without recording where the fault occurred. This greatly simplifies fault recovery logic.

Statelessness: This serves a purpose similar to reentrancy. Building on the reentrant design, every flow is kept as stateless (or as light on state) as possible, so fault recovery does not need to record execution context, which further simplifies its design.
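
The serialization and ordering ideas can be sketched together as follows. This is a simplified illustration under stated assumptions: an in-process `threading.Lock` per order stands in for the distributed lock, and the state names and allowed transitions are hypothetical, not the actual cloud payment state machine.

```python
import threading
from collections import defaultdict

# Hypothetical order states and the transitions the state machine allows.
TRANSITIONS = {
    "CREATED": {"PAYING", "CLOSED"},
    "PAYING": {"SUCCESS", "FAILED", "CLOSED"},
    "SUCCESS": {"REFUNDED"},
    "FAILED": set(),
    "CLOSED": set(),
    "REFUNDED": set(),
}


class OrderStore:
    """In-memory order table with one lock per order number."""

    def __init__(self):
        self._state = {}
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, order_id):
        with self._guard:
            return self._locks[order_id]

    def create(self, order_id):
        with self._lock_for(order_id):
            self._state.setdefault(order_id, "CREATED")

    def apply(self, order_id, new_state):
        """Apply a state-change message; return True if the state advanced."""
        with self._lock_for(order_id):  # serialize messages for one order
            current = self._state[order_id]
            if new_state == current:
                return False  # duplicate message: safe to ignore
            if new_state not in TRANSITIONS[current]:
                return False  # stale or out-of-order message: rejected
            self._state[order_id] = new_state
            return True

    def state(self, order_id):
        return self._state[order_id]


store = OrderStore()
store.create("T1001")
store.apply("T1001", "PAYING")
store.apply("T1001", "SUCCESS")
# A delayed PAYING message arrives after the order has already succeeded.
late = store.apply("T1001", "PAYING")
```

The late message is rejected, so the order stays in `SUCCESS`. A message that arrives too early is rejected in the same way; a real system would presumably retry it or query the channel rather than drop it.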

With this design, the order failure rate of cloud payment has so far stayed below one order per million, and an intermediate state generally recovers in under 10 seconds.

Logical View Consistency

Diverse payment channels: Cloud payment currently connects to 8 payment channels, whose fields and request methods differ considerably.

Inconsistent logical views across interfaces: Take WeChat Pay as an example. Three of its interfaces (card payment, the payment-completion callback, and order query) each return what appears to be complete order information, but the fields they return are not the same (for example, some lack voucher-related information). Even the same interface (such as the card payment interface) returns different fields under different conditions. For example, settlement_total_fee is returned only if a recharge-free voucher is used, which affects settlement amount statistics.

Non-reentrant interfaces: Take card payment as an example. If an order that has already been paid is submitted again, the interface reports an error (order already paid). In special situations such as network delay or message loss, the first request may not return in time (or may not return normally), so the caller retries. What the caller then observes is that the first request did not appear to succeed, yet the retry is told the order has already been paid. This makes exception handling very difficult.

Varied exception cases: Examples include internal exceptions, bank failures, the user closing the payment keyboard, and insufficient balance. All of them amount to an unsuccessful payment, and in principle the order could still be paid. However, because the error types are so varied and WeChat Pay treats certain errors (such as closing the payment keyboard) as terminating the payment, the exception handling logic becomes very complex. In cloud payment, for example, if a user who closed the payment keyboard wants to continue paying, cloud payment has to reuse the original order data, change the order number, change the payment authorization code, and retry.

Solution:

Difference smoothing: Field completion, query compensation, and field fusion simplify and unify interface semantics, resolving the inconsistent logical views across interfaces.
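
A rough sketch of difference smoothing might look like the following. The unified field set, the `normalize_order` helper, and the rule that a missing settlement_total_fee falls back to total_fee are all assumptions made for illustration; the real mapping depends on each channel's interface definition.

```python
from typing import Optional

# Hypothetical set of fields every unified order view must carry.
REQUIRED = ("out_trade_no", "total_fee", "trade_state")


def normalize_order(raw: dict, queried: Optional[dict] = None) -> dict:
    """Build one unified order view from a channel response.

    raw     -- fields returned by the interface that was called
    queried -- optional result of a follow-up order query (query compensation)
    """
    # Field fusion: start from the query result, let the direct response win.
    merged = dict(queried or {})
    merged.update({k: v for k, v in raw.items() if v is not None})

    missing = [k for k in REQUIRED if k not in merged]
    if missing:
        raise ValueError(f"cannot build order view, missing: {missing}")

    order = {k: merged[k] for k in REQUIRED}
    # Field completion (assumption): when the channel omits
    # settlement_total_fee, treat the settlement amount as the full total_fee.
    order["settlement_total_fee"] = merged.get(
        "settlement_total_fee", order["total_fee"]
    )
    # Voucher details may be absent on some interfaces; default to zero.
    order["coupon_fee"] = merged.get("coupon_fee", 0)
    return order


# A response without voucher fields is completed with defaults.
plain = normalize_order(
    {"out_trade_no": "T1002", "total_fee": 50, "trade_state": "SUCCESS"}
)

# A sparse callback is fused with the result of an order query.
callback = {"out_trade_no": "T1001", "total_fee": 100}
query = {
    "out_trade_no": "T1001",
    "total_fee": 100,
    "trade_state": "SUCCESS",
    "settlement_total_fee": 90,
    "coupon_fee": 10,
}
fused = normalize_order(callback, query)
```

Downstream code such as settlement statistics can then read the same fields no matter which interface produced the order.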

Reentrant interfaces: Making internal interfaces reentrant is simple, but it has to be part of the system design from the start; retrofitting an established system is costly and risky. Making external interfaces reentrant requires some careful restructuring of the logic, but it is not an impossible task. Once interfaces are reentrant, the complexity of keeping logical views consistent during fault recovery is largely resolved.
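
To make the idea of a reentrant interface concrete, here is a minimal sketch in which a repeated call for the same order returns the stored result instead of an "already paid" error. The `PaymentService` class and `fake_channel` function are hypothetical stand-ins, and a real system would persist results rather than keep them in memory.

```python
class PaymentService:
    """Toy payment service whose pay() is safe to call repeatedly."""

    def __init__(self, channel):
        self._channel = channel  # callable(order_id, amount) -> transaction id
        self._results = {}       # order_id -> stored result

    def pay(self, order_id: str, amount: int) -> dict:
        done = self._results.get(order_id)
        if done is not None:
            if done["amount"] != amount:
                raise ValueError("order number reused with a different amount")
            return done  # retry of a finished order: return the same answer
        txn_id = self._channel(order_id, amount)
        result = {
            "order_id": order_id,
            "amount": amount,
            "status": "SUCCESS",
            "transaction_id": txn_id,
        }
        self._results[order_id] = result
        return result


calls = []  # records how often the (fake) channel is actually charged


def fake_channel(order_id, amount):
    calls.append(order_id)
    return f"txn-{len(calls)}"


service = PaymentService(fake_channel)
first = service.pay("T1001", 100)
retry = service.pay("T1001", 100)  # e.g. the caller never saw the first reply
```

The retry returns the same result as the first call and the channel is charged only once, so a recovery routine can re-run the whole flow from the start. This sketch does not cover the case where the channel call itself succeeded but its reply was lost; handling that would need a query against the channel.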

User View Consistency

User pays successfully but the merchant fails to collect: For example, the payment response or callback is lost, so the payment process is never completed on the merchant side.

User fails to pay but the merchant considers the payment successful: For example, a card payment has returned no success response for a long time, and querying the order status also shows it as unpaid. To prevent the user from paying by mistake, the merchant side proactively calls the order reversal interface (if the payment has in fact succeeded, the reversal triggers a refund; WeChat Pay refunds used to take a long time, sometimes several days). If the user's payment succeeds at the very moment the merchant reverses the order, the merchant sees the payment as unsuccessful while the user sees it as successful, and the refund triggered by the reversal has already been requested. The refund may then go through some time after the user has left, causing a loss for the merchant.

Solution:

Discard ambiguous interfaces: The cloud payment system never calls the order reversal interface internally, so no accidental refund can occur. WeChat Pay has also optimized the card payment interface: if a payment is not completed within one minute, the order is reversed automatically, and a successful order is never reversed. This completely eliminates unexpected refunds and unexpected payments (payment of an old order from several days earlier).

Fast fault recovery: Retries (to recover from simple failures such as network faults) and abnormal-order recovery (to recover from serious errors) generally shorten the recovery time of an abnormal order to about 10 seconds. Even when an exception occurs, the system recovers as quickly as possible, so users only need to wait a moment to get the correct result, which reduces disputes.
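
The two recovery tiers can be sketched as follows: a bounded retry with backoff for simple failures, and a hand-off to a recovery queue when retries are exhausted. The function names, retry counts, delays, and the in-memory queue are assumptions for illustration only, and the approach is safe only if the retried interface is reentrant.

```python
import time


def call_with_retry(action, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run action(); retry on ConnectionError with exponential backoff.

    Returns (True, result) on success, or (False, last_error) once the
    attempts are used up so the caller can hand the order to recovery.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return True, action()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    return False, last_error


recovery_queue = []  # hypothetical stand-in for the abnormal-order recovery job


def pay_or_queue(order_id, action, **retry_options):
    ok, outcome = call_with_retry(action, **retry_options)
    if not ok:
        recovery_queue.append(order_id)  # resolved later by the recovery job
        return "PENDING"
    return outcome


# Demo: a call that fails twice with network errors, then succeeds.
failures = iter([ConnectionError("timeout"), ConnectionError("reset")])


def flaky():
    err = next(failures, None)
    if err is not None:
        raise err
    return "SUCCESS"


def always_down():
    raise ConnectionError("unreachable")


delays = []  # capture requested delays instead of really sleeping
status = pay_or_queue("T1001", flaky, sleep=delays.append)
stuck = pay_or_queue("T1002", always_down, sleep=delays.append)
```

The first order recovers through retries alone, while the second ends up in `recovery_queue` with a `PENDING` status until the recovery job settles it.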

Third, summary

Through the series of measures above, and on the foundation of data security, cloud payment can largely provide merchants and service providers with a simple, easy-to-use commercial payment solution whose data view, logical view, and user view are all consistent. This lowers the threshold for merchants and service providers to use WeChat Pay, reduces the error rate, strengthens user confidence, and protects the funds of both users and merchants.