Over the past 15 years, Ant Financial has reshaped payments and changed lives, providing services to more than 1.2 billion people around the world. All of this is supported by technology. At the 2019 Hangzhou Computing Conference, Ant Financial shared with attendees the technical experience it has accumulated over those 15 years, as well as its future-oriented innovations in financial technology. We have compiled the best presentations into articles and are publishing them on the public account “Financial Distributed Architecture”. This article is one of them.

As cloud native technology develops, the financial industry wants to adopt it, but security is a very big obstacle, and the cloud native community pays far too little attention to security. When Ant Financial adopted cloud native, solving the security problems was the top priority. After exploration and practice, we distilled a full-link, financial-grade cloud native security architecture that runs from the underlying hardware to the software and from the system layer to the application layer.

The financial industry is all about trust, and we believe that the trust that comes from security is the intangible product that underpins all financial businesses.

As the Internet era has developed, the financial industry and its institutions have changed in many ways: more access channels such as apps and mini programs, faster business changes, and more third-party suppliers. However the financial industry changes, one thing stays the same: zero faults and zero tolerance for error, which places very high demands on stability and security.

Here, I also want to clear up a misconception about the financial industry. People say that financial institutions carry many legacy systems and that much of their technology is more than ten years old, so they are considered technologically backward. In fact, the financial industry has always been very high-tech. A movie released a while ago, The Hummingbird Project, is based on a true story about a group of high-frequency traders. To reduce the transmission time from Kansas to the New York Stock Exchange, they build a fiber optic cable running thousands of miles from Kansas to New York, fighting for that last millisecond. So the financial industry does not only have mediocre and conservative technology. It also pursues the most cutting-edge and advanced technology. Our mission is to further arm the financial industry with technology and inject more vitality into financial technology.

Cloud native architecture represents a new kind of productivity. The financial industry certainly needs cloud native, because it brings cost savings and agile development, but a qualifier has to be added in front of it: secure cloud native architecture. This means not just the relatively simple security schemes of the past, but a trusted, end-to-end, full-link security solution. It includes clear code ownership, trusted boot, a closed loop for image building and release, and an account system with clear application ownership and access rights. It also includes a fine-grained isolation solution that can be deployed securely, that integrates security policies and their enforcement into the infrastructure, and that is transparent to software development and testing.

Here we focus on several cloud native security technologies that Ant Financial is putting into practice: Service Mesh for cloud native network security, secure containers, and confidential computing.

Cloud native network security: SOFAMesh

At present, the second most important technology in cloud native, after containers, is Service Mesh. In Ant Financial's practice, it has proved very helpful for financial security. It can do at least three things:

  • Policy-based, efficient traffic control, which helps O&M adapt to rapid business changes;
  • Full-link encryption, which protects data security end to end;
  • Traffic interception and analysis: when abnormal traffic or abnormal containers are detected, the traffic is blocked.

In addition, this work is transparent to the business and adds no burden to business development. We can also perform real-time semantic analysis of traffic and more, which goes beyond what traditional firewalls can do.
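The third capability above, blocking traffic once an abnormal workload is detected, can be pictured with a small sketch. This is not SOFAMesh's API: the `SidecarPolicy` class, its method names, and the per-source request budget are all hypothetical, chosen only to illustrate how a sidecar could make the decision without any change to business code.

```python
class SidecarPolicy:
    """Hypothetical sidecar policy; names are illustrative, not SOFAMesh APIs."""

    def __init__(self, max_requests=100):
        self.max_requests = max_requests  # assumed per-source request budget
        self.quarantined = set()          # workloads flagged as abnormal
        self.seen = {}                    # requests counted per source

    def quarantine(self, workload):
        """Mark a workload as abnormal so that its traffic is blocked."""
        self.quarantined.add(workload)

    def allow(self, source):
        """Return True if a request from `source` may pass through the sidecar."""
        if source in self.quarantined:
            return False
        self.seen[source] = self.seen.get(source, 0) + 1
        return self.seen[source] <= self.max_requests


policy = SidecarPolicy(max_requests=2)
# The third request from the same source exceeds the assumed budget.
print([policy.allow("orders") for _ in range(3)])
# A workload flagged as abnormal is blocked outright.
policy.quarantine("payments-canary")
print(policy.allow("payments-canary"))
```

Because the check lives in the sidecar rather than in the application, a real mesh could update such policies centrally. How SOFAMesh actually distributes and enforces them is not covered here.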

While exploring Service Mesh, Ant Financial launched its own SOFAMesh, written in Golang, and has open sourced it. We hope to work with the community to popularize the concepts and technology of Service Mesh.

SOFAMesh is a solution for adopting Service Mesh at large scale, based on Istio. It inherits Istio's powerful functions and rich features, and improves on them to meet the performance requirements of large-scale deployment and the realities of putting a mesh into production. The improvements include replacing Envoy with SOFAMosn, written in Golang, which greatly reduces the difficulty of developing the mesh itself. Some innovative work was also done, such as merging Mixer into the data plane to remove performance bottlenecks, enhancing Pilot for a more flexible service discovery mechanism, and adding support for SOFARPC, Dubbo, and other protocols.

More details can be found on SOFAMesh’s GitHub page at github.com/sofastack/s…

Ant Financial was the first to run SOFAMesh at large scale in a production environment. More than 100,000 containers have been meshed, and the deployment stably supported the 618 (June 18) promotion. It has brought us many benefits, such as multi-protocol support, UDPA, smooth upgrades, and security, with only a slight impact on performance: per hop, CPU overhead increases by about 5% and RT increases by less than 0.2ms. For some services, moving parts of the service link down into the mesh during the Mesh transformation even reduced RT by 7%.

Secure containers: Kata Containers

As you can see from the diagram above, our applications share the same CPU, memory, network, and storage, and only look separate from the outside. This can lead to security issues: there is no real isolation between containers, and once a security problem occurs in one container, it is likely to affect other containers or even the entire system. Ant Financial addresses this with secure containers, specifically Kata Containers.

Kata Containers is a top-level open infrastructure project of the OpenStack Foundation, co-led by Ant Financial and Intel. With secure containers, each Pod runs in a separate sandbox and Pods do not share a kernel with each other, which provides strong security. Here is the recent progress of Kata Containers, which has improved greatly on the performance issues that people care about most:

  • Shimv2 was introduced, reducing the number of helper processes per Pod from 2N+2 to 1;
  • Virtiofs improves file system performance by 70% to 90%;
  • Firecracker was introduced, reducing VMM memory overhead from 60MB to about 15MB;
  • The Rust agent reduces the agent's memory usage from 11MB to about 1MB.
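To get a feel for what these figures mean at scale, the arithmetic can be worked through in a few lines. This is only a back-of-the-envelope sketch: the article does not define N, so the code treats it as a per-Pod count passed in by the caller, and it assumes the Firecracker and Rust agent savings both apply to the same Pod and can simply be added. The memory figures are the approximate ones quoted above.

```python
def helper_processes_before(n):
    """Helper processes per Pod before shimv2, using the article's 2N+2 formula."""
    return 2 * n + 2


HELPER_PROCESSES_AFTER = 1  # with shimv2, per the article

# Approximate per-Pod figures quoted in the list above, in MB.
VMM_BEFORE_MB, VMM_AFTER_MB = 60, 15
AGENT_BEFORE_MB, AGENT_AFTER_MB = 11, 1


def memory_saved_mb(pods):
    """Estimated memory saved across `pods` Pods (assumes both savings add up)."""
    per_pod = (VMM_BEFORE_MB - VMM_AFTER_MB) + (AGENT_BEFORE_MB - AGENT_AFTER_MB)
    return per_pod * pods


print(helper_processes_before(3))  # 8 helper processes before, versus 1 after
print(memory_saved_mb(1000))       # 55000 MB, roughly 55 GB across 1,000 Pods
```

Under these assumptions the saving is about 55MB per Pod, which is small for one Pod but significant across a large cluster.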

We will continue to build Kata Containers together with the community and make secure containers a standard part of the cloud.

Secure containers can effectively protect the host, but the financial business itself still needs stronger isolation. For this, Ant Financial has adopted confidential computing and, based on real scenarios, developed SOFAEnclave, a solution for deploying it at large scale.

Confidential computing middleware: SOFAEnclave

Confidential computing is a solution based on a Trusted Execution Environment (TEE), also called an Enclave, such as Intel SGX or ARM TrustZone. It isolates user data while the data is being accessed in memory, so that the data is not exposed to other applications, the operating system, or other tenants of the cloud server.

For example, when your financial business runs in an Enclave, the operating system cannot see the memory inside the Enclave, and the integrity of the Enclave is checked to ensure that the code running in it has not been replaced.
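The integrity check mentioned above boils down to comparing a measurement of the code against a value that is known to be good. The sketch below shows only that idea, using an ordinary SHA-256 hash in software. In a real TEE such as Intel SGX the measurement is produced and verified with hardware support, which this example does not model. The function names and the sample payload are hypothetical.

```python
import hashlib
import hmac


def measure(code):
    """Return a SHA-256 measurement (hex digest) of the given code bytes."""
    return hashlib.sha256(code).hexdigest()


def is_unmodified(code, expected_measurement):
    """Check that `code` still matches the measurement recorded for the trusted build."""
    # compare_digest avoids leaking where the two digests differ through timing.
    return hmac.compare_digest(measure(code), expected_measurement)


trusted_build = b"def transfer(amount): ..."  # stand-in for the Enclave code
expected = measure(trusted_build)             # recorded when the build is trusted

print(is_unmodified(trusted_build, expected))                  # unchanged code
print(is_unmodified(trusted_build + b"# tampered", expected))  # replaced code
```

If even one byte of the code changes, the measurement no longer matches and the check fails, which is what makes silently replacing the code detectable.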

However, Enclaves have some problems that hinder their use in real production environments. These issues include:

  • First, applications need to be rewritten. There is no kernel and no base library inside the trusted execution environment, so an application cannot be executed directly in the Enclave;
  • Second, applications need to be split. The business program has to be divided into an Enclave part and a non-Enclave part;
  • Third, there is no cluster support. Unlike in client scenarios, the question of how to handle failover and disaster recovery for Enclave applications is another reason they are not used at large scale in data centers.

Ant Financial’s answer to these problems is SOFAEnclave, its confidential computing middleware.

SOFAEnclave consists of three components: Occlum LibOS, SOFAst, and KubeTEE. Occlum is a memory-safe, multi-task Enclave kernel developed by Ant Financial, Intel, and Tsinghua. It provides the functions of an operating system kernel in the form of a library that is linked into the Enclave, which is how it adds those functions to the Enclave. We have also come up with an innovative solution for running multiple processes in an Enclave, making Enclaves truly suitable for large applications.

To learn more about the technology behind SOFAEnclave, check out this article: SOFAEnclave: Ant Financial’s Next Generation Trusted Programming Environment for Secret Computing to Protect Financial Services for 102 Years

Homepage of Occlum, the open source component of SOFAEnclave: https://github.com/occlum/occlum

When we weave these security components together with the cloud native framework into one panorama, we get the cloud native security architecture we are building for financial services: based on Alibaba Cloud and Kubernetes, it protects financial services with end-to-end security.

Some of these components were developed by Ant Financial, tested in practice, and then open sourced together with partners and the community. Others have been developed in the community from the very beginning. Unlike traditional technology development in the financial industry, we advocate an open architecture, and we believe that openness and open source governance are indispensable to it. We will continue to participate in and support open development in the community, and work with the community to create the next generation of financial-grade cloud native technology.

Extension: Ant Financial’s contribution in cloud native field

SOFAMosn

Github.com/sofastack/s…

SOFAMosn (Modular Observable Smart Network) is a modular Service Mesh data plane proxy written in Go. It aims to give services distributed, modular, observable, and intelligent proxy capabilities. SOFAMosn integrates with SOFAMesh through the xDS API, and can also be used as a standalone layer 4 and layer 7 load balancer. In the future, SOFAMosn will support more cloud native scenarios, as well as the core forwarding functions of NGINX.

Ant Financial completed the verification of SOFAMosn on its core systems on June 18 this year. For the upcoming November 11, Alibaba and Ant Financial will roll out Service Mesh on a large scale in their core systems.
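As a reminder of what a load balancer does at its simplest, here is a minimal round-robin sketch. Round-robin is used purely for illustration. The article does not say which balancing strategies SOFAMosn implements, and the `RoundRobinBalancer` class and the upstream addresses are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical balancer that hands out upstream hosts in rotation."""

    def __init__(self, upstreams):
        hosts = list(upstreams)
        if not hosts:
            raise ValueError("at least one upstream is required")
        self._cycle = itertools.cycle(hosts)  # endless rotation over the hosts

    def pick(self):
        """Return the upstream that should receive the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080"])
print([balancer.pick() for _ in range(3)])
# ['10.0.0.1:8080', '10.0.0.2:8080', '10.0.0.1:8080']
```

A production proxy would also track host health, weights, and connection state, none of which is modeled here.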

ElasticDL

elasticdl.org

ElasticDL is a new-generation, cloud native, open source AI learning platform released by Ant Financial. Its architecture is built on native Kubernetes, which gives it strong fault tolerance and elastic scheduling capabilities. ElasticDL also supports TensorFlow 2.0, the next generation of that framework, and is expected to lead AI developers toward the next generation of machine learning.

In the future, ElasticDL will support more AI models, becoming more powerful and better integrated with cloud native and Kubernetes.

Financial Class Distributed Architecture (Antfin_SOFA)