1. Scenario description

Customers have bought many Alibaba Cloud products, but Alibaba Cloud is not responsible for implementing them. This article proposes a data platform architecture based on Alibaba Cloud products and the customer's needs. If you have similar needs, feel free to use it as a reference. Take it, no thanks needed!

2. Solutions

Big data architecture diagram based on Alibaba Cloud products:

From bottom to top, we briefly introduce the function and role of each Alibaba Cloud product:

2.1 Cloud Database RDS

Relational Database Service (RDS) is a stable, reliable, and elastically scalable online database service. It provides a full set of solutions for disaster recovery, backup, restoration, migration, and more, thoroughly addressing the problems of database operation and maintenance.

If you find this article helpful, search WeChat for "software Lao Wang" to read new posts first or to get in touch!

2.2 Data Transmission DTS

Data Transmission Service (DTS) supports data transmission between data sources such as relational databases, NoSQL, and OLAP systems. It integrates data migration, data subscription, and real-time data synchronization in a single service. DTS aims to solve asynchronous, millisecond-latency data transmission over long distances in public cloud and hybrid cloud scenarios. Its underlying data flow infrastructure is the remote multi-active (multi-site active-active) infrastructure behind Alibaba's Double 11, which provides real-time data flows for thousands of downstream applications and has been running stably in production for 6 years. With DTS you can easily build secure, scalable, and highly available data architectures.
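The idea behind real-time synchronization can be illustrated with a minimal sketch: a target store is kept in step with a source by replaying an ordered stream of row-level change events. This is a toy in-memory model and not the DTS API; `ChangeEvent` and `apply_changes` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class ChangeEvent:
    """One row-level change captured from the source (hypothetical format)."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes


def apply_changes(target: Dict[int, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> None:
    """Replay change events, in order, against the target table."""
    for ev in events:
        if ev.op in ("insert", "update"):
            target[ev.key] = dict(ev.row or {})
        elif ev.op == "delete":
            target.pop(ev.key, None)
        else:
            raise ValueError(f"unknown operation: {ev.op}")


# The replica starts empty and converges to the source state.
replica: Dict[int, Dict[str, Any]] = {}
apply_changes(replica, [
    ChangeEvent("insert", 1, {"name": "alice"}),
    ChangeEvent("insert", 2, {"name": "bob"}),
    ChangeEvent("update", 1, {"name": "alice-2"}),
    ChangeEvent("delete", 2),
])
```

Because events are applied strictly in order, the replica ends up with the same rows as the source even though it lags slightly behind it.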

2.3 Offline Data Synchronization Tool DataX

DataX is an offline data synchronization tool and platform widely used within Alibaba Group. It implements efficient data synchronization among heterogeneous data sources including MySQL, Oracle, SQL Server, PostgreSQL, HDFS, Hive, ADS, HBase, TableStore (OTS), MaxCompute (ODPS), and DRDS.

Open source: github.com/alibaba/Dat…
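A DataX job is described as a JSON document that pairs a reader plugin with a writer plugin. The sketch below builds such a document in Python. The plugin names (`mysqlreader`, `streamwriter`) and parameter keys follow the layout used in the open-source project's examples as far as I recall it, so check them against the plugin documentation before use. The connection URL, table, columns, and credentials are placeholders, and `build_job` is a hypothetical helper.

```python
import json


def build_job(jdbc_url, table, columns, channels=1):
    """Build a DataX-style job description: one reader feeding one writer."""
    reader = {
        "name": "mysqlreader",
        "parameter": {
            "username": "demo_user",       # placeholder credentials
            "password": "demo_password",
            "column": columns,
            "connection": [{"table": [table], "jdbcUrl": [jdbc_url]}],
        },
    }
    writer = {
        "name": "streamwriter",            # prints rows; handy for a dry run
        "parameter": {"print": True},
    }
    return {
        "job": {
            "setting": {"speed": {"channel": channels}},
            "content": [{"reader": reader, "writer": writer}],
        }
    }


# Placeholder source: a local MySQL database named "demo".
job = build_job("jdbc:mysql://127.0.0.1:3306/demo", "orders", ["id", "amount"])
job_json = json.dumps(job, indent=2)
```

The resulting JSON would be saved to a file and handed to the DataX launcher script. Swapping the writer for another plugin is how the same source is routed to a different target.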

2.4 DataHub

DataHub is Alibaba Cloud's streaming data processing platform. It provides publish, subscribe, and distribution functions for streaming data, so that you can easily build analyses and applications on top of it. The DataHub service can continuously collect, store, and process large volumes of streaming data generated by mobile devices, applications, web services, sensors, and so on. Users can write their own applications or use a stream computing engine to process the data written to DataHub, such as real-time web access logs, application logs, and various events, and produce real-time results such as live charts, alarm information, and real-time statistics.

The DataHub service is built on the Apsara (Feitian) platform developed by Alibaba Cloud and features high availability, low latency, high scalability, and high throughput.
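The publish and subscribe model described above can be sketched with a toy in-memory topic: producers append records to a log, and each subscriber reads at its own pace through a private cursor. This is not the DataHub SDK; the `Topic` class and its methods are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


class Topic:
    """Append-only record log with one read cursor per subscriber."""

    def __init__(self) -> None:
        self._records: List[Any] = []
        self._cursors: Dict[str, int] = defaultdict(int)

    def publish(self, record: Any) -> None:
        """Append a record to the end of the log."""
        self._records.append(record)

    def poll(self, subscriber: str, limit: int = 10) -> List[Any]:
        """Return up to `limit` unread records and advance the cursor."""
        start = self._cursors[subscriber]
        batch = self._records[start:start + limit]
        self._cursors[subscriber] = start + len(batch)
        return batch


topic = Topic()
for path in ["/home", "/cart", "/home"]:
    topic.publish({"path": path})   # e.g. web access log entries

first = topic.poll("dashboard", limit=2)    # first two records
second = topic.poll("dashboard", limit=2)   # the remaining record
alerts = topic.poll("alerting")             # independent cursor, sees all three
```

Keeping a cursor per subscriber is what lets a chart, an alerting job, and a statistics job consume the same stream independently.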

2.5 ADB or ADS

AnalyticDB for MySQL (ADB for short, formerly known as Analytic Database for MySQL) is a cloud computing service independently developed by Alibaba for real-time, high-concurrency online analysis of massive data. It lets you run real-time multidimensional analysis and business exploration over billions of records with millisecond-level response times.

2.6 MaxCompute

The big data computing service MaxCompute (formerly ODPS) is a fast, fully managed, exabyte-scale data warehouse solution.

As data collection methods multiply and industry data accumulates, data volumes have grown to a massive scale (hundreds of terabytes, petabytes, exabytes) that traditional software can no longer handle. MaxCompute is dedicated to the storage and computation of batch structured data, providing data warehouse solutions as well as analysis and modeling services for massive data.

2.7 Intelligent Data Construction and Management: Dataphin

Addressing the big data construction, management, and application needs of all industries, Dataphin provides one-stop big data capabilities for intelligent data construction and management across the full pipeline, from data ingestion to data consumption, including products, technology, and methodology. It helps build an intelligent data system that is standardized, integrated, asset-based, service-oriented, and self-optimizing in a closed loop, in order to drive innovation.

A content delivery network (CDN) is a distributed network of servers spread across regions. Origin-server resources are cached on edge servers all over the country so that users fetch them from a nearby node, reducing the load on the origin.
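The caching behaviour described above can be sketched as a small edge cache: the first request for a resource goes back to the origin, and later requests are served from the edge copy. This is a toy model with hypothetical names (`EdgeCache`, `origin`); real edge servers also handle expiry and invalidation, which are left out here.

```python
from typing import Callable, Dict


class EdgeCache:
    """Serve resources from a local copy, fetching from the origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_hits = 0   # how many requests reached the origin

    def get(self, path: str) -> bytes:
        if path not in self._store:
            # Cache miss: go back to the origin once and keep the result.
            self.origin_hits += 1
            self._store[path] = self._fetch(path)
        return self._store[path]


def origin(path: str) -> bytes:
    """Stand-in for the origin server."""
    return f"content of {path}".encode()


edge = EdgeCache(origin)
for _ in range(3):
    body = edge.get("/logo.png")
```

Three user requests result in a single origin request, which is the load reduction the text refers to.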

2.8 Server ECS

Elastic Compute Service (ECS) is a simple, efficient, and elastically scalable computing service. It helps you build more stable and secure applications, improve operation and maintenance efficiency, reduce IT costs, and focus on core business innovation.


2.9 Real-time stream Processing Blink

Blink is a one-stop, high-performance real-time big data processing platform based on Apache Flink. It is widely used in streaming data processing, offline data processing, data lake computing, and other scenarios.

In January 2019, Alibaba Cloud officially announced that it would open source its real-time computing platform Blink. Blink derives from the open-source Flink framework, which was initially applied to data processing in low-traffic Internet scenarios. Alibaba reworked Flink into an internal version, Blink, which reduces computing latency to milliseconds.
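A typical stream processing task is counting events per key in fixed time windows. The sketch below shows the idea in plain Python over a finite list of events. It does not use the Flink or Blink API, and `tumbling_window_counts` is a hypothetical helper; a real engine would process events incrementally and deal with late or out-of-order data.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(events: Iterable[Tuple[int, str]],
                           window_ms: int) -> Dict[int, Counter]:
    """Count events per key in fixed, non-overlapping time windows.

    Each event is (timestamp_ms, key). The result maps each window's
    start time to a Counter of the keys seen in that window.
    """
    windows: Dict[int, Counter] = {}
    for ts, key in events:
        window_start = ts - ts % window_ms   # align to the window boundary
        windows.setdefault(window_start, Counter())[key] += 1
    return windows


# Page clicks as (timestamp in ms, page name).
clicks = [(100, "home"), (900, "cart"), (1200, "home"), (1999, "home")]
counts = tumbling_window_counts(clicks, window_ms=1000)
```

Here the first one-second window holds one `home` and one `cart` click, and the second window holds two `home` clicks.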

