Abstract: From April 24 to 26, HDC.Cloud2021 was successfully held in Shenzhen University Town. Huawei Cloud FusionInsight MRS cloud-native data lake brought a big data solution built on deep industry understanding, providing government and enterprise customers with a lakehouse-integrated, cloud-native big data platform. This review covers the offline data lake, the real-time data lake, the logical data lake, the on-site sandbox experiments, and the Master Lecture Hall sessions.

This article is shared from the Huawei Cloud community post “HDC.Cloud2021 Review of Huawei Cloud FusionInsight MRS Cloud Native Data Lake highlights”, originally written by Hourglass.

On April 26, 2021, HDC.Cloud2021 (Huawei Developer Conference 2021) concluded successfully. At the conference, Huawei Cloud FusionInsight MRS cloud-native data lake appeared with its vision and mission of “one architecture, three lakes”. Together with many industry customers, partners, and developers, the team held in-depth discussions on topics such as how to use technological innovation to empower thousands of industries amid the rapid development of 5G, AI, and IoT. Below, let’s relive the highlights of the event.

Data Enabling Area: Huawei Cloud FusionInsight MRS cloud-native data lake shines

Huawei Cloud FusionInsight MRS: one architecture, three lakes

In the Data Enabling Area, Huawei FusionInsight MRS cloud-native data lake presented a big data solution built on deep industry understanding, providing government and enterprise customers with a lakehouse-integrated, cloud-native platform. One architecture can build three data lakes: an offline data lake, a real-time data lake, and a logical data lake. It supports big data application scenarios across an enterprise customer's full data volume, such as real-time analysis, offline analysis, interactive query, real-time retrieval, multimodal analysis, data warehousing, and data ingestion and governance. This makes data use more efficient and simpler, and helps enterprise customers achieve “one lake per enterprise, one lake per city”, with more accurate business insight and faster value realization.

**Offline data lake:** HetuEngine provides second-level interactive query capabilities. Data does not leave the lake and the analysis path is short; performance is 30% faster than Impala, and analysis performance is 10 times better than Impala. DLC provides unified metadata and makes data globally visible. HetuEngine provides a unified SQL interface to in-lake sources such as HDFS, Hive, HBase, and ES, simplifying usage.

**Real-time data lake:** Stream processing plus Hudi writes data updates into the lake, moving from T+1 to T+0. ClickHouse provides real-time OLAP analysis in milliseconds. Flink provides FlinkSQL, unifying the stream and batch SQL interfaces to achieve stream-batch integration.

**Logical data lake:** HetuEngine provides unified access across lakes, warehouses, and clouds, which reduces data relocation and lets data flow efficiently. It enables second-level collaborative analysis and second-level response for data across the whole domain, and the efficiency of bringing services online increases by 10 times, from weekly to daily.

Huawei Cloud FusionInsight MRS practices industry-university-research cooperation, comprehensively promotes the development of open source big data technology, and has jointly released a version of the IoTDB time-series engine with Tsinghua University. At present, Huawei Cloud FusionInsight MRS serves more than 3,000 customers in more than 60 countries, helping government and enterprise customers achieve “one lake per enterprise, one lake per city”, with more accurate business insight and faster value realization.

Huang Haoxi, Huawei Cloud FusionInsight technical engineer, explains the experiments

On one side of the Huawei FusionInsight MRS cloud-native data lake booth, there was a hands-on sandbox lab for developers. Huang Haoxi, senior engineer for the Huawei FusionInsight technology ecosystem, was on site to guide participants through three experiments: “Using MRS Hudi to experience real-time access to the lake, using MRS Clickhouse to experience real-time OLAP, and using MRS HetuEngine to experience cross-source and cross-domain analysis ability”. Hands-on practice deepens understanding of the characteristics of each component. Hudi supports incremental data updates: moving from traditional Append to Upsert enables real-time data updates and shifts data value release from T+1 to T+0. ClickHouse provides millisecond-level OLAP analysis without data leaving the lake, eliminating the data redundancy and back-and-forth data movement of traditional approaches. HetuEngine provides unified standard SQL for efficient access to multiple data sources distributed across multiple regions (or data centers), masking differences in data structure, storage, and geography, and decoupling data from applications.
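The difference between Append and Upsert ingestion mentioned above can be illustrated with a minimal, library-free sketch. This is not Hudi's API; the record fields (`id`, `status`, `ts`) and the helper names `append` and `upsert` are hypothetical and chosen only to show the semantics: append keeps every change as a new row, while upsert keeps one current row per key.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def append(table: List[Record], batch: List[Record]) -> None:
    """Append-only ingestion: every incoming record becomes a new row."""
    table.extend(batch)


def upsert(table: Dict[str, Record], batch: List[Record],
           key: str = "id", ts: str = "ts") -> None:
    """Upsert ingestion: insert unseen keys; overwrite an existing key
    when the incoming record is at least as recent (by the `ts` field)."""
    for rec in batch:
        current = table.get(rec[key])
        if current is None or rec[ts] >= current[ts]:
            table[rec[key]] = rec


# Hypothetical change stream: order-1 is created, then paid.
changes = [
    {"id": "order-1", "status": "created", "ts": 1},
    {"id": "order-2", "status": "created", "ts": 2},
    {"id": "order-1", "status": "paid", "ts": 3},
]

append_table: List[Record] = []
append(append_table, changes)

upsert_table: Dict[str, Record] = {}
upsert(upsert_table, changes)

print(len(append_table))                  # 3 rows: history must be deduplicated later
print(len(upsert_table))                  # 2 rows: one current row per key
print(upsert_table["order-1"]["status"])  # paid
```

With append-only ingestion, readers typically need a later merge job to find the latest state of each key, which is one reason results lag by a day (T+1). With upsert, the table already reflects the latest state as changes arrive.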

Huawei Cloud FusionInsight MRS cloud-native data lake exhibition area

The exhibition area not only presented the three-lake features of Huawei FusionInsight MRS cloud-native data lake, but also offered a practical sandbox experience, allowing guests to gain first-hand exposure to cutting-edge big data technology through hands-on operation.

Lecture hall: New technologies, new value, and new trends

During the conference, a series of special lectures titled “Master Lecture Hall”, presented by an all-star team of Huawei technical experts, discussed the value of technological innovation and shared innovation practices around topics such as cloud native, big data, and artificial intelligence. Among them, Huawei Cloud FusionInsight MRS cloud-native data lake contributed two expert lectures. Xu Tianli, architect of the Huawei Cloud FusionInsight solution, spoke on “How to Upgrade the Big Data Cluster without service interruption”. Wu Wenbo, architect of HetuEngine, spoke on “Minute-level Analysis of Massive Data in Cross-source and Cross-domain Scenarios”.

Continuous upgrade of thousand-node big data clusters without service interruption

Xu Tianli, architect of the Huawei Cloud FusionInsight solution, gives a speech

As government and enterprise digitalization advances, data lakes carry more and more critical data analysis and processing workloads in government, finance, telecom operators, and large enterprises. As a result, the requirements for business continuity during routine upgrades and maintenance keep rising.

Big data technology iterates quickly. Traditional big data platforms are upgraded offline, which requires power-off and restart operations. The upgrade procedure is complex and tedious, affects live-network services, and takes a long time on large clusters, and a sudden fault can interrupt the upgrade. To maintain both business continuity and technical leadership, customers urgently need a rolling upgrade capability that does not interrupt services, so that the large-cluster data foundation can keep evolving.

Huawei FusionInsight MRS cloud-native data lake supports ultra-large clusters, with 20,000+ nodes in a single cluster and unlimited expansion through federation. Starting from clusters of 500+ nodes, it provides rolling upgrade as a standard capability. The upgrade success rate is 100% so far.

With the rolling upgrade capability of Huawei FusionInsight MRS cloud-native data lake, government and enterprise customers can upgrade large clusters in batches and in cycles, with zero service interruption. Faulty-node isolation keeps the upgrade running stably and enables uninterrupted 7×24 service. More than 1,000 fine-grained O&M indicators and visualized operations simplify O&M and allow a single architecture to evolve continuously.
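The idea of upgrading in batches while isolating faulty nodes can be sketched as follows. This is an illustrative toy under stated assumptions, not the MRS upgrade mechanism: the function `rolling_upgrade`, the callback `upgrade_node`, and the node names are all hypothetical.

```python
from typing import Callable, Dict, List


def rolling_upgrade(nodes: List[str], batch_size: int,
                    upgrade_node: Callable[[str], bool]) -> Dict[str, List[str]]:
    """Upgrade nodes one batch at a time and isolate nodes that fail.

    Only the nodes in the current batch are out of service, so the rest of
    the cluster keeps serving requests. A failed node is set aside for
    later repair instead of aborting the whole upgrade.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    upgraded: List[str] = []
    isolated: List[str] = []
    for start in range(0, len(nodes), batch_size):
        batch = nodes[start:start + batch_size]
        for node in batch:
            if upgrade_node(node):
                upgraded.append(node)
            else:
                isolated.append(node)
    return {"upgraded": upgraded, "isolated": isolated}


def fake_upgrade(node: str) -> bool:
    """Hypothetical upgrade step: pretend node-3 has a hardware fault."""
    return node != "node-3"


cluster = [f"node-{i}" for i in range(1, 7)]
result = rolling_upgrade(cluster, batch_size=2, upgrade_node=fake_upgrade)
print(result["upgraded"])  # ['node-1', 'node-2', 'node-4', 'node-5', 'node-6']
print(result["isolated"])  # ['node-3']
```

A smaller batch size keeps more capacity online during the upgrade at the cost of a longer total upgrade time. A real system would also drain traffic from a batch before upgrading it and run health checks before moving on.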

Minute-level analysis of massive data across lakes and warehouses

HetuEngine architect Wu Wenbo gave a speech

HetuEngine is a unified and efficient data virtualization analysis engine that integrates seamlessly with the big data ecosystem to deliver second-level queries over massive data. It is presented as the industry’s first engine for multi-source heterogeneous collaboration and one-stop SQL fusion analysis, bringing collaborative analysis of massive data down to the minute level.

**High-performance interactive query:** Traditional big data platforms use the Hive engine for ad hoc queries, which take a long time. HetuEngine uses heuristic indexing and an execution plan cache to achieve second-level query response.

**Cross-lake, cross-warehouse, cross-cloud integration:** Traditional data analysis must first unify data formats. HetuEngine can join data in different formats directly, reducing data relocation and improving efficiency by 30% compared with traditional solutions. Traditional cross-DC (data center) analysis relies on manually ferrying data between sites. HetuEngine connects data centers through the DC Connector and makes data globally visible, cutting collaboration time from days to minutes.

**Multi-engine integration:** In traditional big data development across multiple engines, each component requires its own customized development. HetuEngine accesses big data through a unified SQL interface, lowering the threshold for using data and improving development efficiency by 2 to 10 times.
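To make the idea of joining data in different formats without first converting or relocating it more concrete, here is a minimal Python sketch. It does not use HetuEngine or its connectors. The two in-memory sources, the connector functions, and the `hash_join` helper are all hypothetical, and only show how per-source connectors can expose a common row shape so that one join runs across both.

```python
import csv
import io
import json
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]

# Two hypothetical sources held in different formats.
CSV_ORDERS = "order_id,customer_id,amount\n1,c1,30\n2,c2,45\n3,c1,25\n"
JSON_CUSTOMERS = (
    '[{"customer_id": "c1", "city": "Shenzhen"},'
    ' {"customer_id": "c2", "city": "Beijing"}]'
)


def csv_connector(text: str) -> Iterator[Row]:
    """Expose CSV text as a stream of dict rows."""
    yield from csv.DictReader(io.StringIO(text))


def json_connector(text: str) -> Iterator[Row]:
    """Expose a JSON array as a stream of dict rows."""
    yield from json.loads(text)


def hash_join(left: Iterable[Row], right: Iterable[Row], key: str) -> List[Row]:
    """Inner join: index the right side by key, then probe with the left."""
    index: Dict[Any, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


rows = hash_join(csv_connector(CSV_ORDERS),
                 json_connector(JSON_CUSTOMERS),
                 "customer_id")

# Aggregate order amounts per city across both sources.
totals: Dict[str, int] = {}
for row in rows:
    totals[row["city"]] = totals.get(row["city"], 0) + int(row["amount"])

print(totals)  # {'Shenzhen': 55, 'Beijing': 45}
```

Neither source is rewritten into the other's format; each connector only adapts reads. A federated SQL engine applies the same principle at much larger scale, with query planning and pushdown that this sketch omits.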

Conclusion

The curtain has fallen, but this is not the end; it is the beginning of a new journey. Huawei FusionInsight MRS cloud-native data lake will stay true to its original mission, forge ahead, maintain the driving force of technological innovation, and cultivate the fertile black soil of the digital world. Together with 800+ ISVs, Huawei FusionInsight MRS cloud-native data lake will provide customers with a continuously evolving, lakehouse-integrated solution that delivers the offline data lake, real-time data lake, and logical data lake in one architecture, building “one lake per enterprise, one lake per city” across thousands of industries.
