Abstract:

Aliyun E-MapReduce updates

  • Public beta of Alibaba Cloud HBase, a distributed database supporting PB-scale data, is about to start

News

  • Fu Zhihua, deputy general manager of 360 Big Data Center: Ten big data development trends in 2017

In 2016, big data moved from the inflated expectations and hype of the previous two years into a stage of rational development and practical application. In 2017 it remains in that stage: many challenges persist, but the outlook is still very optimistic.

  • Oracle enters cloud computing: disrupter or new player?

Oracle, a traditional IT vendor, has watched its position gradually erode. Unwilling to be left behind, it has accelerated its transformation and moved actively into cloud computing. At the recent “Cloud World” event in New York, Oracle executives laid out how they plan to compete with, and catch up to, cloud giants such as Amazon, Microsoft, and Salesforce.

  • Snap to spend $2 billion on Google’s cloud infrastructure services

On February 2nd Snap filed its IPO prospectus, hoping to reach a valuation of $25 billion when it debuts on the New York Stock Exchange. Notably, Snap disclosed in its S-1 filing that it will spend a total of $2 billion over the next five years on Google’s cloud infrastructure services.

  • China will build its first national engineering laboratory for big data circulation and transaction technology

China’s first National Engineering Laboratory for big data circulation and trading technology has been officially approved by the National Development and Reform Commission and will be built jointly by Inspur Group and Shanghai Data Exchange Center. It is Inspur’s second national engineering laboratory, after the National Engineering Laboratory for Mainframe Systems.

Technology

  • Use Phoenix to update HBase data with SQL statements

HBase ships with a convenient shell for performing CRUD operations on tables, but the shell has a steep learning curve. Apache Phoenix translates SQL statements into native HBase API calls, so HBase data can be managed with ordinary SQL, which greatly lowers that cost. Phoenix is reported to perform well compared with native HBase scans and to be significantly faster than similar components such as Hive and Impala.
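
As a rough illustration of what managing HBase data through SQL can look like, the sketch below keeps a few Phoenix statements as strings and sends them through a Python DB-API cursor. The `users` table, its columns, and the helper names `save_user` and `load_user` are hypothetical and not taken from the article. Note that Phoenix has no separate INSERT or UPDATE statement: `UPSERT` covers both.

```python
from typing import Any, Optional, Tuple

# Hypothetical table and columns, used only for illustration.
CREATE_SQL = (
    "CREATE TABLE IF NOT EXISTS users ("
    "id BIGINT NOT NULL PRIMARY KEY, name VARCHAR, city VARCHAR)"
)
# UPSERT inserts a new row or overwrites the row with the same primary key.
UPSERT_SQL = "UPSERT INTO users (id, name, city) VALUES (?, ?, ?)"
SELECT_SQL = "SELECT name, city FROM users WHERE id = ?"
DELETE_SQL = "DELETE FROM users WHERE id = ?"


def save_user(cursor: Any, user_id: int, name: str, city: str) -> None:
    """Create or update one row; the cursor is any DB-API style cursor."""
    cursor.execute(UPSERT_SQL, (user_id, name, city))


def load_user(cursor: Any, user_id: int) -> Optional[Tuple[str, str]]:
    """Return (name, city) for the given id, or None if the row is absent."""
    cursor.execute(SELECT_SQL, (user_id,))
    return cursor.fetchone()
```

With the `phoenixdb` package and a running Phoenix Query Server, a cursor obtained from `phoenixdb.connect(url, autocommit=True).cursor()` could be passed to these helpers; the URL depends on the deployment.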

  • Apache Flink 1.2.0 released: an overview of its features

The long-awaited Apache Flink 1.2.0 is officially released today. This version resolves a total of 650 issues. Key features and changes include: rescaling a job by restoring it from a savepoint with a different parallelism; support for the Mesos resource scheduler; support for asynchronous I/O operators; queryable operator state; and more.

  • Hadoop 3.0 arrives: an introduction to the new version and its future direction

Over the past decade, Apache Hadoop has grown from a concept built from scratch into the platform behind some of the world’s largest production clusters. Over the next decade, Hadoop will continue to grow and evolve to support a new wave of larger, more efficient, and more stable clusters. This article walks through the upcoming release of Apache Hadoop 3.0: its release status, the story behind it, and new features such as HDFS Erasure Coding, YARN Federation, NN K-Safety, and more.
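
HDFS Erasure Coding is only named above. As a back-of-the-envelope illustration of why it matters, the sketch below compares the extra storage needed by 3-way replication with that of a Reed-Solomon RS(6,3) layout (6 data cells plus 3 parity cells), one of the layouts Hadoop 3.0 supports. The function name `storage_overhead` is made up for this example.

```python
def storage_overhead(data_units: int, redundant_units: int) -> float:
    """Extra storage as a fraction of the raw data size."""
    return redundant_units / data_units


# 3-way replication: one original copy plus two extra copies.
# It survives the loss of any 2 copies.
replication = storage_overhead(data_units=1, redundant_units=2)

# RS(6,3): six data cells plus three parity cells.
# It survives the loss of any 3 cells in a stripe.
rs_6_3 = storage_overhead(data_units=6, redundant_units=3)

print(f"3x replication overhead: {replication:.0%}")  # 200%
print(f"RS(6,3) overhead:        {rs_6_3:.0%}")       # 50%
```

The trade-off is that reading or rebuilding erasure-coded data involves more network and CPU work than reading a plain replica.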

  • Use Apache Spark for large-scale language model training

Apache Spark is a fast, general-purpose engine for large-scale data processing. It runs on Hadoop and Mesos, and can also run standalone or in the cloud. Adoption of Spark has grown steadily in recent years, driven by large companies such as IBM and by numerous community contributors. Today, the Facebook team also described its approach to using Apache Spark for large-scale language model training.