Aliyun E-MapReduce Updates
E-MapReduce team
Version 1.4 (released)
- Job failure alarm
- Concurrent job submission
- Sqoop and shell job types
Version 1.4.1 (under development)
- Refine the job failure alarm
- Improve scheduled tasks by adding hourly and per-minute schedules
Version 1.5.0 (under development)
- Dashboard of overall cluster performance
- Cluster status monitoring alarms
Version 1.6.0
- Interactive queries (supporting Hive and Spark)
News
Apache Spark 2.0.0 API updates: this release includes API changes, SQL 2003 support, R UDF support, and performance enhancements.
Spark 2.0 unifies the streaming and batch APIs, introduces the Dataset API, and improves performance through optimizations such as Tungsten, making Spark a better distributed computing engine.
Spark analytics and MongoDB: rapid deployment as a database service is the main reason for MongoDB's popularity. At its annual conference, the company behind the NoSQL database showed off a number of improvements, including integration with Spark analytics.
It feels like flying! When Spark meets Redis: some in-memory data structures are more efficient than others, and Spark runs faster when it takes full advantage of Redis.
Making the impossible possible: Tachyon helps Spark cut task run times from hours to seconds. Tachyon keeps data in memory for long periods and shares it between different applications.
This article describes how to use the DataSource API of Apache Spark to mix multiple data sources.
The article "Machine Learning for Big Data Processing with Spark" discusses machine learning concepts and how to use Spark MLlib for predictive analytics.
This article first introduces the design principles of stream processing frameworks and how Spark Streaming works. It then walks through an example that reads, analyzes, and writes images with Spark Streaming to help readers understand that working principle.
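As a rough illustration of the working principle mentioned above, Spark Streaming cuts a continuous stream into small, fixed-interval batches and runs an ordinary batch job on each one. The sketch below imitates that idea in plain Python. It does not use Spark or any of its APIs. The function names (micro_batches, word_count, run_stream), the one-second interval, and the word-count job are all assumptions made for illustration, and the article itself works with images rather than text.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, payload); hypothetical event shape


def micro_batches(events: Iterable[Event], batch_interval: float) -> Dict[int, List[str]]:
    """Group timestamped events into consecutive fixed-length batches."""
    batches: Dict[int, List[str]] = defaultdict(list)
    for timestamp, payload in events:
        # Batch k covers the half-open interval [k * interval, (k + 1) * interval).
        batches[int(timestamp // batch_interval)].append(payload)
    return dict(batches)


def word_count(payloads: List[str]) -> Dict[str, int]:
    """An ordinary batch job: count the words in a list of lines."""
    counts: Dict[str, int] = defaultdict(int)
    for line in payloads:
        for word in line.split():
            counts[word] += 1
    return dict(counts)


def run_stream(
    events: Iterable[Event],
    batch_interval: float,
    job: Callable[[List[str]], Dict[str, int]],
) -> List[Tuple[int, Dict[str, int]]]:
    """Apply the same batch job to every micro-batch, in time order."""
    batches = micro_batches(events, batch_interval)
    return [(index, job(batches[index])) for index in sorted(batches)]


if __name__ == "__main__":
    sample = [
        (0.2, "spark streaming"),
        (0.9, "spark"),
        (1.4, "streaming demo"),
        (3.1, "spark"),
    ]
    for index, result in run_stream(sample, batch_interval=1.0, job=word_count):
        print(index, result)
```

Because the job only ever sees a finite list, the same function could be used unchanged on a full historical data set, which is the sense in which batch and streaming code can share logic. Intervals with no events produce no batch in this sketch.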