Brief introduction: The Enterprise Data Cloud (EDC) is built on Cloudera Data Platform (CDP), which combines the best of Cloudera Enterprise Data Hub and Hortonworks Data Platform Enterprise, adds new features to the technology stack, and enhances existing technologies. This unified release is an extensible and customizable platform on which you can safely run multiple types of workloads.

Overview of enterprise data clouds

What enterprises demand from big data solutions: the ability to capture and combine any amount or type of data in one place, with raw fidelity, whenever necessary, and to provide insights to all users as quickly as possible.

Cloudera introduced the concept of the Enterprise Data Cloud (EDC). Data-driven enterprises need to be able to apply multiple analytics disciplines to data wherever it resides; to stream and process real-time data from multiple edge endpoints while predicting key outcomes and applying machine learning to the same data set; and to leverage the agility, elasticity, and growing data gravity of public cloud infrastructure. In addition, all of this should be done on an open platform where data security and governance apply wherever data is stored and analytics run. This is what the industry calls the enterprise data cloud.

EDC has the following characteristics:

  • Hybrid cloud and multi-cloud support: Provides options to manage, analyze, and experiment with data in any public cloud and private data center for maximum choice and flexibility.
  • Multifunctional analytics: Addresses the most demanding business use cases by applying real-time stream processing, data warehousing, data science, and iterative machine learning across shared data at scale.
  • Security and governance: Data on any cloud (public, private, and hybrid) is controlled through a common security model, simplifying data privacy and compliance for all types of enterprise data.
  • Openness: Fosters innovation in the open source community, provides options for open storage and compute architectures, and offers the confidence and flexibility of a broad ecosystem.

An enterprise data cloud platform not only provides enterprise-grade security and management, but also offers a range of data analytics functions. It delivers the same capabilities on premises and in the cloud, supports the major public and private cloud environments, and gives users an elastic cloud experience without data silos or the threat of single-vendor lock-in.

EDC not only has the flexibility to run a variety of enterprise workloads (for example, real-time ingestion and analysis, data engineering, interactive SQL, enterprise search, advanced analytics, and machine learning), but also meets enterprise requirements: it integrates with existing enterprise systems while providing robust security, data governance, data protection, and management capabilities. EDC is the emerging center of enterprise data management.

Introduction to CDP

EDC is built on Cloudera Data Platform (CDP), Cloudera's latest product. CDP combines the best of Cloudera Enterprise Data Hub and Hortonworks Data Platform Enterprise, adding new features to the technology stack and providing enhancements to existing technologies. This unified release is an extensible and customizable platform on which you can safely run multiple types of workloads.

Beyond the need for an enterprise data cloud, companies want to migrate or extend this data management infrastructure to the cloud to improve operational efficiency, reduce costs, gain elastic compute and storage capacity, and increase speed and agility.

As organizations adopt Hadoop-based big data deployments in cloud environments, they also need enterprise-level security and governance, multifunction analytics, management tools, and technical support. All of these are part of the CDP platform, as illustrated in the figure below.

CDP supports a variety of hybrid solutions where computing tasks are separated from data storage and data can be accessed from remote clusters. This hybrid approach provides the foundation for containerized applications by managing storage, table schemas, authentication, authorization, and governance.

CDP includes various components, such as Apache HDFS, Apache Hive 3, Apache HBase, and Apache Impala, as well as many other components for special workloads. You can choose any combination of these services to create a cluster that meets your business needs and workload. Several pre-configured service packs are also available for common workloads.

Overview of Cloudera Manager

Cloudera Manager is an application for managing, configuring, and monitoring the CDP cluster and Cloudera Runtime services.

The Cloudera Manager Server runs on a host in a CDP deployment and manages one or more clusters through the Cloudera Manager Agents that run on each host in the cluster.

Cloudera Manager is an end-to-end application for managing clusters. With Cloudera Manager, you can easily deploy and centrally operate the complete Cloudera Runtime stack and other managed services. The application automates the installation and upgrade process and gives you a real-time view of the hosts and services running across your entire cluster. The Cloudera Manager Admin Console provides a central console where you can configure an entire cluster, and it combines various reporting and diagnostic tools to help you optimize performance and utilization. Cloudera Manager also manages security and encryption capabilities. Using the Admin Console, you can start and stop clusters as well as individual services, configure and add new services, manage security, and upgrade clusters. You can also use the Cloudera Manager API to perform administrative tasks programmatically.
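
To make the programmatic route more concrete, here is a minimal Python sketch of how requests to the Cloudera Manager REST API might be prepared. It is an illustration under stated assumptions, not an official client: the host name, the default port 7180, the admin/admin credentials, the API version string, and the sample response are placeholders, and the two helper functions are hypothetical names introduced for this example. The sketch only builds requests and parses a sample payload; it does not contact a server.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request

# Assumption: the API version depends on the Cloudera Manager release.
# Check what your server reports (for example via its /api/version endpoint).
API_VERSION = "v41"


def build_request(base_url, path, user, password, method="GET"):
    """Hypothetical helper: prepare an authenticated request for one API path."""
    url = f"{base_url.rstrip('/')}/api/{API_VERSION}/{path.lstrip('/')}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = Request(url, method=method)
    req.add_header("Authorization", f"Basic {token}")  # HTTP basic auth
    req.add_header("Accept", "application/json")
    return req


def cluster_names(payload):
    """Hypothetical helper: extract cluster names from a cluster-list response."""
    return [item["name"] for item in json.loads(payload).get("items", [])]


# Placeholder host and credentials; replace with your own deployment's values.
BASE_URL = "http://cm-host.example.com:7180"

# Request that would list the clusters managed by this Cloudera Manager.
list_req = build_request(BASE_URL, "clusters", "admin", "admin")

# Request that would start one cluster; the name is URL-encoded for the path.
start_req = build_request(
    BASE_URL,
    f"clusters/{quote('Cluster 1', safe='')}/commands/start",
    "admin",
    "admin",
    method="POST",
)

# Made-up response body in the shape of a cluster list, for illustration only.
SAMPLE_RESPONSE = '{"items": [{"name": "Cluster 1"}, {"name": "Cluster 2"}]}'

print(list_req.get_method(), list_req.full_url)
print(start_req.get_method(), start_req.full_url)
print(cluster_names(SAMPLE_RESPONSE))
```

To run this against a real deployment, each prepared request could be passed to `urllib.request.urlopen` and the response body read and parsed with `cluster_names`. In production, HTTPS and credentials from a secure store would be preferable to the plain-text placeholders used here.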

A single instance of Cloudera Manager can manage multiple clusters, including older versions of Cloudera Runtime and CDH.

Cloudera Runtime

Cloudera Runtime is the core open source software distribution in CDP Private Cloud Base. It includes about 50 open source projects that form the core distribution of data management tools in CDP. The Cloudera Runtime components are documented in the Cloudera documentation library.

Tools

CDP also includes the following tools to manage and secure your deployment:

  • Cloudera Manager allows you to manage, monitor, and configure clusters and services through the web-based Cloudera Manager Admin Console or the Cloudera Manager API.
  • Apache Atlas provides a set of metadata management and governance services that enable you to manage CDP cluster assets.
  • Apache Ranger manages access control through a user interface to ensure consistent policy management across CDP clusters.

This article is the original content of Aliyun and shall not be reproduced without permission.