Rapid and efficient scale-out enables the platform to process data of ever-increasing volume, velocity, and variety.
DKH integrates every component of the Hadoop ecosystem, deeply optimizing and recompiling them into a complete, higher-performance general-purpose big data computing platform in which all components work in concert. As a result, DKH delivers up to a 5x improvement in computing performance over the open-source big data platform.
Through Dakuai's proprietary middleware technology, DKH reduces complex big data cluster configuration to three node types (master node, management node, and computing node), which greatly simplifies cluster management, operation, and maintenance while improving the cluster's availability, maintainability, and stability.
Although highly integrated, DKH retains all the advantages of the open-source system and is 100% compatible with it: big data applications developed on the open-source platform run on DKH without any changes, with performance gains of up to 5x.
Google solved this problem with an algorithm called MapReduce. It divides a task into small pieces, distributes them across many computers, then collects the results from those machines and combines them into a single result data set.
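To make the flow concrete, here is a minimal single-process sketch (not Google's implementation) of the three phases of a MapReduce word count; in a real system the map and reduce work for different chunks and keys would run on different machines, and all class and variable names here are illustrative.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapReduceSketch {
    public static void main(String[] args) {
        // The input is split into chunks; a real system ships each chunk to a worker.
        List<String> chunks = Arrays.asList("big data", "big cluster", "data cluster data");

        // Map phase: every chunk independently emits (word, 1) pairs.
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String chunk : chunks)
            for (String word : chunk.split("\\s+"))
                pairs.add(Map.entry(word, 1));

        // Shuffle phase: pairs are grouped by key so each key lands on one reducer.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs)
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());

        // Reduce phase: the values collected for each key are combined into one result.
        Map<String, Integer> counts = new TreeMap<>();
        grouped.forEach((word, ones) ->
            counts.put(word, ones.stream().mapToInt(Integer::intValue).sum()));

        System.out.println(counts); // prints {big=2, cluster=2, data=3}
    }
}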
Hadoop
Building on the approach Google published, Doug Cutting and his team developed an open-source project called Hadoop.
Hadoop runs applications using the MapReduce algorithm, processing the data in parallel on the machines where it is stored. In short, Hadoop is used to develop applications that can perform complete statistical analysis on big data.
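As a concrete example, below is the classic word-count job written against Hadoop's MapReduce API; this is a minimal sketch of a standard Hadoop application, not DKH-specific code. The mapper emits a (word, 1) pair for every token of the input, and the reducer sums the counts for each word.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: splits each input line into tokens and emits (word, 1).
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums all the counts emitted for a given word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values)
                sum += val.get();
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // pre-aggregates on each mapper node
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Packaged into a jar, the job is submitted with a command such as hadoop jar wordcount.jar WordCount <input path> <output path>. If the compatibility claim above holds, the same unmodified jar should also run on DKH.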