
Hadoop overview

December 21, 2023

by Catherine Mann

Table of contents

1. Introduction and development history of Hadoop

1.1 In the narrow sense, Hadoop refers to an open source software project from Apache.
1.2 Hadoop core components
1.3 Official website: https://hadoop.apache.org/
1.4 In the broad sense, Hadoop refers to the big data ecosystem built around Hadoop.
1.5 History of Hadoop
1.6 Summary

2. Hadoop features, advantages, and domestic and international applications

2.1 Hadoop features and advantages
2.2 Hadoop applications abroad
2.3 Hadoop domestic applications
2.4 Summary

3. Hadoop distributions and architecture changes

3.1 Hadoop distribution
3.2 Hadoop distribution

4. Hadoop architecture changes (1.0 to 2.0)

5. Hadoop architecture changes (new in 3.0)

1. Introduction and development history of Hadoop

1.1 In the narrow sense, Hadoop refers to an open source software project from Apache.

An open source software framework implemented in Java
Allows distributed processing of large data sets across clusters of computers using a simple programming model
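
To give a feel for the "simple programming model" mentioned above, here is a minimal sketch of the map, shuffle, and reduce steps for counting words. It is plain Java running in a single process and does not use any Hadoop classes; the class name WordCountSketch and its method names are made up for illustration. On a real cluster the same three steps would be spread across many machines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-process imitation of the MapReduce model; no Hadoop classes are used.
class WordCountSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all values by key (a cluster does this across nodes).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Runs the three phases in order over all input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        shuffle(pairs).forEach((word, counts) -> result.put(word, reduce(counts)));
        return result;
    }

    public static void main(String[] args) {
        // Prints {computes=1, data=2, hadoop=2, stores=1}
        System.out.println(wordCount(List.of("hadoop stores data", "hadoop computes data")));
    }
}
```

The user only writes the map and reduce logic; grouping and distribution are the framework's job, which is why the model is considered simple.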

1.2 Hadoop core components

Hadoop HDFS (distributed file storage system): provides storage for massive data sets
Hadoop YARN (cluster resource management and task scheduling framework): handles resource allocation and task scheduling
Hadoop MapReduce (distributed computing framework): handles computation over massive data sets
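
As a rough illustration of how HDFS stores massive data, the sketch below works out how many blocks a file is split into and how much raw disk space it occupies once every block is replicated. The 128 MB block size and replication factor of 3 are assumptions based on the common defaults for dfs.blocksize and dfs.replication; a given cluster may be configured differently. The class name HdfsBlockMath is made up for illustration.

```java
// Back-of-the-envelope arithmetic for HDFS block storage; no Hadoop classes are used.
class HdfsBlockMath {
    static final long MB = 1024L * 1024L;

    // Number of blocks a file is split into (the last block may be partial).
    static long blockCount(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    // Raw disk space used across the cluster when every block is replicated.
    static long rawBytes(long fileBytes, int replication) {
        return fileBytes * replication;
    }

    public static void main(String[] args) {
        long file = 1000 * MB;   // a 1000 MB file
        long block = 128 * MB;   // assumed block size (common default)
        int replication = 3;     // assumed replication factor (common default)

        System.out.println(blockCount(file, block));          // 8
        System.out.println(rawBytes(file, replication) / MB); // 3000
    }
}
```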

1.3 Official website: https://hadoop.apache.org/

1.4 In the broad sense, Hadoop refers to the big data ecosystem built around Hadoop.

1.5 History of Hadoop

Father of Hadoop: Doug Cutting
Hadoop originated from Nutch, a subproject of Apache Lucene. Nutch was designed to build a large, whole-web search engine.
The bottleneck: how to store and index billions of web pages.
The solution was inspired by three Google papers:

The Google File System: Google's distributed file system (GFS)
MapReduce: Simplified Data Processing on Large Clusters: Google's distributed computing framework
Bigtable: A Distributed Storage System for Structured Data: Google's structured data storage system

Chinese translations of the three papers can be downloaded from: download.csdn.net/download/qq…

1.6 Summary

In the narrow sense, Hadoop refers to the software; in the broad sense, it refers to the ecosystem
Doug Cutting is the father of Hadoop
Hadoop has its roots in the Nutch project
It was inspired by three papers from Google
It became a top-level Apache Software Foundation project in 2008

2. Hadoop features, advantages, and domestic and international applications

2.1 Hadoop features and advantages

2.2 Hadoop applications abroad

2.3 Hadoop domestic applications

2.4 Summary

The magic of Hadoop's success is versatility. It draws a precise line between what to do and how to do it: what to do is a business problem, and how to do it is a technical problem. The user takes care of the business and Hadoop takes care of the technology.
The appeal of Hadoop's success is simplicity.

3. Hadoop distributions and architecture changes

3.1 Hadoop distribution

3.2 Hadoop distribution

Apache open source community version: hadoop.apache.org/
Commercial distributions:
Cloudera: www.cloudera.com/products/op…
Hortonworks: https://www.cloudera.com/products/hdp.html

The latest version at the time of writing was 3.2.2.

4. Hadoop architecture changes (1.0 to 2.0)

Hadoop 1.0

HDFS (distributed file storage)
MapReduce (resource management and distributed data processing)

Hadoop 2.0

HDFS (distributed file storage)
MapReduce (distributed data processing)
YARN (cluster resource management and task scheduling)
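
The key change from 1.0 to 2.0 is that resource management moved out of MapReduce into YARN. The toy model below tries to show why that separation is useful: one component hands out resources, and any computing framework can ask for them. This is only a sketch. The names ResourceManager, SimpleApp, and schedule are hypothetical stand-ins and are not the real YARN API, and the "containers" here are just counters.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the 2.0 split between resource management and computation.
class YarnSplitSketch {

    // Hypothetical stand-in for cluster resource management (the role YARN plays).
    static class ResourceManager {
        private int freeContainers;

        ResourceManager(int totalContainers) {
            this.freeContainers = totalContainers;
        }

        // Grants up to `requested` containers and returns how many were granted.
        int allocate(int requested) {
            int granted = Math.min(requested, freeContainers);
            freeContainers -= granted;
            return granted;
        }

        int free() {
            return freeContainers;
        }
    }

    // Hypothetical stand-in for a computing framework such as MapReduce.
    static class SimpleApp {
        final String name;
        final int containersNeeded;

        SimpleApp(String name, int containersNeeded) {
            this.name = name;
            this.containersNeeded = containersNeeded;
        }
    }

    // Each application asks the shared manager for resources in turn.
    static List<String> schedule(ResourceManager rm, List<SimpleApp> apps) {
        List<String> log = new ArrayList<>();
        for (SimpleApp app : apps) {
            log.add(app.name + ":" + rm.allocate(app.containersNeeded));
        }
        return log;
    }

    public static void main(String[] args) {
        ResourceManager rm = new ResourceManager(10);
        List<SimpleApp> apps = List.of(
                new SimpleApp("mapreduce", 6),
                new SimpleApp("other-framework", 6));
        // Prints [mapreduce:6, other-framework:4]
        System.out.println(schedule(rm, apps));
    }
}
```

In the 1.0 layout the allocation logic would live inside MapReduce itself, so a second framework could not share the cluster through the same manager.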

5. Hadoop architecture changes (new in 3.0)

The architectural components of Hadoop 3.0 are similar to those of Hadoop 2.0; 3.0 focuses on performance optimization.
Common: streamlined kernel, classpath isolation, shell script refactoring
HDFS: erasure coding (EC) and support for multiple NameNodes
MapReduce: task-level native optimization and automatic inference of memory parameters
YARN: Timeline Service v2 and queue configuration
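
To make the idea behind erasure coding more concrete, here is a minimal sketch. HDFS erasure coding is based on Reed-Solomon codes (for example, 6 data units plus 3 parity units); the sketch swaps in the much simpler single-parity XOR scheme, which can only survive the loss of one block, purely to show the principle of rebuilding lost data from parity. The class and method names are made up for illustration. The overheadPercent helper compares the extra storage of 3-way replication with that of a 6+3 layout.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// XOR parity as a simplified stand-in for Reed-Solomon erasure coding.
class ErasureCodingSketch {

    // Parity block = byte-wise XOR of all data blocks (blocks must have equal length).
    static byte[] xorParity(byte[][] blocks) {
        byte[] parity = new byte[blocks[0].length];
        for (byte[] block : blocks) {
            for (int i = 0; i < parity.length; i++) {
                parity[i] ^= block[i];
            }
        }
        return parity;
    }

    // Rebuilds one lost data block by XOR-ing the parity with the surviving blocks.
    static byte[] recover(byte[][] survivors, byte[] parity) {
        byte[] lost = parity.clone();
        for (byte[] block : survivors) {
            for (int i = 0; i < lost.length; i++) {
                lost[i] ^= block[i];
            }
        }
        return lost;
    }

    // Extra storage as a percentage of the original data size.
    static int overheadPercent(int dataUnits, int redundantUnits) {
        return 100 * redundantUnits / dataUnits;
    }

    public static void main(String[] args) {
        byte[] a = "HDFS".getBytes(StandardCharsets.US_ASCII);
        byte[] b = "YARN".getBytes(StandardCharsets.US_ASCII);
        byte[] c = "MAPR".getBytes(StandardCharsets.US_ASCII);
        byte[] parity = xorParity(new byte[][] {a, b, c});

        // Pretend block b was lost and rebuild it from a, c, and the parity.
        byte[] rebuilt = recover(new byte[][] {a, c}, parity);
        System.out.println(Arrays.equals(rebuilt, b)); // true

        System.out.println(overheadPercent(1, 2)); // 200 (3-way replication)
        System.out.println(overheadPercent(6, 3)); // 50  (6 data + 3 parity units)
    }
}
```

The comparison in the last two lines is the main motivation for erasure coding: similar fault tolerance for much less extra disk space, at the cost of more computation when data has to be rebuilt.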
