HDFS distributed file system

Basic introduction

composition

Client
- Storage after chunking
- Interacts with the NameNode to get location information for the file
- Interacts with the DataNode to read or write data
- Client provides several commands to manage and access HDFS
NameNode: It’s a master. It’s a master
- Manage HDFS namespaces
- Manage the mapping information of the data block, but do not persist, too large
- Configure replica policy
- Handle client read and write requests
DataNode: Executes NameNode directives
- Stores the actual data block
- Performs read and write operations to data blocks
snn
- Auxiliary nn
- Merge fsimage and fsedits periodically and push to nn
- In case of emergency, NN can be restored for one hour ago data
To solve the problem of mass storage, cross-machine storage, unified management of the file system distributed on the cluster is called distributed file system
Suitable for storing large data, provide unified access interface, can operate like ordinary file system
Distributed storage scheme
- Split large files into chunks and place them on different servers
No matter how big the file is, it will be stored in blocks. Each block has 128M, and each block has a copy. The default copy of each block is 3
Metadata is stored in NameNode memory, and there is a backup of the metadata on disk
Design goals
- Hardware fault handling: detection and quick recovery is the core
- Streaming read data: suitable for batch processing, rather than interactive, focusing on high throughput data access
- The file size should be as large as possible. Small file metadata takes up memory
- The cost of mobile computing (computing in HDFS) is lower than the cost of moving data (pulling data to the local area)
Application scenarios
- GB, TB, PB basic data storage, batch high throughput
- One write multiple reads, no modification necessary, HDFS does not support random modification of file data, such as the fact that history has evolved (weather)
- Can run on ordinary cheap machines
- High fault tolerance
- Ability to expand
Inapplicable scenario
- Randomly modify the scene,
- Interactive scenes
- Scenarios for storing large numbers of small files

architecture

Heartbeat Mechanism: Hdfs-default.xml: If the DataNode does not send a heartbeat in a certain amount of time and waits 10 times, the Node will be able to send a heartbeat to the NameNode every 3 seconds. If the default configuration does not receive a heartbeat for 30 seconds, NameNode will consider it as suspended death. NameNode will actively send a check to the DataNode every 5 seconds. If it does not receive a heartbeat after two checks, it will consider it as a failure. The default configuration is 630 and the DataNode is determined to be dead. In addition, the DataNode will report its block information to the NameNode every six hours

<property> <name>dfs.heartbeat.interval</name> // <value>3</value> <description>Determines datanode heartbeat interval in seconds.</description> </property> <property> <name>dfs.namenode.heartbeat.recheck-interval</name> <value>300000</value> <description> This time decides the interval to check for expired datanodes. With this value and dfs.heartbeat.interval, the interval of deciding the datanode is stale or not is also calculated. The unit of this configuration is millisecond.  </description> </property>

Load balancing: The NameNode ensures that the number of blocks in each DataNode is approximately the same

start-balancer.sh -t 10%

Replica mechanism: Default 3 copies per block. When it is found that there are less than 3 copies of some blocks, it will ask the specified node to create a copy and ensure that there are 3 copies. If the copy data is more than 3, it will ask the specified node to delete the extra copies. If you can’t handle a copy (more than half of the machine is dead), then NameNode will put HDFS in safe mode and won’t write. Each DataNode will report block information to NameNode. NameNode will check the integrity of the data and exit the security mode
- Matters needing attention
  - When storing, the block size is 128M, but when the block size is smaller than 128M, it is saved as the actual size

Replica storage (rack aware)

- The first copy is stored on any machine on the rack nearest to the client, if equal, select a rack at random - The second copy is stored on a server different from the first one - The third copy: on a server different from the second copy on the rack

HDFS operation

HDFS DFS -put localpath hdfspath to upload to HDFS
HDFS DFS -moveFromLocal LocalPath hdfspath
HDFS DFS -get hdfspath localpath from HDFS
HDFS dfs-getMerge hdfspath1 hdfspath2 localPath is downloaded from HDFS and merged to the local
HDFS DFS-RM [-R-F-SkipTrash] XX data moved to the trash can with storage time
HDFS dfs-du [-h] file
HDFS dfs-chown [-r] User: User group file
HDFS DFS -appendToFile srclocalFilePath hdfsFilePath append file

HDFS senior shell

hdfs dfs -safemode get/enter/leave
- Supplementary security mode: A protection mechanism of Hadoop. At this time, the data integrity will be checked. The default copy rate is 0.999, and the DataNode will delete the redundant copies and only accept read requests
- If there is a missing block and HDFS can not actively exit the safe mode, manually leave the safe mode, can be deleted, 1. Files not important skip the recycle bin all dry; 2. If the file data is important, download the file to the local disk, delete the corresponding HDFS file by skipping the recycle bin, and then upload it again
Benchmark test, test the overall throughput of the HDFS cluster that has just been built, take the maximum value, and take the average for several tests of different parameters

Hadoop jar/export/server/hadoop - 2.7.5 / share/hadoop/graphs/hadoop - graphs - the client - jobclient - 2.7.5. Jar TestDFSIO -write -nrFiles 10 -fileSize 10MB

HDFS principle

The NameNode stores the metadata process

The entire system’s namespace and address map of data are held, so the entire system’s storage capacity is limited by the memory size of the NameNode
When the NameNode records metadata information, it first records it to the Edits and then to the memory space, until it succeeds
- Edits files are too large to reach a size of 64MB and are closed to open a new Edits
- Edits is going to open a new one in an hour
- A reboot will also open a new Edits
- Edits are small files, should not be too many, you need to merge them, the resulting file is fsimage, as the continuous merge, this file will become larger and larger, close to the memory size
- Fsimage is a relatively intact metadata. When HDFS just starts, it first loads the data from the Fsimage file into memory, and then loads the Edits file into memory. At this time, all the metadata is restored to memory
- SecondaryNameNode: Helps NameNode merge edits into fsimage
NameNode files operate on streams of data that do not pass through NameNodes and will tell the DataNode to be processed
The NameNode makes the decision to prevent replicas based on the global situation
For a balance of heartbeat mechanisms and status reports, see the architecture above

HDFS read and write process

write

read

SNN (secondarynamenode)

The first step is to understand what the fsimage and edits files do

Edits: holds the operation log,

Fsimage: generated by merging edits and is done by SNN

When edits exceed 64M, SNN alerts NameNode to create a new edits
New one every hour
New on Restart
Under special circumstances, SNN can be used for metadata recovery, to restore to the state of one hour ago

Basic introduction

composition

architecture

HDFS operation

HDFS senior shell

HDFS principle

The NameNode stores the metadata process

HDFS read and write process

SNN (secondarynamenode)

Related Posts

Do Data Analysis, Beware of Five Mistakes – SmartBi

Apache SuperSet 1.2.0 Tutorial (3) — Detailed diagram features

Hologres Reveals the Core Principles of High Performance Native Acceleration MaxCompute