Quickly set up the Hadoop environment

1. Build the basic cluster environment

1.1. Install the JDK

A. Upload jdk-8u151-linux-x64.tar.gz to the server

B. Decompress the package into the /usr directory

tar -zxvf jdk-8u151-linux-x64.tar.gz -C /usr

C. Configure environment variables: (1) vim /etc/profile (2) add the following at the end:

JAVA_HOME=/usr/jdk1.8.0_151
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME PATH

(3) Save and exit. To check whether the installation succeeded, run java -version, as in the example below.
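A minimal verification in the same shell (the reported version should match the installed JDK):

source /etc/profile    # apply the new variables to the current shell
java -version          # should report java version "1.8.0_151"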

1.2. Change the host name and disable the firewall

1. Run hostnamectl set-hostname hadoop02. 2. Close the Xshell connection window and reconnect to the host.

  • (1) firewall-cmd --state  # check the firewall status
  • (2) systemctl stop firewalld.service  # stop the firewall
  • (3) systemctl disable firewalld.service  # keep the firewall from starting at boot

1.3. Add an intranet domain name mapping and set a static CentOS IP address

cd /etc/sysconfig/network-scripts 

Run ls to find the corresponding network interface configuration file (for example, ifcfg-ens33) and edit it to contain:

BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.137.100   # IP address
NETMASK=255.255.255.0
GATEWAY=192.168.137.2    # default gateway

Run service network restart to restart the network service.

Then add the domain name mapping by editing the hosts file: vim /etc/hosts
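For example, assuming hadoop01 uses the static address configured above and hadoop02 sits on the same subnet (adjust the addresses to your network):

192.168.137.100 hadoop01    # static IP configured above
192.168.137.101 hadoop02    # assumed address for the second node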

1.4. Configure password-free login

1. While logged in as root, run the command ssh-keygen or ssh-keygen -t rsa

2. You will find that the key pair (id_rsa and id_rsa.pub) has been generated in the /root/.ssh directory

3. Run ssh-copy-id hadoop02 to set up password-free login from hadoop01 to hadoop02, as sketched below.
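Put together, a minimal sketch of the whole sequence on hadoop01 (press Enter at each ssh-keygen prompt to accept the defaults):

ssh-keygen -t rsa      # generate the key pair under /root/.ssh
ssh-copy-id hadoop02   # append the public key to hadoop02's authorized_keys
ssh hadoop02           # verify: should log in without a password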

2. Hadoop cluster environment installation

2.1 Hadoop version selection

1. Apache community releases: 2.6.5, 2.7.5, 3.0.1. 2. Commercial distributions such as Cloudera CDH 5.7.x provide a complete management system, and their bug fixes may be ahead of the Apache releases.

2.2. Install Hadoop

Hadoop can run in pseudo-distributed mode on a single node: the Hadoop daemons run as separate Java processes, and the node acts as both NameNode and DataNode, reading files from HDFS.

Hadoop configuration files are stored in the hadoop-2.7.5/etc/hadoop/ folder. Pseudo-distributed mode requires modifying two configuration files, core-site.xml and hdfs-site.xml (hdfs-site.xml configures the number of replicas of each data block; in pseudo-distributed mode there is only ever one replica no matter how many you configure, so the value hardly matters). Hadoop configuration files are in XML format: each configuration item is declared by a property's name and value, as in the sketch below.
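For reference, a minimal hdfs-site.xml for pseudo-distributed mode sets dfs.replication to 1 (the value is illustrative; as noted above, a single node holds only one replica regardless):

<configuration>
    <!-- number of replicas per data block -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>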

1. Modify the hadoop-env.sh configuration file and add the JDK installation directory

[root@hadoop01 hadoop]# vim hadoop-env.sh
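This file sets JAVA_HOME for the Hadoop daemons; assuming the JDK was unpacked to /usr/jdk1.8.0_151 as in section 1.1, the line to add (or edit) is:

export JAVA_HOME=/usr/jdk1.8.0_151   # path assumed from section 1.1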

2. Modify core-site.xml

<configuration>
    <!-- NameNode address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop01:9000</value>
    </property>
    <!-- directory for files Hadoop generates at runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop-2.7.5/temp</value>
    </property>
</configuration>

3. Modify slaves

hadoop01

4. Add the Hadoop environment variables: vim /etc/profile

HADOOP_HOME=/home/hadoop-2.7.5
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_HOME PATH
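After saving, a quick check that the variables took effect (hadoop version prints the release line first):

source /etc/profile   # apply the new variables to the current shell
hadoop version        # should print Hadoop 2.7.5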

5. Format namenode

[root@hadoop01 hadoop]# hadoop namenode -format
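If formatting succeeds, the log should end with a line similar to the following (the directory is derived from hadoop.tmp.dir in core-site.xml):

common.Storage: Storage directory /opt/hadoop-2.7.5/temp/dfs/name has been successfully formatted.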

6. Start HDFS

[root@hadoop01 hadoop]# start-dfs.sh

7. Check whether the startup is successful

(1) Use the jps tool to check whether each process started successfully (see the example after this list)

(2) Open the Web UI at http://hadoop01:50070 to view the HDFS status
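For a pseudo-distributed HDFS, jps should list a NameNode, a DataNode, and a SecondaryNameNode (the PIDs below are illustrative):

[root@hadoop01 hadoop]# jps
5218 NameNode
5352 DataNode
5536 SecondaryNameNode
5688 Jps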