Hadoop Installation and Configuration
1. Download and install Hadoop
Download the Hadoop release package and decompress it to a directory of your choice. The download URL is as follows:
Hadoop.apache.org/releases.ht…
2. Set environment variables
Configure the JDK and Hadoop environment variables, for example in ~/.bash_profile:
```shell
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_201.jdk/Contents/Home
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/Users/Bigdata/hadoop-3.1.4
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native/
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_ROOT_LOGGER=INFO,console
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
```
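After editing the profile, reload it with `source ~/.bash_profile`. A quick, self-contained way to sanity-check that the PATH additions took effect (using the same install path as above):

```shell
# Sanity-check the PATH additions (HADOOP_HOME is this guide's install location)
export HADOOP_HOME=/Users/Bigdata/hadoop-3.1.4
export PATH="$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin"
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "hadoop bin on PATH" ;;
  *)                      echo "hadoop bin missing from PATH" ;;
esac
# → hadoop bin on PATH
```

If the path really contains a working install, `hadoop version` should then print the release number.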
3. Configure Hadoop
Go to the etc/hadoop directory under the Hadoop installation directory ($HADOOP_HOME/etc/hadoop) and modify the following files:
- core-site.xml
- hdfs-site.xml
- mapred-site.xml
```xml
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.admin.user.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_COMMON_HOME</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_COMMON_HOME</value>
</property>
<property>
    <name>mapreduce.application.classpath</name>
    <value>/Users/Bigdata/hadoop-3.1.4/etc/hadoop,/Users/Bigdata/hadoop-3.1.4/share/hadoop/common/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/common/lib/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/hdfs/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/hdfs/lib/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/mapreduce/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/mapreduce/lib/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/yarn/*,/Users/Bigdata/hadoop-3.1.4/share/hadoop/yarn/lib/*</value>
</property>
```
- yarn-site.xml
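The contents of the other three files are not shown above. As a hedged sketch only: a minimal single-node (pseudo-distributed) Hadoop 3.x setup commonly uses properties like the following. The port 9000 and the replication factor are illustrative assumptions, not values taken from this guide:

```xml
<!-- core-site.xml: default filesystem (port 9000 is a common choice) -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>

<!-- hdfs-site.xml: single node, so keep one replica -->
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>

<!-- yarn-site.xml: enable the MapReduce shuffle service -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
```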
4. Run Hadoop
Startup prerequisite: enable remote login
On macOS, turn on Remote Login (System Preferences → Sharing). Then set up passwordless SSH: run the following commands, pressing Enter at each prompt (if asked whether to overwrite an existing key, answer y):
```shell
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
```
1) Initialize (format) the NameNode:

```shell
hdfs namenode -format
```
2) Start HDFS:
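The start command itself is not shown above; for this layout it would be the standard HDFS start script from $HADOOP_HOME/sbin:

```shell
# Starts the NameNode, DataNode, and SecondaryNameNode daemons
start-dfs.sh
```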
3) Check whether the daemons started:

```shell
jps
```
4) Check the NameNode web UI by opening the following URL: http://localhost:9870/dfshealth.html#tab-overview
5) Start YARN:

```shell
start-yarn.sh
```
At this point everything is up. Open http://localhost:8088/cluster in your browser; after a successful start, the interface shown below appears:
6) Use HDFS
- Creating a directory

```shell
hdfs dfs -mkdir /input
```
- Viewing a directory listing

```shell
hdfs dfs -ls /
```
Three directories were created this way, as shown below:
7) Test Hadoop's WordCount starter example
Preparation: upload a test .txt file to the /input directory by running:

```shell
hdfs dfs -put <local txt file path> /input
```
Then go to the share/hadoop/mapreduce directory under the Hadoop home directory and run the following command:

```shell
hadoop jar hadoop-mapreduce-examples-3.1.4.jar wordcount /input/inputWord /output/wordcount
```
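For intuition about what this job computes: WordCount's result matches a plain local pipeline like the one below (a sketch using ordinary shell tools, not Hadoop; the sample file and its path are illustrative):

```shell
# Create a tiny sample input (illustrative path)
printf 'hello hadoop\nhello world\n' > /tmp/wc_demo.txt
# Split into one word per line, then count occurrences of each word --
# the same split/group/count idea as the map and reduce phases
tr -s ' ' '\n' < /tmp/wc_demo.txt | sort | uniq -c | sort -rn
```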
The Hadoop UI now shows the job running.
View the generated output files:
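The listing itself is not reproduced here. On the cluster, the output can be inspected with the standard HDFS shell commands below; note that the part-r-00000 file name assumes the default single reducer:

```shell
# List the job output directory: a _SUCCESS marker plus part files
hdfs dfs -ls /output/wordcount
# Print the word counts from the reduce output file
hdfs dfs -cat /output/wordcount/part-r-00000
```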
If you want to delete the output folder, use the following command:

```shell
hadoop fs -rm -r /output
```
To toggle Hadoop debug log output, add one of the following lines to the ~/.bash_profile file:

```shell
export HADOOP_ROOT_LOGGER=DEBUG,console   # enable debug logging
export HADOOP_ROOT_LOGGER=INFO,console    # disable debug logging
```
5. Problems encountered:
1. The following warning is always displayed when running HDFS commands:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
The warning is harmless in practice (as the message says, Hadoop falls back to the built-in Java classes), but the native-library issue itself remains unresolved here for the time being.
2. The Tracking UI cannot be accessed
Go to the Hadoop sbin directory, start the JobHistoryServer service, and use jps to check that a JobHistoryServer process is running.
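The start command itself is not shown above. In Hadoop 3.x the history server can be started either with the sbin script (still present, though deprecated) or with the mapred command; both are standard Hadoop commands, not specific to this guide:

```shell
# Either of these starts the JobHistoryServer (run one, not both)
mr-jobhistory-daemon.sh start historyserver
# Hadoop 3.x preferred form:
# mapred --daemon start historyserver
```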
Then visit the Tracking UI again.