Series catalog:

Hadoop in Practice (1): Building a pseudo-distributed Hadoop 2.x environment on Aliyun

Hadoop Deployment (2): VM deployment of Hadoop in fully distributed mode

Creating the Linux VMs (all nodes)

Guest operating system: RHEL-server-6.5-x86_64.

Network connection: NAT mode. Check the "Connect automatically" box. In NAT mode the VM's IP address does not change when the host moves to a different network segment.

hostname   Address          Netmask        Gateway
cdhmaster  192.168.200.100  255.255.255.0  192.168.200.2
cdhslave1  192.168.200.101  255.255.255.0  192.168.200.2

Install type: Minimal

Turn off the firewall and SELinux (all nodes)

# Check the iptables status, then stop the service
service iptables status
service iptables stop
# Do not start iptables automatically with the operating system
chkconfig iptables off
# Check the services whose names contain "table"
chkconfig --list | grep table
# Disable SELinux; the change takes effect after a reboot
vi /etc/selinux/config
SELINUX=disabled
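The vi edit of /etc/selinux/config can also be made non-interactively, which helps when the same change has to be repeated on every node. A minimal sketch, assuming GNU sed; the function name set_selinux_disabled is hypothetical, and the file path is passed in so the function can be tried on a copy first:

```shell
# Hypothetical helper: rewrite the SELINUX= line of an SELinux config file.
# $1 = path to the config file (on the nodes this is /etc/selinux/config)
set_selinux_disabled() {
    # Replace whatever value SELINUX currently has; SELINUXTYPE= is left alone
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Usage on a node (as root); a reboot is still required afterwards:
# set_selinux_disabled /etc/selinux/config
```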

Changing the host name and configuring hosts (all nodes)

vi /etc/sysconfig/network

# cdhmaster
NETWORKING=yes
HOSTNAME=cdhmaster
GATEWAY=192.168.200.2

service network restart

# cdhslave1
NETWORKING=yes
HOSTNAME=cdhslave1
GATEWAY=192.168.200.2

service network restart

vi /etc/hosts

192.168.200.100 cdhmaster
192.168.200.101 cdhslave1
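Since /etc/hosts must carry the same entries on every node, the lines can be appended by a small script instead of being typed in vi. The following is only a sketch; add_host_entry is a hypothetical name, and the target file is a parameter so that the function can be tested against a scratch file:

```shell
# Hypothetical helper: add "IP hostname" to a hosts file unless the name is already present.
# $1 = IP address, $2 = host name, $3 = hosts file (on the nodes: /etc/hosts)
add_host_entry() {
    # -w matches the host name as a whole word, so cdhslave1 does not match cdhslave10
    if ! grep -qw -- "$2" "$3"; then
        printf '%s %s\n' "$1" "$2" >> "$3"
    fi
}

# Usage on a node (as root):
# add_host_entry 192.168.200.100 cdhmaster /etc/hosts
# add_host_entry 192.168.200.101 cdhslave1 /etc/hosts
```

Running it a second time leaves the file unchanged, so the script is safe to re-run when a node is added later.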

Set up a local yum source (master node)

Mount an ISO image file and copy the file content

mkdir -p /root/training/dvd
# Create the mount point if it does not exist yet
mkdir -p /mnt/dvd
mount /dev/cdrom /mnt/dvd
df -h
cp -av /mnt/dvd/* /root/training/dvd
umount /mnt/dvd

Create the YUM configuration file

vi /etc/yum.repos.d/local.repo

[dvd]
name=install dvd
baseurl=file:///root/training/dvd
enabled=1
# enabled=0
gpgcheck=0

# validation
yum list | grep mysql

Set up a local yum source over HTTP (master node)

Start the HTTPD service

# Check whether the httpd package is installed
rpm -qa | grep httpd
# Install httpd if it is missing
yum install -y httpd
# Start the httpd service
service httpd start
# Start httpd automatically at boot
chkconfig httpd on

Configure the yum source

# Upload rhel6.5.tar to /var/www/html and extract it
tar xvf rhel6.5.tar
# To verify, open http://cdhmaster/rhel6.5 in a browser

Create the YUM configuration file

cd /etc/yum.repos.d
cp rhel-source.repo rhel-source.repo.bak
vi /etc/yum.repos.d/rhel-source.repo

[rhel-source]
name=Red Hat Enterprise Linux $releasever - $basearch - Source
baseurl=http://cdhmaster/rhel6.5/
enabled=1
gpgcheck=0

# In local.repo, set enabled=0
# Verify
yum list | grep mysql

Yum source configuration (all nodes)

Create the YUM configuration file

cd /etc/yum.repos.d
cp rhel-source.repo rhel-source.repo.bak
vi /etc/yum.repos.d/rhel-source.repo

[rhel-source]
name=Red Hat Enterprise Linux $releasever - $basearch - Source
baseurl=http://cdhmaster/rhel6.5/
enabled=1
gpgcheck=0

# If local.repo exists on the node, set enabled=0 in it
# Verify
yum list | grep mysql
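Because the same repo definition is needed on every node, it can be generated with a here-document rather than typed in vi. This is a sketch only; write_repo is a hypothetical helper, and the display name in the usage line is an arbitrary example:

```shell
# Hypothetical helper: write a minimal yum repo definition.
# $1 = repo id, $2 = display name, $3 = baseurl, $4 = output file
write_repo() {
    cat > "$4" <<EOF
[$1]
name=$2
baseurl=$3
enabled=1
gpgcheck=0
EOF
}

# Usage on a node (as root):
# write_repo rhel-source 'Red Hat Enterprise Linux $releasever - $basearch - Source' \
#     http://cdhmaster/rhel6.5/ /etc/yum.repos.d/rhel-source.repo
```

Single quotes around the display name keep $releasever and $basearch literal, so that yum expands them rather than the shell.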

NTP time synchronization (all nodes)

# Check whether NTP is installed
rpm -qa | grep ntp
yum install -y ntp ntpdate
service ntpd start
chkconfig ntpd on
service ntpdate start
chkconfig ntpdate on

# Use cdhmaster as the NTP server: set its system time and save it to the hardware clock
date -s "20171024 14:04:00"
hwclock --systohc

# Synchronize time between the nodes in the cluster
vi /etc/ntp.conf
server 192.168.200.100
service ntpd restart
# Verify
ntpdc -c loopinfo
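The /etc/ntp.conf change can be scripted as well. The sketch below goes one step beyond the text: it also comments out any server lines that are already active, so that cdhmaster is the only time source. That extra step is an assumption, not something the procedure above requires, and use_ntp_server is a hypothetical name. GNU sed is assumed.

```shell
# Hypothetical helper: make one address the only active "server" entry in an ntp.conf.
# $1 = NTP server address, $2 = config file (on the nodes: /etc/ntp.conf)
use_ntp_server() {
    # Comment out every currently active server line (assumption: they are not wanted)
    sed -i 's/^server /#server /' "$2"
    # Append the cluster's own NTP server
    printf 'server %s\n' "$1" >> "$2"
}

# Usage on a slave node (as root), followed by: service ntpd restart
# use_ntp_server 192.168.200.100 /etc/ntp.conf
```

Re-running the function keeps exactly one active server line, although each run leaves one more commented-out line behind.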

Configuring Kernel Parameters (all nodes)

Disable transparent huge pages (THP). First check whether THP is enabled: an output of [always] madvise never means that it is enabled.

# Check the current setting
cat /sys/kernel/mm/transparent_hugepage/defrag
# [always] madvise never
# Disable it for the running system
echo never > /sys/kernel/mm/transparent_hugepage/defrag
# Add the following two lines to /etc/rc.local so that the setting survives a reboot
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
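The kernel marks the active THP value with square brackets, so a check script only needs to extract the bracketed word. A small sketch; thp_active_mode is a hypothetical name:

```shell
# Hypothetical helper: print the active (bracketed) value of a THP setting line.
# $1 = a line such as "[always] madvise never"
thp_active_mode() {
    # Keep only the text between the square brackets
    printf '%s\n' "$1" | sed 's/.*\[\(.*\)\].*/\1/'
}

# Usage on a node:
# thp_active_mode "$(cat /sys/kernel/mm/transparent_hugepage/defrag)"
```

If the printed value is anything other than never, the setting still needs to be changed. A line without brackets is printed unchanged.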

Reduce memory swapping. The Linux kernel parameter vm.swappiness ranges from 0 to 100 and controls how readily the kernel swaps memory pages out to disk. It is often described as a usage threshold: with 64 GB of total memory and vm.swappiness set to 60, swapping would start once memory usage reaches 64*0.4=25.6 GB. Strictly speaking, the value is a relative weight between swapping out application memory and reclaiming page cache rather than a hard threshold, but in either reading a high value makes swapping more likely, and swapping inevitably hurts system performance. Cloudera therefore recommends lowering the value to between 1 and 10.

# Reduce memory swapping (default value: 60); takes effect immediately but is lost on reboot
sysctl -w vm.swappiness=0
# Write the setting to the configuration file so that it persists across reboots
echo "vm.swappiness=0" >> /etc/sysctl.conf
# Verify
cat /proc/sys/vm/swappiness
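Appending to /etc/sysctl.conf with echo adds a duplicate line each time the step is repeated. One possible alternative is to replace an existing entry when there is one and append only otherwise. A sketch assuming GNU sed; set_sysctl_conf is a hypothetical name:

```shell
# Hypothetical helper: set "key=value" in a sysctl-style file, replacing an existing entry.
# $1 = key, $2 = value, $3 = file (on the nodes: /etc/sysctl.conf)
# Note: the dot in the key acts as a regex wildcard, which is harmless for vm.swappiness.
set_sysctl_conf() {
    if grep -q "^$1[ ]*=" "$3"; then
        # The key is already present: rewrite that line in place
        sed -i "s/^$1[ ]*=.*/$1=$2/" "$3"
    else
        # The key is absent: append it
        printf '%s=%s\n' "$1" "$2" >> "$3"
    fi
}

# Usage on a node (as root):
# set_sysctl_conf vm.swappiness 0 /etc/sysctl.conf
```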

CDH installation and configuration

Building the cm5.9.0.tar package

1. CM5 RPM packages: http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.9.0/RPMS/x86_64/

cloudera-manager-agent-5.9.0-1.cm590.p0.249.el6.x86_64.rpm
cloudera-manager-daemons-5.9.0-1.cm590.p0.249.el6.x86_64.rpm
cloudera-manager-server-5.9.0-1.cm590.p0.249.el6.x86_64.rpm
cloudera-manager-server-db-2-5.9.0-1.cm590.p0.249.el6.x86_64.rpm
enterprise-debuginfo-5.9.0-1.cm590.p0.249.el6.x86_64.rpm
jdk-6u31-linux-amd64.rpm
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm

2. cloudera-manager-installer.bin: http://archive.cloudera.com/cm5/installer/5.9.0/

cloudera-manager-installer.bin

3. CDH5 parcel: http://archive.cloudera.com/cdh5/parcels/5.9.0/

CDH-5.9.0-1.cdh5.9.0.p0.23-el6.parcel
CDH-5.9.0-1.cdh5.9.0.p0.23-el6.parcel.sha
manifest.json

4. Generate the repodata directory files.

Question: how are the files under /var/www/html/cm5.9.0/repodata produced?

1. Install createrepo with yum install -y createrepo.noarch:

which createrepo
yum list | grep createrepo
yum install -y createrepo.noarch

2. Manually download the RPM files from http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.9.0/RPMS/x86_64/ into the cm5.9.0 directory.

3. Run createrepo cm5.9.0 to regenerate the files in the repodata directory. The repository can then be opened in a browser at http://cdhmaster/cm5.9.0/. Make sure the httpd service is running.
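Before running createrepo, it can be worth checking that every expected RPM actually arrived in the directory, since a missing package only shows up later as a failed yum install. A sketch of such a check; missing_files is a hypothetical helper:

```shell
# Hypothetical helper: print every expected file that is absent from a directory.
# $1 = directory; remaining arguments = expected file names
missing_files() {
    dir=$1
    shift
    for name in "$@"; do
        # Print the name only when the file is not there
        [ -f "$dir/$name" ] || printf '%s\n' "$name"
    done
}

# Usage on the master node (add the remaining file names in the same way):
# missing_files /var/www/html/cm5.9.0 \
#     cloudera-manager-agent-5.9.0-1.cm590.p0.249.el6.x86_64.rpm \
#     cloudera-manager-server-5.9.0-1.cm590.p0.249.el6.x86_64.rpm
```

Empty output means that all listed files are present.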

Configure the local CM yum source (master node)

# Upload cm5.9.0.tar to /var/www/html and extract it
tar xvf cm5.9.0.tar
# To verify, open http://cdhmaster/cm5.9.0 in a browser

Creating the CM yum configuration file (all nodes)

vi /etc/yum.repos.d/cloudera-manager.repo

[cloudera-manager]
name = Cloudera Manager, Version 5.9.0
baseurl = http://cdhmaster/cm5.9.0/
gpgcheck = 0

# validation
yum list | grep cloudera

Installing the JDK (all nodes)

On the master node, the JDK is installed during the Cloudera Manager installation; the default location is /usr/java/jdk1.7.0_67-cloudera. If an installation fails and the log shows that no JDK was found, install the JDK manually on the slave nodes.

# Upload oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm to /root/training
rpm -ivh oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
# The JDK is installed in /usr/java by default.

Configuring a Parcel (Master Node)

mkdir -p /opt/cloudera/parcel-repo
# Upload the following three files to this directory
# CDH-5.9.0-1.cdh5.9.0.p0.23-el6.parcel
# CDH-5.9.0-1.cdh5.9.0.p0.23-el6.parcel.sha
# manifest.json
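A corrupted upload can be caught before the installation by comparing the parcel with its .sha file. The sketch below assumes that the .sha file holds the SHA-1 hex digest of the parcel; that is an assumption about the file's contents, not something stated above. parcel_sha_ok is a hypothetical name.

```shell
# Hypothetical helper: succeed only if a parcel matches the digest stored next to it.
# $1 = path to the .parcel file; the digest is read from "$1.sha"
# Assumption: the .sha file contains the parcel's SHA-1 hex digest as its first field.
parcel_sha_ok() {
    expected=$(cut -d ' ' -f 1 "$1.sha")
    actual=$(sha1sum "$1" | cut -d ' ' -f 1)
    [ "$expected" = "$actual" ]
}

# Usage on the master node:
# parcel_sha_ok /opt/cloudera/parcel-repo/CDH-5.9.0-1.cdh5.9.0.p0.23-el6.parcel && echo "parcel OK"
```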

Run the CM installation (master node)

Installation process logs are recorded in /var/log/cloudera-manager-installer

cd /var/www/html/cm5.9.0
# Do not generate the repo file in /etc/yum.repos.d
./cloudera-manager-installer.bin --skip_repo_package=1

Keep pressing Next. If the following message is displayed, CM has been installed successfully.

Point your web browser to http://192.168.200.100:7180/. Log in to Cloudera Manager with username: 'admin' and password: 'admin' to continue installation.

Wait a few minutes (how long depends on the machine's resources), then open http://192.168.200.100:7180/ in a browser and add services as needed.
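Instead of guessing how long to wait, the web interface can be polled until it answers. The following is a generic retry sketch; retry_until_ok is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary choices:

```shell
# Hypothetical helper: retry a command until it succeeds or the attempts run out.
# $1 = maximum attempts, $2 = seconds between attempts, remaining arguments = command
retry_until_ok() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Stop as soon as the command succeeds
        "$@" && return 0
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Usage: poll the Cloudera Manager address for up to about 10 minutes
# retry_until_ok 60 10 curl -sf -o /dev/null http://192.168.200.100:7180/
```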

CM installation logs are recorded in /var/log/cloudera-manager-installer.

Parcel installation logs are recorded in /var/log/cloudera-scm-agent and /var/log/cloudera-scm-server.



When reprinting, please credit the source: WeChat official account "Data Analysis".

