Install CDH 5.14.2 on CentOS 7
Download CentOS7
CM Download Address
CDH download address
The major versions of CM and CDH must match (e.g. CM 5.14 with CDH 5.14), and both must correspond to the OS version.
Here’s what you can download:
CentOS
CM
CDH: CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel.sha1
CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel
manifest.json
Environment preparation
Install the system
The installation process will not be described. If you choose automatic partition during the installation process, you can adjust the partition size as needed.
== All of the following operations are done under root ==
Resize partitions
# Overall process for enlarging the root partition on CentOS 7 at the expense of /home:
# back up /home, delete the /home logical volume, extend the root filesystem,
# recreate /home, then restore its contents.
# 1. Back up /home
tar cvf /tmp/home.tar /home
# 2. Unmount /home; if it is busy, kill the processes using it first
fuser -km /home
umount /home
# 3. Delete the /home logical volume
lvremove /dev/mapper/centos-home
# 4. Extend the root LV by 800G
lvextend -L +800G /dev/mapper/centos-root
# 5. Grow the root (XFS) filesystem
xfs_growfs /dev/mapper/centos-root
# 6. Recreate the /home logical volume (size as needed, e.g. 100G)
lvcreate -L 100G -n home centos
# 7. Create the filesystem
mkfs.xfs /dev/mapper/centos-home
# 8. Mount /home
mount /dev/mapper/centos-home /home
# 9. Restore the contents
tar xvf /tmp/home.tar -C /home/
cd /home/home && mv * ../
Mount the data disk
Disks larger than 2 TB must be partitioned with the parted command (GPT); refer to: https://www.cnblogs.com/Eason…
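As a sketch of what the parted workflow might look like (the device name /dev/sdb and mount point /data below are assumptions for illustration, not values from the original):

```shell
# create a GPT label (required for disks larger than 2 TB)
parted /dev/sdb mklabel gpt
# create one partition spanning the whole disk
parted /dev/sdb mkpart primary xfs 0% 100%
# create a filesystem and mount it
mkfs.xfs /dev/sdb1
mkdir -p /data
mount /dev/sdb1 /data
# persist the mount across reboots
echo '/dev/sdb1 /data xfs defaults 0 0' >> /etc/fstab
```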
Configure the network
vim /etc/sysconfig/network-scripts/ifcfg-eth1
Disable IPv6
echo "alias net-pf-10 off" >> /etc/modprobe.d/dist.conf
echo "alias ipv6 off" >> /etc/modprobe.d/dist.conf
Configure the hostname
hostnamectl set-hostname cdh1
exec bash
Set up /etc/hosts
vi /etc/hosts
192.168.140.110 cdh1
192.168.140.111 cdh2
192.168.140.112 cdh3
Disable SELinux
vi /etc/selinux/config
SELINUX=disabled
setenforce 0
Turn off the firewall
systemctl stop firewalld
systemctl disable firewalld
Set the language code and time zone
echo 'export TZ=Asia/Shanghai' >> /etc/profile
echo 'export LANG=en_US.UTF-8' >> /etc/profile
source /etc/profile
Modify the SSH port
vim /etc/ssh/sshd_config
# set: Port 2200
systemctl restart sshd
Configure passwordless SSH login
# Generate a key pair (stored under ~/.ssh/ by default)
ssh-keygen -t rsa
# Copy the public key to all machines
ssh-copy-id root@cdh1
ssh-copy-id root@cdh2
ssh-copy-id root@cdh3
# If the SSH port was changed, execute the following instead,
# or copy id_rsa.pub of all nodes into ~/.ssh/authorized_keys directly
ssh-copy-id -i ~/.ssh/id_rsa.pub "-p 2200 root@cdh1"
ssh-copy-id -i ~/.ssh/id_rsa.pub "-p 2200 root@cdh2"
ssh-copy-id -i ~/.ssh/id_rsa.pub "-p 2200 root@cdh3"
Configure the NTP service
vi /etc/ntp.conf
# master:
restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
server 127.127.1.0
fudge 127.127.1.0 stratum 10
# slave: point the server directive at the master node
# Enable NTP on boot and start it:
chkconfig ntpd on
service ntpd start
Disable transparent huge pages:
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
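The two echoes above only last until the next reboot; one common way to make them persistent (an addition not in the original steps) is to append them to /etc/rc.local:

```shell
# re-apply the transparent hugepage settings at boot
cat >> /etc/rc.local <<'EOF'
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
EOF
chmod +x /etc/rc.local
```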
Adjust swappiness
echo "vm.swappiness=10" >> /etc/sysctl.conf
sysctl -p
Set the maximum number of open files and the maximum number of user processes:
vim /etc/security/limits.conf
# append the following at the end
* soft nofile 32768
* hard nofile 1048576
* soft nproc 65536
* hard nproc unlimited
* soft memlock unlimited
* hard memlock unlimited
Java
The official download address: https://www.oracle.com/techne…
Select the RPM package and install it on all nodes. CDH looks for the JDK in the default RPM installation path; if it is installed elsewhere, an error will be reported during the CM installation. The RPM installation also creates the environment variables automatically.
<img src="assets/image-20191106204438971.png" />
# Check whether Java is already installed
rpm -qa | grep java
# Uninstall any existing package
rpm -e --nodeps xxx
# Install Java
rpm -ivh jdk-8u91-linux-x64.rpm
# The RPM package does not require configuring environment variables by hand
echo "JAVA_HOME=/usr/java/latest/" >> /etc/environment
source /etc/environment
MySQL
The official download address: https://dev.mysql.com/downloa…
Installation reference: https://segmentfault.com/a/11…
<img src="assets/image-20191106202109140.png" alt="image-20191106202109140" style="zoom:67%;" />
<img src="assets/image-20191106202155189.png" alt="image-20191106202155189" style="zoom:67%;" />
Installation
# Check whether mariadb or mysql is already installed
rpm -qa | grep mariadb
rpm -qa | grep -i mysql
# If so, remove it
rpm -e --nodeps mariadb-libs-5.5.56-2.el7.x86_64
# Unpack the bundle
tar -xvf mysql-5.6.38-1.el6.x86_64.rpm-bundle.tar
# Install the server and client packages
rpm -ivh mysql-server-5.6.38-1.el6.x86_64.rpm
rpm -ivh mysql-client-5.6.38-1.el6.x86_64.rpm
Configure my.cnf
[client]
default-character-set=utf8

[mysqld]
default-storage-engine=INNODB
character-set-server=utf8
collation-server=utf8_general_ci
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
# symbolic-links = 0

key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M

# log_bin should be on a disk with enough free space.
# Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
# system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log

# For MySQL version 5.1.8 or later. For older versions, reference MySQL
# documentation for configuration help.
binlog_format = mixed

read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
Enable start on boot
chkconfig --add mysqld
Restart MySQL
service mysql restart
# the initial random password is stored in /root/.mysql_secret
cat /root/.mysql_secret
mysql -uroot -p123456
Initialize MySQL
grant all privileges on *.* to 'root'@'%' identified by '123456' with grant option;
flush privileges;
create database if not exists amon default charset utf8 collate utf8_general_ci;
create database if not exists rman default charset utf8 collate utf8_general_ci;
create database if not exists nav default charset utf8 collate utf8_general_ci;
create database if not exists navms default charset utf8 collate utf8_general_ci;
create database if not exists hue default charset utf8 collate utf8_general_ci;
create database if not exists sentry default charset utf8 collate utf8_general_ci;
create database if not exists hive;
create database if not exists oozie;
create database if not exists scm;
grant all on hive.* to 'hive'@'%' identified by 'hive' with grant option;
grant all on oozie.* to 'oozie'@'%' identified by 'oozie' with grant option;
grant all on hue.* to 'hue'@'%' identified by 'hue' with grant option;
grant all on amon.* to 'amon'@'%' identified by 'amon' with grant option;
grant all on rman.* to 'rman'@'%' identified by 'rman' with grant option;
grant all on nav.* to 'nav'@'%' identified by 'nav' with grant option;
grant all on navms.* to 'navms'@'%' identified by 'navms' with grant option;
grant all on sentry.* to 'sentry'@'%' identified by 'sentry' with grant option;
grant all on *.* to 'scm'@'%' identified by 'scm' with grant option;
flush privileges;
Copy the MySQL driver package into the Java shared directory; this must be done on all three servers
mkdir -p /usr/share/java
cp mysql-connector-java-5.1.44-bin.jar /usr/share/java/mysql-connector-java.jar
# Distribute the MySQL driver package to the other servers
scp -P2200 /usr/share/java/mysql-connector-java.jar root@cdh2:/usr/share/java
scp -P2200 /usr/share/java/mysql-connector-java.jar root@cdh3:/usr/share/java
Cloudera Manager
Installation steps
# 0. (Optional) Allow the dba user to sudo without a password
chmod u+w /etc/sudoers
vim /etc/sudoers   # add the line: dba ALL=(ALL) NOPASSWD: ALL
chmod u-w /etc/sudoers
# 1. On all nodes, create the directory:
mkdir /opt/cloudera-manager
# 2. Upload cloudera-manager-centos7-cm5.14.2_x86_64.tar.gz to /opt/ on all nodes
# 3. On all nodes, unpack the file into /opt/cloudera-manager
tar -zxvf /opt/cloudera-manager-centos7-cm5.14.2_x86_64.tar.gz -C /opt/cloudera-manager
# 4. On all nodes, create the user
sudo useradd --system --home=/opt/cloudera-manager/cm-5.14.2/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
# 5. On the Server node, create the CM service's local data storage directory
sudo mkdir /var/lib/cloudera-scm-server
sudo chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-server
# 6. On all Agent nodes, point the agent at the server:
# edit /opt/cloudera-manager/cm-5.14.2/etc/cloudera-scm-agent/config.ini and set
# server_host=cdh1
# 7. On all nodes, put the database driver prepared above into /usr/share/java
#    (already uploaded when installing MySQL; the JDBC driver can be downloaded from
#    http://dev.mysql.com/downloads/connector/j/, unzip it and find mysql-connector-java-5.1.**-bin.jar)
# 8. On the Server node, initialize the SCM database.
#    Arguments: database type, database name, database user name, password
/opt/cloudera-manager/cm-5.14.2/share/cmf/schema/scm_prepare_database.sh mysql scm scm scm
# If MySQL is not on this node, the command is:
/opt/cloudera-manager/cm-5.14.2/share/cmf/schema/scm_prepare_database.sh -h 192.168.9.20 mysql scm scm scm
# 9. On the Server node, create the local parcel repository
mkdir /opt/cloudera
cd /opt/cloudera
mkdir parcel-repo
# 10. Put the 3 CDH files prepared above into /opt/cloudera/parcel-repo:
#   CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel
#   CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel.sha (must be renamed from .sha1)
#   manifest.json
# 11. On the Server node (cdh1), start the CM server
/opt/cloudera-manager/cm-5.14.2/etc/init.d/cloudera-scm-server start
# 12. On the Agent nodes (cdh1-cdh3), start the CM agent
/opt/cloudera-manager/cm-5.14.2/etc/init.d/cloudera-scm-agent start
# 13. Wait a moment, then visit serverip:7180; when the CM interface appears,
#     the username and password are both admin. What follows is step-by-step configuration.
== Be sure to rename CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel.sha1 to CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel.sha ==
Open http://masterIP:7180/ in a browser; the login username and password are both admin
Reinstalling after a failure
# Delete the cm database from MySQL
mysql -uroot -p
show databases;
drop database cm;
# Unmount the agent process filesystem and remove the installation
umount cm-5.16.2/run/cloudera-scm-agent/process
rm -rf cm-5.16.2/
CDH
Then switch to the CM page. After logging in to CM, tick Yes and continue
Select the free version and continue
Continue
After each Agent node starts normally, the corresponding node can be seen in the host list currently managed. Select the nodes you want to install, select all nodes here, and proceed
The following version of the Parcel appears, indicating that the local Parcel is configured correctly
If the local Parcel is configured correctly, then the download is instantaneous, as there is no need to download it, and you can wait for the allocation process, depending on the speed of the internal network. Click Continue when you’re done.
Next is the host inspection. If you followed the steps above exactly, no problems should appear and all checks pass green. Click Finish.
The next step is to select the installation service, select according to your needs, and click Continue
Service configuration, according to the node allocation, click Continue
Next is the setup of the database
The following is the review page of cluster setup, modify log and other directories, continue.
Pass all green and click Continue.
At this point the CDH cluster is deployed
Spark2
Download the files
Download the installation package: http://archive.cloudera.com/s…
SPARK2_ON_YARN-2.3.0.cloudera3.jar
Download the parcel file: http://archive.cloudera.com/s…
SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809-el7.parcel
SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809-el7.parcel.sha1
manifest.json
Note: choose the files matching your OS and CDH version.
Since my system is CentOS 7, el7 was selected, and cloudera3 is the corresponding parcel.
Installation steps
1. Upload the CSD package to the directory /opt/cloudera/ CSD on the CM node
chown cloudera-scm:cloudera-scm SPARK2_ON_YARN-2.3.0.cloudera3.jar
2. Copy the parcel files to /opt/cloudera/parcel-repo on the CM node:
SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809-el7.parcel
SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809-el7.parcel.sha
manifest.json
3. Restart CM and the cluster
/opt/cloudera-manager/cm-5.14.2/etc/init.d/cloudera-scm-agent restart
/opt/cloudera-manager/cm-5.14.2/etc/init.d/cloudera-scm-server restart
If a manifest.json file already exists in parcel-repo, rename the old one before uploading.
4. Install SPARK2 via CM
On the CM page, go to Hosts -> Parcel; the new Spark2 parcel appears:
2.3.0.cloudera3-1.cdh5.13.3.p0.458809
Then Download -> Distribute -> Activate
5. Add the service in the cluster
Go to Add Service and select Spark2 Service
Select a set of dependencies
To assign roles
Encryption is not selected by default
Proceed to the next installation
The installation is complete
When done, start the SPARK2 service
Component upgrade
hive:https://blog.csdn.net/weixin_…
Pitfalls
1. Hive single-user problem
Phenomenon 1: after logging in to Hive as the hdfs user, opening a separate window (same server) and launching spark2-shell as the hdfs user fails with: ERROR XSDB6: Another instance of Derby may have already booted the database.
Phenomenon 2: operations performed on cdh1 (creating databases, creating tables, inserting data) cannot be seen on cdh2-4; the data on each node is isolated.
Problem location: when installing Hive, MySQL was selected as the metastore database, and the corresponding configuration is visible on the CM management page, but Hive on the server actually defaults to Derby.
Fix: change the database configuration in hive-site.xml to MySQL.
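For illustration, the metastore section of hive-site.xml might look like the following, assuming the hive database and hive user created during the MySQL setup above (the host, user, and password are examples, not values from the original):

```xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://cdh1:3306/hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>
```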
2. spark2-shell startup error
Phenomenon: spark2-shell can only be executed on the Spark2 installation node (where the History Server is installed); the other nodes report an error: slf4j not found.
Problem location: the Spark2 installation node has configuration files under /etc/spark2/conf; the other nodes do not.
Solution: copy the Spark2 configuration files under /etc/spark2/conf on the installation node to the /etc/spark2/conf/ directory on the other nodes.
The problem is similar to the one above: in both cases configuration files did not take effect. Presumably this is related to the changed SSH port (port 22 is not allowed on the internal network, so another port is used).
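Assuming the conventions used in this install (SSH port 2200, Spark2 configuration present on cdh1, other nodes cdh2 and cdh3; these names are this guide's setup, adjust as needed), the copy step might be scripted as:

```shell
# run on the node that already has /etc/spark2/conf
for host in cdh2 cdh3; do
  ssh -p 2200 root@"$host" "mkdir -p /etc/spark2"
  scp -P 2200 -r /etc/spark2/conf root@"$host":/etc/spark2/
done
```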
3, Beeline and JDBC connection to Hive failed
Phenomenon: connecting to Hive with Beeline is intermittent; JDBC connections time out.
Problem location: the hive.server2.thrift.bind.host setting cannot be found in hive-site.xml.
Solution: modify hive-site.xml and add hive.server2.thrift.bind.host:
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>0.0.0.0</value>
  <description>Host set to 0.0.0.0 to accept connections from any source IP. Bind host on which to run the HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
</property>
4, ElasticSearch6 Alias problem
Phenomenon: an index created by a script that does not specify an alias nevertheless ends up with an alias on ES, causing the search page to display incorrectly and index operations to fail because ES cannot find the index.
Problem location: the page showed abnormal field information, the background log reported that the index could not be found, and checking the index status revealed the alias.
Solution: delete the alias.
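A sketch of locating and deleting the alias through the Elasticsearch REST API (localhost:9200 and the index/alias names below are placeholders for illustration, not values from the original):

```shell
# list all aliases to find the offending one
curl -XGET 'http://localhost:9200/_cat/aliases?v'
# remove the alias from the index
curl -XDELETE 'http://localhost:9200/my_index/_alias/my_alias'
```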