Environment to prepare
- Prepare three Linux (CentOS 7) VMs: `10.58.12.170`, `10.58.12.171`, `10.58.10.129` (user `tdops`)
- Software versions
  - JDK 1.8.0_60
  - Scala 2.11.12
  - Hadoop 3.1.3
  - Spark 2.4.6
  - Livy 0.7.0
Configure hosts
- `sudo vim /etc/hosts`

  ```
  # Add the following host entries
  10.58.12.171 ailoan-vip-d-012171.hz.td
  10.58.12.170 ailoan-vip-d-012170.hz.td
  10.58.10.129 ailoan-vip-d-010129.hz.td
  ```
Configure passwordless SSH login between the three machines
- Install the OpenSSH server

  ```shell
  sudo yum install openssh-server
  ```
- Generate a key pair

  ```shell
  ssh-keygen -t rsa
  ```
- Append each machine's public key to `~/.ssh/authorized_keys` on the other machines
- Verify the passwordless login

  ```shell
  # From 170
  ssh ailoan-vip-d-012171.hz.td
  ```
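The key steps above can be sketched end to end. This is a minimal sketch assuming the default RSA key type and the `tdops` user; a scratch directory stands in for `~/.ssh` so the commands are safe to replay without touching real keys:

```shell
# Generate a passphrase-less key pair and append the public key,
# as the steps above do on each machine.
KEYDIR=$(mktemp -d)
ssh-keygen -q -t rsa -N "" -f "$KEYDIR/id_rsa"          # no passphrase
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"   # the "append" step
chmod 600 "$KEYDIR/authorized_keys"
# On the real machines, ssh-copy-id performs the append remotely, e.g.:
#   ssh-copy-id tdops@ailoan-vip-d-012171.hz.td
```

In practice, run the generation on each of the three hosts and append every host's public key to every other host's `authorized_keys`.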
Install the JDK
Install Scala
- Download the installation package from downloads.lightbend.com/scala/2.11….
- Copy the installation package to the target machine

  ```shell
  scp "{username}@localip:/Users/{username}/Downloads/bigdata software/scala-2.11.12.tgz" /usr/install/bigdata
  ```
- Extract it to the target directory

  ```shell
  sudo tar -zxvf scala-2.11.12.tgz
  ```
- Configure environment variables

  ```shell
  # Edit the environment variables
  sudo vim /etc/profile
  # Add the following configuration
  export SCALA_HOME=/usr/install/bigdata/scala-2.11.12
  export PATH=$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin
  # Reload the profile
  source /etc/profile
  ```
- Verify the installation by running `scala`

  ```
  Welcome to Scala 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60).
  ```
Install Hadoop
- Download the package

  ```shell
  cd /usr/install/bigdata
  wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.1.3/hadoop-3.1.3.tar.gz
  ```
- Extract it to the target directory

  ```shell
  sudo tar -zxvf hadoop-3.1.3.tar.gz
  ```
- Give the `tdops` user ownership of the Hadoop directory

  ```shell
  sudo chown -R tdops:users hadoop-3.1.3
  ```
- Configure environment variables and application configuration
  - Configure hadoop-env.sh

    ```shell
    # Go to the Hadoop configuration directory
    cd /usr/install/bigdata/hadoop-3.1.3/etc/hadoop
    vim hadoop-env.sh
    # Add the JDK home and log directories
    export JAVA_HOME=/usr/install/jdk1.8.0_60
    export HADOOP_LOG_DIR=/home/tdops/spark/hadoop-3.1.3/logs
    ```
  - Configure yarn-env.sh

    ```shell
    vim yarn-env.sh
    # Add the JDK home directory
    export JAVA_HOME=/usr/install/jdk1.8.0_60
    ```
  - Configure core-site.xml

    ```xml
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <!-- hostname of the master -->
            <value>hdfs://ailoan-vip-d-012170.hz.td:9000/</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/home/tdops/spark/hadoop-3.1.3/tmp</value>
        </property>
    </configuration>
    ```
  - Configure hdfs-site.xml

    ```xml
    <configuration>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>ailoan-vip-d-012170.hz.td:50090</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/home/tdops/spark/hadoop-3.1.3/dfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/home/tdops/spark/hadoop-3.1.3/dfs/data</value>
        </property>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
    </configuration>
    ```
  - Configure mapred-site.xml

    ```xml
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>
    ```
  - Configure yarn-site.xml

    ```xml
    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <!-- Temporary workaround for abnormal exits in YARN mode:
             https://stackoverflow.com/questions/41468833/why-does-spark-exit-with-exitcode-16 -->
        <property>
            <name>yarn.nodemanager.vmem-check-enabled</name>
            <value>false</value>
        </property>
        <property>
            <name>yarn.resourcemanager.address</name>
            <value>ailoan-vip-d-012170.hz.td:8032</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>ailoan-vip-d-012170.hz.td:8030</value>
        </property>
        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>ailoan-vip-d-012170.hz.td:8031</value>
        </property>
        <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>ailoan-vip-d-012170.hz.td:8033</value>
        </property>
        <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>ailoan-vip-d-012170.hz.td:8090</value>
        </property>
    </configuration>
    ```
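Note that the configuration above does not start anything: before the web UIs below will respond, HDFS has to be formatted once and the daemons started. A minimal sketch, assuming the install path and ownership used in this post (the exact commands on your cluster may differ):

```shell
# First start of HDFS and YARN on the master (guarded so this is
# harmless on a machine without Hadoop installed).
HADOOP_HOME=/usr/install/bigdata/hadoop-3.1.3   # path assumed from the steps above
if [ -x "$HADOOP_HOME/bin/hdfs" ]; then
    "$HADOOP_HOME/bin/hdfs" namenode -format    # first start only: initialise the NameNode
    "$HADOOP_HOME/sbin/start-dfs.sh"            # NameNode, DataNodes, SecondaryNameNode
    "$HADOOP_HOME/sbin/start-yarn.sh"           # ResourceManager, NodeManagers
else
    echo "hadoop not found at $HADOOP_HOME"
fi
```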
- Verify the installation
  - Access the YARN web UI at ailoan-vip-d-012170.hz.td:8090
  - Access the Hadoop HDFS web UI at ailoan-vip-d-012170.hz.td:9870/
Install Spark
- Download the Spark package
  - Download it from the official site: spark.apache.org/downloads.h…
  - Upload it to the prepared servers

    ```shell
    scp "{username}@localip:/Users/{username}/Downloads/bigdata software/spark-2.4.6-bin-hadoop2.7.tgz" /usr/install/bigdata
    ```
- Extract it to the target directory

  ```shell
  sudo tar -zxvf spark-2.4.6-bin-hadoop2.7.tgz
  ```
- Change the owning user and group

  ```shell
  sudo chown -R tdops:users spark-2.4.6-bin-hadoop2.7
  ```
- Configure environment variables and properties
  - Edit spark-env.sh under $SPARK_HOME/conf

    ```shell
    export SCALA_HOME=/usr/install/bigdata/scala-2.11.12
    export JAVA_HOME=/usr/install/jdk1.8.0_60
    export HADOOP_HOME=/usr/install/bigdata/hadoop-3.1.3
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export SPARK_LOG_DIR=/home/tdops/spark/spark-2.4.6/logs
    SPARK_MASTER_IP=ailoan-vip-d-012170.hz.td
    SPARK_LOCAL_DIRS=/usr/install/bigdata/spark-2.4.6-bin-hadoop2.7
    SPARK_DRIVER_MEMORY=512m
    ```
  - Configure the slaves file under $SPARK_HOME/conf

    ```
    # Set the other two hosts as slaves
    ailoan-vip-d-012171.hz.td
    ailoan-vip-d-010129.hz.td
    ```
- Start Spark

  ```shell
  cd $SPARK_HOME/sbin
  ./start-all.sh
  ```
- Verify the installation
  - Open ailoan-vip-d-012170.hz.td:8080/
  - Run `jps`: the master should show a Master process and each slave a Worker process
  - Run the official demo program

    ```shell
    ./spark-submit --master spark://ailoan-vip-d-012170.hz.td:7077 \
      --class org.apache.spark.examples.SparkPi \
      --deploy-mode cluster \
      file:/tmp/spark-examples_2.11-2.4.6.jar
    ```
  - The execution result can then be seen on the Spark web UI
Install Livy
- Download the package

  ```shell
  wget https://www.apache.org/dyn/closer.lua/incubator/livy/0.7.0-incubating/apache-livy-0.7.0-incubating-bin.zip
  ```
- Unzip it to the target directory

  ```shell
  sudo yum install unzip
  unzip apache-livy-0.7.0-incubating-bin.zip
  ```
- Modify the configuration files
  - Create and configure livy-env.sh

    ```
    cp livy-env.sh.template livy-env.sh
    sudo vim livy-env.sh
    # Add the following
    JAVA_HOME=/usr/install/jdk1.8.0_60
    HADOOP_CONF_DIR=/usr/install/bigdata/hadoop-3.1.3/etc/hadoop
    SPARK_HOME=/usr/install/bigdata/spark-2.4.6-bin-hadoop2.7
    LIVY_LOG_DIR=/home/tdops/spark/livy/logs
    ```
  - Create and configure livy.conf

    ```
    cp livy.conf.template livy.conf
    # Edit the configuration file
    livy.spark.deploy-mode = cluster
    livy.spark.master = spark://ailoan-vip-d-012170.hz.td:7077
    livy.file.local-dir-whitelist = /tmp
    ```
- Start Livy

  ```shell
  cd $LIVY_HOME/bin
  ./livy-server start
  ```
- View and verify Livy
  - Open ailoan-vip-d-012170.hz.td:8998
  - Submit a demo using Postman or Java code
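For a quick check without writing any client code, the same `POST /batches` submission that the Java client below performs can be made directly against Livy's REST API. A minimal sketch with curl, assuming the SparkPi jar has already been uploaded to the HDFS path used in the Java demo:

```shell
LIVY=http://ailoan-vip-d-012170.hz.td:8998
# Request body for POST /batches (the same fields the Java client sets)
cat > /tmp/livy-batch.json <<'EOF'
{
  "file": "hdfs://ailoan-vip-d-012170.hz.td:9000/example/spark-examples_2.11-2.4.6.jar",
  "className": "org.apache.spark.examples.SparkPi",
  "executorCores": 1,
  "driverCores": 1,
  "executorMemory": "512M",
  "driverMemory": "512M"
}
EOF
# Submit, then poll the returned id until the state reaches "success":
#   curl -s -H 'Content-Type: application/json' -d @/tmp/livy-batch.json "$LIVY/batches"
#   curl -s "$LIVY/batches/<id>"
```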
```java
package cn.xxx.yuntu.common.util.livy.core;

import cn.xxx.yuntu.common.util.dto.ApiResult;
import cn.xxx.yuntu.common.util.livy.vo.LivyArg;
import cn.xxx.yuntu.common.util.livy.vo.LivyStatus;
import cn.xxx.yuntu.common.util.livy.vo.LivyResult;
import cn.xxx.yuntu.common.util.util.HttpUtil;
import cn.xxx.yuntu.common.util.util.LogUtil;
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import org.apache.commons.lang3.StringUtils;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * @author li.minqiang
 * @date 2019/12/5
 */
public class LivyClient {
    public static final String DELETED = "deleted";
    public static final String LIVY_BATCH_URI = "%s/batches/%s";
    public static final String MSG = "msg";
    public static final String NOT_FOUND = "not found";
    public static final String SESSION = "Session";
    private final static String STARTING = "starting";
    private LivyArg livyArg;

    private LivyClient() {
    }

    public static LivyClient getInstance(LivyArg livyArg) {
        LivyClient livyClient = new LivyClient();
        livyClient.setLivyArg(livyArg);
        return livyClient;
    }

    public ApiResult submitSparkJar() {
        // Build the POST /batches request body from the task arguments
        JSONObject data = new JSONObject();
        data.put("file", livyArg.getJarPath());
        data.put("className", livyArg.getClassName());
        data.put("name", "testLivy" + System.currentTimeMillis());
        data.put("executorCores", livyArg.getExecutorCores());
        data.put("executorMemory", livyArg.getExecutorMemory());
        data.put("driverCores", livyArg.getDriverCores());
        data.put("driverMemory", livyArg.getDriverMemory());
        data.put("numExecutors", livyArg.getNumExecutors());
        data.put("conf", livyArg.getConf());
        data.put("args", livyArg.getArgs());
        ApiResult apiResult = HttpUtil.postJson(String.format("%s/batches", livyArg.getLivyServer()), data);
        JSONObject obj = (JSONObject) apiResult.getResult();
        if (!apiResult.isSuccess() || StringUtils.isEmpty(obj.getString("state"))) {
            LogUtil.error("make livy request error:{}", JSON.toJSONString(apiResult));
            return ApiResult.failure("Failed to submit livy task");
        }
        LogUtil.info("livy submit result:{}", JSON.toJSONString(apiResult, true));
        return ApiResult.successWithResult(obj.getString("id"));
    }

    private String getLivyUrl() {
        return String.format("%s/ui/batch/%s/log", livyArg.getLivyServer(), livyArg.getTaskId());
    }

    private List<String> makeListLogs(JSONArray logs) {
        List<String> mlogs = new ArrayList<String>();
        if (logs == null) {
            return mlogs;
        }
        for (int i = 0; i < logs.size(); i++) {
            mlogs.add(logs.getString(i));
        }
        return mlogs;
    }

    private List<String> getLivyServerLogs() {
        String url = String.format("%s/batches/%s/log?size=-1", livyArg.getLivyServer(), livyArg.getTaskId());
        ApiResult apiResult = HttpUtil.get(url);
        if (apiResult.isSuccess()) {
            JSONObject r = (JSONObject) apiResult.getResult();
            return makeListLogs(r.getJSONArray("log"));
        }
        return new ArrayList<String>();
    }

    public LivyArg getLivyArg() {
        return livyArg;
    }

    public void setLivyArg(LivyArg livyArg) {
        this.livyArg = livyArg;
    }

    public static void main(String[] args) {
        // JavaWordCount();
        SparkPi();
    }

    public static void SparkPi() {
        LivyClient livyClient = new LivyClient();
        LivyArg livyArg = new LivyArg();
        livyArg.setLivyServer("http://ailoan-vip-d-012170.hz.td:8998");
        livyArg.setJarPath("hdfs://ailoan-vip-d-012170.hz.td:9000/example/spark-examples_2.11-2.4.6.jar");
        livyArg.setClassName("org.apache.spark.examples.SparkPi");
        livyArg.setExecutorCores(1);
        livyArg.setDriverCores(1);
        livyArg.setExecutorMemory("512M");
        livyArg.setDriverMemory("512M");
        livyClient.setLivyArg(livyArg);
        ApiResult apiResult = livyClient.submitSparkJar();
        System.out.println(apiResult.getCode() + "-" + apiResult.getReason());
        System.out.println(apiResult);
    }

    public static void JavaWordCount() {
        LivyClient livyClient = new LivyClient();
        LivyArg livyArg = new LivyArg();
        livyArg.setLivyServer("http://ailoan-vip-d-012170.hz.td:8998");
        livyArg.setJarPath("hdfs://ailoan-vip-d-012170.hz.td:9000/example/spark-examples_2.11-2.4.6.jar");
        livyArg.setClassName("org.apache.spark.examples.JavaWordCount");
        livyArg.setExecutorCores(1);
        livyArg.setDriverCores(1);
        livyArg.setExecutorMemory("512M");
        livyArg.setDriverMemory("512M");
        List<String> args = new ArrayList<>();
        args.add("hdfs://ailoan-vip-d-012170.hz.td:9000/example/wordCount.txt");
        livyArg.setArgs(args);
        livyClient.setLivyArg(livyArg);
        ApiResult apiResult = livyClient.submitSparkJar();
        System.out.println(apiResult.getCode() + "-" + apiResult.getReason());
        System.out.println(apiResult);
    }
}
```
- As before, the submission records and results can be viewed on the Spark web UI