Prepare the environment

  • Prepare Linux (CentOS 7) VMs

    10.58.12.170
    10.58.12.171
    10.58.10.129

    Working user: tdops
  • Software version

    • JDK 1.8.0_60
    • Scala 2.11.12
    • Hadoop 3.1.3
    • Spark 2.4.6
    • Livy 0.7.0

Configure hosts

  • sudo vim /etc/hosts
    # Add the following host entries
    10.58.12.171 ailoan-vip-d-012171.hz.td
    10.58.12.170 ailoan-vip-d-012170.hz.td
    10.58.10.129 ailoan-vip-d-010129.hz.td

Configure no-secret login for the three machines

  1. Install the OpenSSH server

    sudo yum install openssh-server
  2. Generate a key pair

    ssh-keygen -t rsa
  3. Append each machine's public key to the others' authorized_keys

  4. Test the connection

    # From 10.58.12.170
    ssh ailoan-vip-d-012171.hz.td
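Step 3 above can also be scripted with ssh-copy-id instead of appending keys by hand; a minimal sketch, assuming the tdops user on all three machines (run it on each host):

```shell
# Copy this machine's public key into authorized_keys on every host
for host in ailoan-vip-d-012170.hz.td \
            ailoan-vip-d-012171.hz.td \
            ailoan-vip-d-010129.hz.td; do
  ssh-copy-id tdops@"$host"
done
```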

Install the JDK
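This section has no commands in the original; a minimal sketch mirroring the Scala steps below, assuming the jdk-8u60 Linux x64 tarball (the archive name is an assumption) has already been copied to /usr/install:

```shell
cd /usr/install
sudo tar -zxvf jdk-8u60-linux-x64.tar.gz   # extracts to jdk1.8.0_60
# Append to /etc/profile, then run: source /etc/profile
export JAVA_HOME=/usr/install/jdk1.8.0_60
export PATH=$PATH:$JAVA_HOME/bin
```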

Install Scala

  1. Download the installation package

    downloads.lightbend.com/scala/2.11….

  2. Copy the installation package to the target machine

    scp {username}@{local_ip}:"/Users/{username}/Downloads/bigdata software/scala-2.11.12.tgz" /usr/install/bigdata
  3. Extract to the target directory

    sudo tar -zxvf scala-2.11.12.tgz
  4. Configuring environment Variables

    # Edit the environment variables
    sudo vim /etc/profile
    # Add the following configuration
    export SCALA_HOME=/usr/install/bigdata/scala-2.11.12
    export PATH=$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin
    # Reload the profile
    source /etc/profile
  5. Test the installation

    scala
    Welcome to Scala 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60).

Install Hadoop

  1. Download the software package

    cd /usr/install/bigdata
    wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.1.3/hadoop-3.1.3.tar.gz
  2. Decompress to the specified directory

    sudo tar -zxvf hadoop-3.1.3.tar.gz
  3. Change ownership of the Hadoop directory

    sudo chown -R tdops:users hadoop-3.1.3
  4. Configure environment variables and application configuration

    • Configure hadoop-env.sh
    # Change to the Hadoop configuration directory
    cd /usr/install/bigdata/hadoop-3.1.3/etc/hadoop
    vim hadoop-env.sh
    # Add the JDK home directory
    export JAVA_HOME=/usr/install/jdk1.8.0_60
    export HADOOP_LOG_DIR=/home/tdops/spark/hadoop-3.1.3/logs
    • Configure yarn-env.sh
    vim yarn-env.sh
    # Add the JDK home directory
    export JAVA_HOME=/usr/install/jdk1.8.0_60
    • Configure core-site.xml
    <configuration>
        <property>
            <name>fs.defaultFS</name>
        <!-- hostname of the master -->
            <value>hdfs://ailoan-vip-d-012170.hz.td:9000/</value>
        </property>
        <property>
             <name>hadoop.tmp.dir</name>
             <value>/home/tdops/spark/hadoop-3.1.3/tmp</value>
        </property>
    </configuration>
    
    • Configure hdfs-site.xml
    <configuration>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>ailoan-vip-d-012170.hz.td:50090</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/home/tdops/spark/hadoop-3.1.3/dfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/home/tdops/spark/hadoop-3.1.3/dfs/data</value>
        </property>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
    </configuration>
    
    • Configure mapred-site.xml
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>
    • Configure yarn-site.xml
    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <!-- Temporary workaround for Spark jobs in YARN mode exiting abnormally:
             https://stackoverflow.com/questions/41468833/why-does-spark-exit-with-exitcode-16 -->
        <property>
            <name>yarn.nodemanager.vmem-check-enabled</name>
            <value>false</value>
        </property>
        <property>
            <name>yarn.resourcemanager.address</name>
            <value>ailoan-vip-d-012170.hz.td:8032</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>ailoan-vip-d-012170.hz.td:8030</value>
        </property>
        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>ailoan-vip-d-012170.hz.td:8031</value>
        </property>
        <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>ailoan-vip-d-012170.hz.td:8033</value>
        </property>
        <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>ailoan-vip-d-012170.hz.td:8090</value>
        </property>
    </configuration>
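Before verification, the NameNode has to be formatted and the daemons started; the original skips this step, so the commands below are a sketch based on standard Hadoop 3.x practice (it assumes etc/hadoop/workers lists the three hosts):

```shell
cd /usr/install/bigdata/hadoop-3.1.3
# Format the NameNode once (wipes any existing HDFS metadata)
bin/hdfs namenode -format
# Start the HDFS and YARN daemons across the cluster
sbin/start-dfs.sh
sbin/start-yarn.sh
# jps should now show NameNode/DataNode/ResourceManager/NodeManager
jps
```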
  5. Verifying the Installation

  • Access the YARN web UI: ailoan-vip-d-012170.hz.td:8090
  • Access the Hadoop HDFS web UI: ailoan-vip-d-012170.hz.td:9870/

Install Spark

  1. Download the Spark package

    • Download from the official site: spark.apache.org/downloads.h…
    • Upload it to the prepared servers
    • scp {username}@{local_ip}:"/Users/{username}/Downloads/bigdata software/spark-2.4.6-bin-hadoop2.7.tgz" /usr/install/bigdata
  2. Decompress to the specified directory

    sudo tar -zxvf spark-2.4.6-bin-hadoop2.7.tgz
  3. Modify the owning user and group

    sudo chown -R tdops:users spark-2.4.6-bin-hadoop2.7
  4. Configure environment variables and properties

    • Edit spark-env.sh under $SPARK_HOME/conf
    export SCALA_HOME=/usr/install/bigdata/scala-2.11.12
    export JAVA_HOME=/usr/install/jdk1.8.0_60
    export HADOOP_HOME=/usr/install/bigdata/hadoop-3.1.3
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export SPARK_LOG_DIR=/home/tdops/spark/spark-2.4.6/logs
    SPARK_MASTER_IP=ailoan-vip-d-012170.hz.td
    SPARK_LOCAL_DIRS=/usr/install/bigdata/spark-2.4.6-bin-hadoop2.7
    SPARK_DRIVER_MEMORY=512m
    • Configure the slaves file under $SPARK_HOME/conf
    # Set the two hosts as slaves
    ailoan-vip-d-012171.hz.td
    ailoan-vip-d-010129.hz.td
  5. Start Spark

    cd $SPARK_HOME/sbin
    
    ./start-all.sh
  6. Verifying the Installation

    • ailoan-vip-d-012170.hz.td:8080/
    • Run jps to check that the master node has a Master process and the slave nodes have Worker processes
    • Run the official demo program:
      ./spark-submit --master spark://ailoan-vip-d-012170.hz.td:7077 \
        --class org.apache.spark.examples.SparkPi \
        --deploy-mode cluster \
        file:/tmp/spark-examples_2.11-2.4.6.jar
    • Check the Spark execution result on the web UI
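Besides the browser, the standalone master serves a JSON status endpoint on the same web UI port, which is handy for scripted checks:

```shell
# Returns workers, core/memory totals, and active/completed applications
curl -s http://ailoan-vip-d-012170.hz.td:8080/json
```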

Install livy

  1. Download the software package

    # Download directly with wget
    wget https://www.apache.org/dyn/closer.lua/incubator/livy/0.7.0-incubating/apache-livy-0.7.0-incubating-bin.zip
  2. Decompress to the specified directory

    sudo yum install unzip
    unzip apache-livy-0.7.0-incubating-bin.zip
    # Rename the extracted directory
    mv apache-livy-0.7.0-incubating-bin apache-livy-0.7.0
  3. Modifying configuration files

    • Add and configure livy-env.sh
    # Create livy-env.sh from the template
    cp livy-env.sh.template livy-env.sh
    # Add the following
    sudo vim livy-env.sh
    JAVA_HOME=/usr/install/jdk1.8.0_60
    HADOOP_CONF_DIR=/usr/install/bigdata/hadoop-3.1.3/etc/hadoop
    SPARK_HOME=/usr/install/bigdata/spark-2.4.6-bin-hadoop2.7
    LIVY_LOG_DIR=/home/tdops/spark/livy/logs
    • Add and configure livy.conf
    # Create livy.conf from the template
    cp livy.conf.template livy.conf
    # Edit the configuration file
    livy.spark.deploy-mode = cluster
    livy.spark.master = spark://ailoan-vip-d-012170.hz.td:7077
    livy.file.local-dir-whitelist = /tmp
  4. Start Livy

    cd $LIVY_HOME/bin
    
    ./livy-server start
  5. View and verify Livy

    • Open ailoan-vip-d-012170.hz.td:8998
    • Execute a demo using Postman or Java code
    package cn.xxx.yuntu.common.util.livy.core;
    
    
    import cn.xxx.yuntu.common.util.dto.ApiResult;
    import cn.xxx.yuntu.common.util.livy.vo.LivyArg;
    import cn.xxx.yuntu.common.util.livy.vo.LivyStatus;
    import cn.xxx.yuntu.common.util.livy.vo.LivyResult;
    import cn.xxx.yuntu.common.util.util.HttpUtil;
    import cn.xxx.yuntu.common.util.util.LogUtil;
    import com.alibaba.fastjson.JSON;
    import com.alibaba.fastjson.JSONArray;
    import com.alibaba.fastjson.JSONObject;
    import org.apache.commons.lang3.StringUtils;
    
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    
    /**
     * @author li.minqiang
     * @date 2019/12/5
     */
    public class LivyClient {
    
        public static final String DELETED = "deleted";
        public static final String LIVY_BATCH_URI = "%s/batches/%s";
        public static final String MSG = "msg";
        public static final String NOT_FOUND = "not found";
        public static final String SESSION = "Session";
        private final static String STARTING = "starting";
        private LivyArg livyArg;
    
        private LivyClient() {
        }

        public static LivyClient getInstance(LivyArg livyArg) {
            LivyClient livyClient = new LivyClient();
            livyClient.setLivyArg(livyArg);
            return livyClient;
        }
    
    
        public ApiResult submitSparkJar() {
            JSONObject data = new JSONObject();
            data.put("file", livyArg.getJarPath());
            data.put("className", livyArg.getClassName());
            data.put("name", "testLivy" + System.currentTimeMillis());
            data.put("executorCores", livyArg.getExecutorCores());
            data.put("executorMemory", livyArg.getExecutorMemory());
            data.put("driverCores", livyArg.getDriverCores());
            data.put("driverMemory", livyArg.getDriverMemory());
            data.put("numExecutors", livyArg.getNumExecutors());
            data.put("conf", livyArg.getConf());
            data.put("args", livyArg.getArgs());
    
            ApiResult apiResult = HttpUtil.postJson(String.format("%s/batches", livyArg.getLivyServer()), data);
    
            JSONObject obj = (JSONObject) apiResult.getResult();
    
            if (!apiResult.isSuccess() || StringUtils.isEmpty(obj.getString("state"))) {
                LogUtil.error("make livy request error:{}", JSON.toJSONString(apiResult));
                return ApiResult.failure("Failed to submit livy task");
            }
    
            LogUtil.info("livy submit result:{}", JSON.toJSONString(apiResult, true));
            return ApiResult.successWithResult(obj.getString("id"));
        }
    
        private String getLivyUrl() {
            return String.format("%s/ui/batch/%s/log", livyArg.getLivyServer(), livyArg.getTaskId());
        }
    
        private List<String> makeListLogs(JSONArray logs) {
            List<String> mlogs = new ArrayList<String>();
            if (logs == null) {
                return mlogs;
            }
            for (int i = 0; i < logs.size(); i++) {
                mlogs.add(logs.getString(i));
            }
            return mlogs;
        }
    
        private List<String> getLivyServerLogs() {
            String url = String.format("%s/batches/%s/log?size=-1", livyArg.getLivyServer(), livyArg.getTaskId());
            ApiResult apiResult = HttpUtil.get(url);
            if (apiResult.isSuccess()) {
                JSONObject r = (JSONObject) apiResult.getResult();
                return makeListLogs(r.getJSONArray("log"));
            }
            return new ArrayList<String>();
        }
    
        public LivyArg getLivyArg() {
            return livyArg;
        }
    
        public void setLivyArg(LivyArg livyArg) {
            this.livyArg = livyArg;
        }
    
        public static void main(String[] args) {
    // JavaWordCount();
            SparkPi();
        }
    
        public static void SparkPi() {
            LivyClient livyClient = new LivyClient();
            LivyArg livyArg = new LivyArg();
            livyArg.setLivyServer("http://ailoan-vip-d-012170.hz.td:8998");
            livyArg.setJarPath("hdfs://ailoan-vip-d-012170.hz.td:9000/example/spark-examples_2.11-2.4.6.jar");
            livyArg.setClassName("org.apache.spark.examples.SparkPi");
            livyArg.setExecutorCores(1);
            livyArg.setDriverCores(1);
            livyArg.setExecutorMemory("512M");
            livyArg.setDriverMemory("512M");
    
            livyClient.setLivyArg(livyArg);
    
            ApiResult apiResult = livyClient.submitSparkJar();
            System.out.println(apiResult.getCode() + "-" + apiResult.getReason());
            System.out.println(apiResult);
        }
    
        public static void JavaWordCount() {
            LivyClient livyClient = new LivyClient();
            LivyArg livyArg = new LivyArg();
            livyArg.setLivyServer("http://ailoan-vip-d-012170.hz.td:8998");
            livyArg.setJarPath("hdfs://ailoan-vip-d-012170.hz.td:9000/example/spark-examples_2.11-2.4.6.jar");
            livyArg.setClassName("org.apache.spark.examples.JavaWordCount");
            livyArg.setExecutorCores(1);
            livyArg.setDriverCores(1);
            livyArg.setExecutorMemory("512M");
            livyArg.setDriverMemory("512M");
            List<String> args = new ArrayList<>();
            args.add("hdfs://ailoan-vip-d-012170.hz.td:9000/example/wordCount.txt");
            livyArg.setArgs(args);
    
            livyClient.setLivyArg(livyArg);
    
            ApiResult apiResult = livyClient.submitSparkJar();
            System.out.println(apiResult.getCode() + "-" + apiResult.getReason());
            System.out.println(apiResult);
        }
    }
    • Similarly, view the submission records and results on the Spark web UI
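The Postman option maps onto Livy's REST batch API, which is also what the Java client above calls over HTTP; a curl sketch with the same SparkPi payload:

```shell
# Submit a batch; the response contains the batch id and state
curl -s -X POST http://ailoan-vip-d-012170.hz.td:8998/batches \
  -H 'Content-Type: application/json' \
  -d '{"file": "hdfs://ailoan-vip-d-012170.hz.td:9000/example/spark-examples_2.11-2.4.6.jar",
       "className": "org.apache.spark.examples.SparkPi",
       "executorCores": 1, "driverCores": 1,
       "executorMemory": "512M", "driverMemory": "512M"}'

# Poll state and fetch logs (replace 0 with the returned id)
curl -s http://ailoan-vip-d-012170.hz.td:8998/batches/0
curl -s 'http://ailoan-vip-d-012170.hz.td:8998/batches/0/log?size=-1'
```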