Spark Standalone mode and Yarn mode

Many tutorials show how to deploy either Standalone mode or YARN mode, and each deployment mode ends up with its own copy of the Spark folder (e.g. spark-local, spark-YARN), which wastes disk space. This tutorial configures both modes in one installation and is intended for readers with some experience in big data deployment. Notes: 1. The workers, spark-env.sh, and spark-defaults.conf files are all in Spark's conf folder. 2. Applications run in either Standalone mode or YARN mode both appear on the history server at http://hd1:18080.

Profile configuration

/etc/profile

# Needed for YARN mode; this would otherwise be set in spark-env.sh
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
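
To confirm the variable took effect after reloading /etc/profile, a small check along these lines may help. It is only a sketch: the function name check_yarn_conf_dir is made up for this example, and it assumes yarn-site.xml sits directly in the directory, which is the usual layout under $HADOOP_HOME/etc/hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark or Hadoop).
# Returns 0 if the given directory looks like a Hadoop client config directory.
check_yarn_conf_dir() {
  local dir="$1"
  if [ -z "$dir" ]; then
    echo "YARN_CONF_DIR is not set" >&2
    return 1
  fi
  if [ ! -f "$dir/yarn-site.xml" ]; then
    echo "no yarn-site.xml in $dir" >&2
    return 1
  fi
  echo "ok: $dir"
}

# Usage (after editing /etc/profile):
#   source /etc/profile && check_yarn_conf_dir "$YARN_CONF_DIR"
```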

workers

# Standalone mode configuration: the three hosts that Standalone mode uses as workers. YARN is deployed separately.
hd1
hd2
hd3
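
If the conf folder has to be copied to each of these hosts, the host list can be read back out of workers. The following is a minimal sketch, not part of Spark: list_workers and sync_conf_dry_run are hypothetical helper names, the /opt/soft/spark default for SPARK_HOME is an assumption, and the rsync commands are only printed, never run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the host names in a workers file,
# skipping blank lines and anything after a '#'.
list_workers() {
  local file="$1"
  sed -e 's/#.*$//' \
      -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' \
      -e '/^$/d' "$file"
}

# Hypothetical helper: print (do not execute) one rsync command per worker.
# The SPARK_HOME default below is an assumption; adjust it to your install path.
sync_conf_dry_run() {
  local file="$1" host
  local spark_home="${SPARK_HOME:-/opt/soft/spark}"
  list_workers "$file" | while IFS= read -r host; do
    echo "rsync -a $spark_home/conf/ $host:$spark_home/conf/"
  done
}

# Usage:
#   sync_conf_dry_run "$SPARK_HOME/conf/workers"
```

Printing the commands first makes it easy to review them before running anything against the cluster.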

spark-env.sh

# Common configuration
export JAVA_HOME=/opt/soft/java
# Standalone mode (single master); leave these commented out when using Standalone HA mode
#SPARK_MASTER_HOST=hd1
#SPARK_MASTER_PORT=7077
# Standalone HA mode
SPARK_MASTER_WEBUI_PORT=8082
export SPARK_DAEMON_JAVA_OPTS="
-Dspark.deploy.recoveryMode=ZOOKEEPER
-Dspark.deploy.zookeeper.url=hd1,hd2,hd3
-Dspark.deploy.zookeeper.dir=/spark"
# Common history server configuration
export SPARK_HISTORY_OPTS="
-Dspark.history.ui.port=18080
-Dspark.history.fs.logDirectory=hdfs://hd1:8020/spark/history
-Dspark.history.retainedApplications=30"


Create the history server's log directory on HDFS (if it does not already exist)

hdfs dfs -mkdir -p /spark/history
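
If this step is scripted, the directory can be checked before it is created, so the output says which case applied. A minimal sketch, assuming the hdfs command is on the PATH; ensure_history_dir is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the event log directory on HDFS only if missing.
# Defaults to /spark/history, the path used in this tutorial.
ensure_history_dir() {
  local dir="${1:-/spark/history}"
  if hdfs dfs -test -d "$dir"; then
    echo "exists: $dir"
  else
    hdfs dfs -mkdir -p "$dir" && echo "created: $dir"
  fi
}

# Usage:
#   ensure_history_dir /spark/history
```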


spark-defaults.conf

# Common history server configuration
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://hd1:8020/spark/history
# With the two settings below, clicking History for an application in the YARN UI
# (http://hd2:8088/cluster/apps) jumps to that application on the Spark history server (port 18080)
spark.yarn.historyServer.address=hd1:18080
spark.history.ui.port=18080
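
The directory that applications write event logs to (spark.eventLog.dir) and the directory the history server reads (spark.history.fs.logDirectory in spark-env.sh) need to be the same, otherwise finished applications will not show up. The sketch below compares the two. All three function names are hypothetical, and get_java_opt assumes one -D option per line, as in the spark-env.sh shown earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read one property from a spark-defaults.conf style file.
# Key and value may be separated by whitespace or by '='.
get_spark_conf() {
  local key="$1" file="$2"
  awk -v k="$key" '
    $0 !~ /^[ \t]*#/ {
      line = $0
      sub(/^[ \t]+/, "", line)
      n = length(k)
      if (substr(line, 1, n) == k && substr(line, n + 1, 1) ~ /^[ \t=]$/) {
        val = substr(line, n + 1)
        sub(/^[ \t]*=?[ \t]*/, "", val)
        sub(/[ \t]+$/, "", val)
        found = val
      }
    }
    END { if (found != "") print found }
  ' "$file"
}

# Hypothetical helper: read a -Dname=value JVM option from spark-env.sh.
# Assumes each option is on its own line.
get_java_opt() {
  local name="$1" file="$2"
  sed -n "s/^[[:space:]]*-D${name}=\([^\" ]*\).*/\1/p" "$file" | tail -n 1
}

# Compare the directory applications write to with the one the history server reads.
check_history_dir_match() {
  local env_file="$1" defaults_file="$2" written read_from
  written=$(get_spark_conf spark.eventLog.dir "$defaults_file")
  read_from=$(get_java_opt spark.history.fs.logDirectory "$env_file")
  if [ -n "$written" ] && [ "$written" = "$read_from" ]; then
    echo "match: $written"
  else
    echo "mismatch: eventLog.dir='$written' logDirectory='$read_from'" >&2
    return 1
  fi
}

# Usage (from the Spark installation folder):
#   check_history_dir_match conf/spark-env.sh conf/spark-defaults.conf
```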

Validation

Start the related components, then run the following commands

# Standalone (HA) mode
bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://hd1:7077,hd2:7077 \
./examples/jars/spark-examples_2.12-3.1.1.jar \
10

# YARN mode
bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode cluster \
./examples/jars/spark-examples_2.12-3.1.1.jar \
10

Check the results. For Standalone (HA) mode, check that http://hd1:8082 (Spark master UI) and http://hd1:18080/ (history server) show the application. For YARN mode, check that http://hd2:8088/cluster/apps (YARN UI) and http://hd1:18080/ (history server) show the application, and that clicking History in the YARN UI jumps to the history server.
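
A quick reachability check of these pages can also be done from a terminal by asking each one for its HTTP status. This is a rough sketch using curl; http_status and check_uis are hypothetical helper names, and a 2xx or 3xx response only shows that the page is reachable, not that the application is listed on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the HTTP status code for a URL.
# curl prints 000 when the host cannot be reached.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Hypothetical helper: print "<status> <url>" for each URL and
# return non-zero if any of them is not a 2xx or 3xx response.
check_uis() {
  local url code rc=0
  for url in "$@"; do
    code=$(http_status "$url") || true
    echo "$code $url"
    case "$code" in
      2*|3*) ;;
      *) rc=1 ;;
    esac
  done
  return "$rc"
}

# Usage with the addresses from this tutorial:
#   check_uis http://hd1:8082 http://hd1:18080/ http://hd2:8088/cluster/apps
```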
