The commonly used options:
- --master
- --deploy-mode
- --class
- --name
- --jars
- --conf
- --driver-memory
- --driver-class-path
- --executor-memory
- --driver-cores
- --executor-cores
- --queue
- --num-executors
All of the options:
- --deploy-mode client/cluster
  - client: the driver runs on the machine that submits the job. If that machine or its shell dies, the application hangs.
  - cluster: the driver is started inside the ApplicationMaster allocated by YARN and interacts with the executors from there.
- --master
  - Specifies the cluster manager that Spark runs on.
  - Example for Spark on YARN: yarn
- --class CLASS_NAME
  - CLASS_NAME: fully qualified name of the class that contains the main method
  - Example: com.xinxing.examples.NiuBi
- --name NAME
  - NAME: the application name displayed on the YARN Applications UI page. The default value is CLASS_NAME.
- --jars JARS
  - JARS: the JAR files your program depends on. Separate multiple JARs with commas (,).
  - Example: mysql.jar,scala-jdbc.jar
- --packages groupId:artifactId:version
  - groupId:artifactId:version: Maven coordinates (GAV) of the dependencies. Separate multiple coordinates with commas (,).
  - Example: mysql:mysql-connector-java:5.1.27
- --repositories http:/ww.xxx.xx/xxxx
  - http:/ww.xxx.xx/xxxx: the Maven repository to resolve packages from. The Aliyun Maven repository is commonly used. Use together with --packages.
  - Example: maven.aliyun.com/nexus/conte…
- --exclude-packages groupId:artifactId
  - Use together with --packages to exclude dependencies and resolve JAR conflicts.
- --files FILES
  - FILES: similar to --jars. The configuration files your job needs are shipped with the application and can be read via SparkFiles.get(fileName). Separate multiple files with commas (,).
  - Example: /home/configuration/app.conf,/home/configuration/apps.conf
- --conf PROP=VALUE
  - Spark configuration properties that an individual job needs to set separately go here. Ten properties require ten --conf flags.
  - Example: spark.shuffle.registration.timeout=2000
- --properties-file
  - Path to the properties file to load. If not specified, the default is conf/spark-defaults.conf.
- --driver-memory MEM
  - MEM: memory for the driver. The default is 1 GB.
  - Example: 4g
- --driver-java-options
  - JVM options for the driver
  - Example: JAVA_TOOL_OPTIONS="-Xmx512m -Xms64m"
- --driver-library-path
  - Directories (directories, not files!) containing the libraries the driver depends on.
- --driver-class-path
  - Extra JARs the driver depends on. Separate multiple entries with colons (:). Note: JARs passed via --jars are already added to this classpath, so don't add them again.
- --executor-memory MEM
  - MEM: memory per executor. The default is 1 GB.
  - Example: 4g
- --proxy-user NAME
  - NAME: the user to impersonate when submitting. If you specify root, the program runs as root, so make sure you have permission to act as root.
- --verbose
  - Prints extra debug information. Useful when the program behaves abnormally.
- --version
  - Prints the Spark version
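
To see how these options fit together, here is a minimal sketch that assembles one spark-submit command line from the example values above and prints it instead of running it, so it works without a cluster. The application JAR app.jar, the name demo-job, and the second --conf property are placeholders chosen for illustration, not values from a real job.

```shell
#!/bin/sh
# Dry run: build the spark-submit command line and print it instead of executing it.
# app.jar and the name "demo-job" are hypothetical placeholders.
build_submit_cmd() {
  echo spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --class com.xinxing.examples.NiuBi \
    --name demo-job \
    --jars mysql.jar,scala-jdbc.jar \
    --files /home/configuration/app.conf \
    --conf spark.shuffle.registration.timeout=2000 \
    --conf spark.sql.shuffle.partitions=200 \
    --driver-memory 4g \
    --executor-memory 4g \
    app.jar
}

build_submit_cmd
```

Drop the echo to actually submit. Note that each property gets its own --conf flag, while --jars takes a single comma-separated value.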
Options specific to YARN:
- --queue QUEUE_NAME
  - QUEUE_NAME: the YARN queue that the application is submitted to. The default is default.
- --executor-cores NUM
  - NUM: number of virtual cores per executor. The default value is 1.
  - Example: 4
- --num-executors NUM
  - NUM: number of executors to launch. The default is 2.
  - Example: 4
- --archives ARCHIVES
  - No, I don’t know
- --principal PRINCIPAL
  - No, I don’t know
- --keytab KEYTAB
  - No, I don’t know
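
The YARN sizing options multiply together, which is easy to overlook. The sketch below prints a command using the example values above and works out roughly what it asks the queue for. The JAR name app.jar is a placeholder, and the totals cover executors only, ignoring the driver and any memory overhead YARN adds per container.

```shell
#!/bin/sh
# Dry run: print a YARN-specific spark-submit command and the resources it asks for.
# app.jar is a hypothetical placeholder; the numbers are the example values above.
QUEUE=default
NUM_EXECUTORS=4
EXECUTOR_CORES=4
EXECUTOR_MEMORY_GB=4

echo spark-submit \
  --master yarn \
  --queue "$QUEUE" \
  --num-executors "$NUM_EXECUTORS" \
  --executor-cores "$EXECUTOR_CORES" \
  --executor-memory "${EXECUTOR_MEMORY_GB}g" \
  app.jar

# Totals requested for executors only (driver and per-container overhead not included).
TOTAL_CORES=$((NUM_EXECUTORS * EXECUTOR_CORES))
TOTAL_MEMORY_GB=$((NUM_EXECUTORS * EXECUTOR_MEMORY_GB))
echo "executor cores: $TOTAL_CORES, executor memory: ${TOTAL_MEMORY_GB}g"
```

If the queue cannot provide that many cores or that much memory, the application may wait for resources or start with fewer executors than requested.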