Preconditions
- Spark is installed.
- Anaconda is installed.
To write Scala and Spark code in a Jupyter Notebook, we need to install two Jupyter kernels: Jupyter-Spark and Jupyter-Scala. The order below follows my installation order: Jupyter-Spark first, then Jupyter-Scala.
Getting started
Use Apache Toree to install the Scala kernel for the notebook
Step 1: Install Toree
Download and install Toree with pip:
pip install toree
Step 2: Install Jupyter-Spark and connect it to Spark
jupyter toree install --spark_opts='--master=spark://localhost:7077' --user --kernel_name=Spark2.3.2 --spark_home=/home/fonttian/spark-2.3.2-bin-hadoop2.7
- --master: the Spark master address
- --spark_home: the Spark installation (download) directory
- --kernel_name: a name for the kernel; the Spark version it should reflect can be checked with spark-shell
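For example, to check the Spark version before choosing a kernel name (the path below matches the install command above; adjust it to your machine):

# print the Spark version; the path is the one used in the install command
/home/fonttian/spark-2.3.2-bin-hadoop2.7/bin/spark-shell --version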
Step 3: Check the Jupyter kernels; the new kernel should appear when you create a notebook.
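Listing the installed kernels shows whether the new one was registered; the output below is only illustrative, and exact kernel names and paths vary by machine:

jupyter kernelspec list
# Available kernels (illustrative):
#   python3       /usr/local/share/jupyter/kernels/python3
#   spark2.3.2    ~/.local/share/jupyter/kernels/spark2.3.2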
Step 4: Create a Scala project and run it
In Jupyter, you can run Scala statements directly, as in a script.
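For instance, a minimal cell like the following evaluates statement by statement:

// each statement is evaluated immediately, script-style
val greeting = "Hello from a Jupyter Scala cell"
println(greeting)
val squares = (1 to 5).map(n => n * n)
println(squares.mkString(", "))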
You can also define an object and run it using the main function.
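A minimal sketch of that pattern (the object name HelloApp is arbitrary):

// define an object with a main method in a cell...
object HelloApp {
  def main(args: Array[String]): Unit = {
    println("Hello from main")
  }
}
// ...then call it explicitly, since Jupyter does not invoke main for you:
HelloApp.main(Array())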
Notes
If you use Jupyter-Spark to run Scala, Jupyter will start Spark by default even if you never use Spark. If you just want to practice Scala, it is better to create the project with Jupyter-Scala instead. The following shows how to install Jupyter-Scala.
Installing the Scala kernel
If you are not yet familiar with Scala, you may also want to install a Scala kernel in Jupyter for practice (for regular development you would normally use IDEA).
Download jupyter-scala-cli
Go to https://oss.sonatype.org/content/repositories/snapshots/com/github/alexarchambault/jupyter/ and download the jupyter-scala-cli archive.
I am using the latest version here, 2.11.6.
Add the kernel
First decompress the archive, then run the installer it contains (the original post illustrates this step with a screenshot).
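A sketch of that step, assuming the downloaded archive is named jupyter-scala-cli_2.11.6-0.2.0-SNAPSHOT.zip (the file and directory names here are assumptions; use whatever you actually downloaded):

# unpack the archive (names below are assumptions)
unzip jupyter-scala-cli_2.11.6-0.2.0-SNAPSHOT.zip
# run the bundled script, which registers the Scala kernel with Jupyter
./jupyter-scala_2.11.6-0.2.0-SNAPSHOT/bin/jupyter-scala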
Check the kernels
jupyter kernelspec list
Confirm that the newly added Scala kernel appears in the list.
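After installation, the list should contain a new Scala entry, for example (kernel names and paths are illustrative and vary by machine):

# illustrative output of jupyter kernelspec list
#   python3       /usr/local/share/jupyter/kernels/python3
#   spark2.3.2    ~/.local/share/jupyter/kernels/spark2.3.2
#   scala211      ~/.local/share/jupyter/kernels/scala211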
Creating a Scala project
If you want to define an object and run it, call objectName.main(Array()) as described above.