For my own learning I need to figure out how to install Spark. Since my budget is limited and I don't have a cluster yet, let's try the stand-alone version of Spark first. If I expand to a cluster later, I will update this with a cluster installation tutorial.
All of the following operations are performed as the root user.
You need to install Scala before installing Spark, because Spark depends on Scala. So let's install Scala first: download the Scala tarball from the official Scala website.
Then we upload the tarball to the CentOS server; how to upload it will not be detailed here.
We put the tarball in the /opt/scala directory and then extract it.
Extraction command:
tar -xvf scala-2.12.2.tgz
Add environment variables in /etc/profile: add export SCALA_HOME=/opt/scala/scala-2.12.2 and add ${SCALA_HOME}/bin to the PATH.
Below are my environment variables.
export JAVA_HOME=/usr/local/java/jdk1.8.0_221
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export SCALA_HOME=/opt/scala/scala-2.12.2
export PATH=${JAVA_HOME}/bin:${SCALA_HOME}/bin:$PATH
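The changes to /etc/profile do not take effect in the current shell until the file is reloaded, so reload it first (a standard step on CentOS with bash):
source /etc/profile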
Then we can verify the Scala installation:
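What I run here is just the version command; it should print something along the lines of "Scala code runner version 2.12.2":
scala -version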
At this point, the installation of Scala is complete, and the next step is the installation of Spark~~~
Same as with Scala, let's download the Spark package first and then upload it to the server.
In the same way, we put the tarball in the /opt/spark directory and then extract it.
Extraction command:
tar -xvf spark-2.4.3-bin-hadoop2.7.tgz
With only small differences from before, add environment variables in /etc/profile: add export SPARK_HOME=/opt/spark/spark-2.4.3-bin-hadoop2.7 and add ${SPARK_HOME}/bin to the PATH.
Below are my environment variables.
export JAVA_HOME=/usr/local/java/jdk1.8.0_221
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export SCALA_HOME=/opt/scala/scala-2.12.2
export SPARK_HOME=/opt/spark/spark-2.4.3-bin-hadoop2.7
export PATH=${JAVA_HOME}/bin:${SPARK_HOME}/bin:${SCALA_HOME}/bin:$PATH
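Reload the profile again so the Spark variables take effect; as a quick sanity check, the version flag of spark-submit should print the Spark 2.4.3 banner:
source /etc/profile
spark-submit --version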
First enter the conf directory of the extracted files, which is /opt/spark/spark-2.4.3-bin-hadoop2.7/conf/. There we can see a template file, of which we make a copy.
cp spark-env.sh.template spark-env.sh
We edit the copied file and add the following content (learn is my machine's hostname, used here as the master address; substitute your own hostname or IP):
export JAVA_HOME=/usr/local/java/jdk1.8.0_221
export SCALA_HOME=/opt/scala/scala-2.12.2
export SPARK_HOME=/opt/spark/spark-2.4.3-bin-hadoop2.7
export SPARK_MASTER_IP=learn
export SPARK_EXECUTOR_MEMORY=1G
Similarly, we make a copy of slaves
cp slaves.template slaves
Edit slaves; the content is just localhost:
localhost
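With the master address and the worker list configured, you can also start the standalone master and worker daemons using the scripts Spark ships under sbin. This step is optional for the quick tests below, but it is a handy check that the configuration is picked up (the master web UI normally listens on port 8080):
cd /opt/spark/spark-2.4.3-bin-hadoop2.7
./sbin/start-all.sh
jps
The jps output should list a Master and a Worker process; ./sbin/stop-all.sh stops them again.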
Then we can test. In the /opt/spark/spark-2.4.3-bin-hadoop2.7 directory, execute:
./bin/run-example SparkPi 10
Here we can see that the execution has been successful.
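If everything is set up correctly, somewhere in the fairly verbose log output there should be a line reporting the estimate, roughly like:
Pi is roughly 3.14...
The exact digits vary from run to run, since SparkPi estimates Pi by random sampling.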
Likewise, still in the /opt/spark/spark-2.4.3-bin-hadoop2.7 directory, execute:
./bin/spark-shell
We can see the following results:
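Once the scala> prompt comes up, you can run a tiny job to confirm the shell really works; any small computation will do, for example something like:
scala> sc.parallelize(1 to 100).reduce(_ + _)
res0: Int = 5050
Type :quit to leave the shell.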
So far, the stand-alone version of Spark is installed~~~