Detailed Spark installation and configuration tutorial under CentOS 7

**Environment description:**

Operating system: CentOS 7, 64-bit, 3 machines
centos7-1 192.168.190.130 master
centos7-2 192.168.190.129 slave1
centos7-3 192.168.190.131 slave2

To install Spark, the following must be installed as well:

JDK and Scala

  1. Install the JDK and configure the JDK environment variables

How to install and configure the JDK is not covered here; search for it yourself if needed.
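For reference, the JDK environment variables in /etc/profile typically look like the following sketch (the JAVA_HOME path here reuses the OpenJDK path that appears later in spark-env.sh; adjust it to your own installation):

 # vim /etc/profile
 export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.x86_64
 export PATH=$PATH:$JAVA_HOME/bin
 # source /etc/profile   //Make it effective immediately
 # java -version      //Check whether the JDK is installed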

  2. Install Scala

Download the Scala installation package from https://www.scala-lang.org/download/ (choose a version that meets your requirements), upload it to the server with a client tool, and unpack it:

 # tar -zxvf scala-2.13.0-M4.tgz
 Then modify the /etc/profile file and add the following content:
 export SCALA_HOME=$WORK_SPACE/scala-2.13.0-M4
 export PATH=$PATH:$SCALA_HOME/bin
 # source /etc/profile   //Make it effective immediately
 # scala -version      //Check whether scala is installed

**3. Install Spark**

Spark download address: http://spark.apache.org/downloads.html

Note: Different packages are available for download; choose the one you need to download and install.

Source code: the Spark source code, which must be compiled before it can be used; using Scala 2.11 also requires building from source
Pre-build with user-provided Hadoop: the "Hadoop free" version, usable with any Hadoop version
Pre-build for Hadoop 2.7 and later: a pre-built version based on Hadoop 2.7, which should match the Hadoop version installed on the machine (a Hadoop 2.6 build is also available). Since the Hadoop installed here is 3.1.0, the "for Hadoop 2.7 and later" version is used directly.
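For example, the pre-built package used below can be fetched directly on the server; the URL is an assumption based on the standard Apache archive layout and may need adjusting to the version you chose:

 # wget https://archive.apache.org/dist/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz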

Note: For the installation of Hadoop, please see my previous blog post; it is not repeated here.

Spark installation and configuration under CentOS 7:
# mkdir /usr/spark
# cd /usr/spark
# tar -zxvf spark-2.3.1-bin-hadoop2.7.tgz
# vim /etc/profile
# Add the Spark environment variables: add them to PATH and export them
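# A sketch of the lines to add (assuming Spark is unpacked under /usr/spark as above):
export SPARK_HOME=/usr/spark/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin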
# source /etc/profile
# Enter the conf directory, copy spark-env.sh.template and rename the copy to spark-env.sh
# cd /usr/spark/spark-2.3.1-bin-hadoop2.7/conf
# cp spark-env.sh.template spark-env.sh
# vim spark-env.sh
export SCALA_HOME=/usr/scala/scala-2.13.0-M4
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.x86_64
export HADOOP_HOME=/usr/hadoop/hadoop-3.1.0
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_HOME=/usr/spark/spark-2.3.1-bin-hadoop2.7
export SPARK_MASTER_IP=master
export SPARK_EXECUTOR_MEMORY=1G
# In the conf directory, copy slaves.template and rename the copy to slaves
# cd /usr/spark/spark-2.3.1-bin-hadoop2.7/conf
# cp slaves.template slaves
# vim slaves
# Add the node hostnames to the slaves file:
master   //hostname of centos7-1
slave1   //hostname of centos7-2
slave2   //hostname of centos7-3

Start Spark

# Start the Hadoop cluster before starting Spark
# cd /usr/hadoop/hadoop-3.1.0/
# sbin/start-all.sh
# jps   //Check that the Hadoop processes have started
# cd /usr/spark/spark-2.3.1-bin-hadoop2.7
# sbin/start-all.sh
Note: Spark must also be installed on the slave1 and slave2 nodes in the same way as above, or the whole directory can be copied directly to slave1 and slave2:
# scp -r /usr/spark root@slave1ip:/usr/spark

The startup information is as follows:

starting org.apache.spark.deploy.master.Master, logging to /usr/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
slave2: starting org.apache.spark.deploy.worker.Worker, logging to /usr/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.com.cn.out
slave1: starting org.apache.spark.deploy.worker.Worker, logging to /usr/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.com.cn.out
master: starting org.apache.spark.deploy.worker.Worker, logging to /usr/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-master.out
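After start-up, jps can be used again to confirm the processes; since master is also listed in the slaves file, it runs both a Master and a Worker:

# jps   //master node: Master and Worker; slave1/slave2 nodes: Worker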

Test the Spark cluster:

Open the Spark cluster web UI of the master node in a browser: http://192.168.190.130:8080/
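Besides checking the web UI, a quick functional test is to submit the bundled SparkPi example to the cluster. The sketch below assumes the standalone master listens on the default port 7077 under the hostname configured above, and uses a wildcard because the exact examples jar name depends on the build:

 # cd /usr/spark/spark-2.3.1-bin-hadoop2.7
 # bin/spark-submit --master spark://master:7077 --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_*.jar 10

If the cluster is working, the driver output ends with a line similar to "Pi is roughly 3.14...".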

Summary

The above is the detailed Spark installation and configuration tutorial under CentOS 7. I hope it is helpful to you. If you have any questions, please leave a message and I will reply in time!
