Spark Stand-alone Cluster as a systemd Service (Ubuntu 16.04/CentOS 7)

Introduction

Once you have completed a stand-alone Spark cluster installation, you can start and stop the cluster with the commands below.

$SPARK_HOME/sbin/start-all.sh
$SPARK_HOME/sbin/stop-all.sh

However, this is not feasible for a production-level system. Typically, you will want the Spark cluster to

  • start whenever your system starts / reboots
  • automatically restart in case of failures

This can be achieved by adding Spark to Linux's init system. If you are not familiar with Linux's new init system, systemd, please check one of the reference links below.
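This guide assumes your distribution uses systemd as its init process (true of Ubuntu 16.04 and CentOS 7). A quick way to check on any Linux host is to look at what is running as PID 1:

```shell
# Print the name of the process running as PID 1.
# On a systemd-based distribution this prints "systemd";
# on older distributions it may print "init".
cat /proc/1/comm
```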

Spark systemd unit file

All systemd services are driven by a systemd unit file. With your login user, create a systemd unit file in /etc/systemd/system:

cd /etc/systemd/system
sudo nano spark.service

Put the following in the spark.service unit file.

[Unit]
Description=Apache Spark Master and Slave Servers
After=network.target
After=systemd-user-sessions.service
After=network-online.target

[Service]
User=spark
Type=forking
ExecStart=/opt/spark-1.6.1-bin-hadoop2.6/sbin/start-all.sh
ExecStop=/opt/spark-1.6.1-bin-hadoop2.6/sbin/stop-all.sh
TimeoutSec=30
Restart=on-failure
RestartSec=30
StartLimitInterval=350
StartLimitBurst=10

[Install]
WantedBy=multi-user.target

As you can see, the unit file uses Spark's start-all.sh and stop-all.sh scripts to start and stop the service. The Restart=on-failure and RestartSec directives provide the automatic restart, while StartLimitInterval and StartLimitBurst cap how often restarts may be attempted.
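Note that the unit file runs Spark as a dedicated spark user (User=spark). If that account does not already exist on your host, a minimal sketch for creating it, assuming Spark is installed in /opt/spark-1.6.1-bin-hadoop2.6 as in the unit file:

```shell
# Create a "spark" system account with a home directory and no login shell.
sudo useradd -r -m -s /bin/false spark

# Give it ownership of the Spark installation so the start scripts
# can write logs and PID files.
sudo chown -R spark:spark /opt/spark-1.6.1-bin-hadoop2.6
```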

After creating or modifying the unit file, reload the systemd daemon so it picks up your changes:

sudo systemctl daemon-reload

From now on, if you want to start or stop the Spark stand-alone cluster manually, use the commands below.

sudo systemctl start spark.service
sudo systemctl stop spark.service
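After starting the service, one quick sanity check is to probe the master's web UI, assuming the default port (8080) has not been changed in your Spark configuration:

```shell
# The stand-alone master serves a web UI on port 8080 by default.
# An HTTP 200 response code indicates the master is up.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080
```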

Once the service has started, check /var/spark/logs for the logs, and execute the command below to check the status of the systemd service.

sudo systemctl status spark.service
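systemd also captures the service's stdout and stderr in the journal, which is handy when the start scripts fail before writing anything to the log directory:

```shell
# Show journal entries for the spark unit and follow new output (-f).
sudo journalctl -u spark.service -f
```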

By default, systemd unit files are not started automatically at boot. To configure this functionality, you need to "enable" the unit. To enable the spark service to start automatically at boot, type:

sudo systemctl enable spark.service
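You can confirm the result with `systemctl is-enabled`, which reports whether the unit is linked into the boot targets:

```shell
# Prints "enabled" if spark.service will start automatically at boot.
systemctl is-enabled spark.service
```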

With this, you have configured a Spark stand-alone cluster as a systemd service with automatic restart. Hope you enjoyed the post.

Reference

https://www.digitalocean.com/community/tutorials/systemd-essentials-working-with-services-units-and-the-journal

https://github.com/thiagowfx/PKGBUILDs/blob/master/apache-spark/apache-spark-standalone.service

 
