Do You Need Ubuntu Server for Slurm? A Practical Guide

Slurm, originally an acronym for Simple Linux Utility for Resource Management, is an open-source job scheduler for Linux clusters of any size. If you’re exploring high-performance computing (HPC) or need to manage computational tasks efficiently, you might be wondering: do you need Ubuntu Server to run Slurm? The short answer is no. Ubuntu Server is a popular choice, but Slurm runs on many Linux distributions. This guide walks you through installing Slurm on Ubuntu, even without root (sudo) privileges, and discusses the role of Ubuntu Server in this context.

Understanding Slurm and Ubuntu in HPC

Slurm is essential for managing workloads on clusters, allowing users to submit, manage, and monitor jobs across many nodes. It handles resource allocation, job scheduling, and task execution, making it a cornerstone of many HPC environments. Ubuntu Server, on the other hand, is a widely adopted Linux distribution known for its stability, extensive software repositories, and strong community support. It’s a natural fit for server environments and is frequently used in HPC setups.

While Ubuntu Server provides a robust and well-supported platform for Slurm, it’s not strictly mandatory. Slurm is designed to run on various Linux distributions. The choice of Ubuntu Server often comes down to its ease of use, readily available packages, and the perception of stability crucial for server infrastructure. However, for learning, testing, or even running smaller-scale Slurm setups, you can absolutely use other Linux distributions, or even install Slurm locally on an Ubuntu system where you lack administrative rights.

Installing Slurm Locally on Ubuntu Without Sudo

The following steps outline how to install Slurm in your home directory on an Ubuntu system, bypassing the need for sudo privileges. This approach is ideal for personal learning environments or situations where you don’t have root access to a shared server.

Prerequisites: Local Dependency Installation

Without sudo, you cannot install build dependencies such as gcc, make, and munge through apt. If they are not already present on the system, you need to compile them from source and install them under your home directory. Start by creating a local prefix and adding it to your environment:

mkdir -p $HOME/local
echo 'export PATH=$HOME/local/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$HOME/local/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
echo 'export PKG_CONFIG_PATH=$HOME/local/lib/pkgconfig:$PKG_CONFIG_PATH' >> ~/.bashrc
source ~/.bashrc
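
One detail in the exports above: when a variable such as LD_LIBRARY_PATH starts out empty, appending :$LD_LIBRARY_PATH leaves a trailing colon, and the dynamic loader treats an empty entry as the current directory. The sketch below shows one way to avoid that. The name prepend_dir is a hypothetical helper for illustration, not part of any tool mentioned in this guide.

```shell
# prepend_dir DIR LIST: print LIST with DIR added at the front, unless DIR
# is already in LIST. Never produces an empty entry.
# (prepend_dir is a hypothetical helper name used only in this sketch.)
prepend_dir() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;       # already present: unchanged
        "::")       printf '%s\n' "$dir" ;;        # LIST was empty
        *)          printf '%s\n' "$dir:$list" ;;  # normal case: put DIR first
    esac
}

export PATH="$(prepend_dir "$HOME/local/bin" "$PATH")"
export LD_LIBRARY_PATH="$(prepend_dir "$HOME/local/lib" "${LD_LIBRARY_PATH:-}")"
export PKG_CONFIG_PATH="$(prepend_dir "$HOME/local/lib/pkgconfig" "${PKG_CONFIG_PATH:-}")"
```

Placing the function and the three export lines in ~/.bashrc, instead of the echo commands, also keeps the file from accumulating duplicate entries when the setup is repeated.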

For each dependency, you download the source code, configure it with --prefix=$HOME/local, and then run make and make install. For instance, for GCC (note that building GCC itself requires an existing C and C++ compiler on the system):

wget https://ftp.gnu.org/gnu/gcc/gcc-11.2.0/gcc-11.2.0.tar.gz
tar xzf gcc-11.2.0.tar.gz
cd gcc-11.2.0
./contrib/download_prerequisites
mkdir build && cd build
../configure --prefix=$HOME/local --enable-languages=c,c++ --disable-multilib
make -j$(nproc)
make install

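This process is repeated for the remaining dependencies. Note that libmunge-dev, libmunge2, libpam0g-dev, libmysqlclient-dev, libssl-dev, and libncurses5-dev are Ubuntu package names, not separate source projects. The corresponding upstream sources are MUNGE (which provides both the library and its headers), Linux-PAM, the MySQL or MariaDB client library, OpenSSL, and ncurses. Building all of them is complex and time-consuming, so if possible, ask your system administrator to install the packages system-wide instead.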
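
Before building anything, it can be worth checking which tools the system already provides, since shared machines often ship a compiler and make. The following is a minimal sketch. The name missing_tools is a hypothetical helper, and the list of tools passed to it is only an example.

```shell
# missing_tools CMD...: print the name of every command that is not found
# on PATH. (missing_tools is a hypothetical helper name for this sketch.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example list; adjust it to the tools you actually need.
missing_tools gcc make wget tar
```

Anything the function prints has to be built from source or requested from the administrator. An empty result means all listed tools are already available.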

Configuring Munge Locally

Munge is crucial for Slurm authentication. Here’s how to configure it locally:

wget https://github.com/dun/munge/releases/download/munge-0.5.14/munge-0.5.14.tar.xz
tar xf munge-0.5.14.tar.xz
cd munge-0.5.14
./configure --prefix=$HOME/local --sysconfdir=$HOME/local/etc --localstatedir=$HOME/local/var
make
make install

Update your PATH to include Munge binaries:

echo 'export PATH=$HOME/local/sbin:$PATH' >> ~/.bashrc
source ~/.bashrc

Generate the Munge key and set permissions:

dd if=/dev/urandom of=$HOME/local/etc/munge.key bs=1 count=1024
chmod 400 $HOME/local/etc/munge.key
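
Since munged is strict about who can read its key, a quick sanity check of the generated file can save a failed start later. The sketch below is one way to do it. The name check_key is hypothetical, and the 32-byte minimum length is an assumption chosen for illustration, not a value taken from this guide.

```shell
# check_key FILE: succeed only if FILE exists, is at least 32 bytes long
# (assumed minimum for this sketch), and grants nothing to group or others.
# (check_key is a hypothetical helper name.)
check_key() {
    key=$1
    if [ ! -f "$key" ]; then
        echo "missing key file: $key" >&2
        return 1
    fi
    size=$(wc -c < "$key")
    if [ "$size" -lt 32 ]; then
        echo "key is only $size bytes" >&2
        return 1
    fi
    # Characters 5-10 of the ls mode string are the group and other bits.
    perms=$(ls -ld "$key" | cut -c5-10)
    if [ "$perms" != "------" ]; then
        echo "key is accessible to group or others: $perms" >&2
        return 1
    fi
}

# Usage: check_key "$HOME/local/etc/munge.key" && echo "key looks fine"
```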

Create the directories Munge needs and set a few shell variables for the paths used in the start command below:

mkdir -p $HOME/local/etc/munge $HOME/local/var/run/munge $HOME/local/var/log/munge
echo 'export MUNGEUSER=$(whoami)' >> ~/.bashrc
echo 'export MUNGE_PID_FILE=$HOME/local/var/run/munge/munged.pid' >> ~/.bashrc
echo 'export MUNGE_LOG_FILE=$HOME/local/var/log/munge/munged.log' >> ~/.bashrc
source ~/.bashrc

Start the Munge daemon locally; it runs as the user who launches it:

munged --key-file=$HOME/local/etc/munge.key --socket=$HOME/local/var/run/munge/munge.socket.2 --num-threads=2 --pid-file=$MUNGE_PID_FILE --log-file=$MUNGE_LOG_FILE
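
Once munged is running, a round trip through the munge and unmunge client tools is a simple way to confirm that credentials can be created and validated. The sketch below wraps that in a function. The name munge_roundtrip is hypothetical, and the socket path in the usage comment assumes the one passed to munged above.

```shell
# munge_roundtrip SOCKET: encode an empty payload and decode it again via
# the munged instance listening on SOCKET.
# Returns 2 if the client tools are not on PATH.
# (munge_roundtrip is a hypothetical helper name.)
munge_roundtrip() {
    sock=$1
    for tool in munge unmunge; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on PATH" >&2
            return 2
        fi
    done
    # munge -n creates a credential without reading a payload from stdin.
    munge -n --socket="$sock" | unmunge --socket="$sock"
}

# Usage: munge_roundtrip "$HOME/local/var/run/munge/munge.socket.2"
```

If the daemon and key are set up correctly, unmunge should report a successful status. An error at this point usually points at the key, the socket path, or the daemon not running.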

Downloading and Compiling Slurm

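Download and compile Slurm itself, pointing configure at the local Munge installation so that the auth/munge plugin is built:

wget https://download.schedmd.com/slurm/slurm-21.08.5.tar.bz2
tar xjf slurm-21.08.5.tar.bz2
cd slurm-21.08.5
./configure --prefix=$HOME/slurm --with-munge=$HOME/local
make
make install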

Update your PATH to include the Slurm client commands (bin) and daemons (sbin):

echo 'export PATH=$HOME/slurm/bin:$HOME/slurm/sbin:$PATH' >> ~/.bashrc
source ~/.bashrc

Setting Up Slurm Configuration

Create a minimal Slurm configuration file:

mkdir -p $HOME/slurm/etc
nano $HOME/slurm/etc/slurm.conf

Paste the following content into slurm.conf, replacing <your_hostname> with the output of hostname -s, <your_username> with your user name, and <your_home> with the absolute path of your home directory. Slurm does not expand shell syntax such as $HOME or $(whoami) in this file, and each NodeName and PartitionName definition must be written on a single line. SlurmdUser is set so that slurmd does not expect to run as root:

ControlMachine=<your_hostname>
AuthType=auth/munge
CryptoType=crypto/munge
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=<your_home>/slurm/var/run/slurmctld.pid
SlurmctldPort=50000 # Use a high-numbered port
SlurmdPidFile=<your_home>/slurm/var/run/slurmd.pid
SlurmdPort=50001 # Use a high-numbered port
SlurmdSpoolDir=<your_home>/slurm/var/spool/slurmd
SlurmUser=<your_username>
SlurmdUser=<your_username>
StateSaveLocation=<your_home>/slurm/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0

NodeName=<your_hostname> CPUs=1 State=UNKNOWN
PartitionName=debug Nodes=<your_hostname> Default=YES MaxTime=INFINITE State=UP
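
Because slurm.conf is read literally and shell syntax such as $HOME or $(whoami) is not expanded by Slurm, one option is to let the shell write the file with the real values filled in. The sketch below does that for the single-node setup in this guide, with each NodeName and PartitionName entry on one line. The name render_slurm_conf is hypothetical. The sketch also sets SlurmdUser to the same account, on the assumption that slurmd runs as your unprivileged user.

```shell
# render_slurm_conf: print a single-node slurm.conf with the current user's
# home directory, user name, and short hostname filled in.
# (render_slurm_conf is a hypothetical helper name for this sketch.)
render_slurm_conf() {
    host=$(uname -n)
    host=${host%%.*}          # strip any domain part, like hostname -s
    user=$(id -un)
    base="$HOME/slurm"
    cat <<EOF
ControlMachine=$host
AuthType=auth/munge
CryptoType=crypto/munge
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=$base/var/run/slurmctld.pid
SlurmctldPort=50000
SlurmdPidFile=$base/var/run/slurmd.pid
SlurmdPort=50001
SlurmdSpoolDir=$base/var/spool/slurmd
SlurmUser=$user
SlurmdUser=$user
StateSaveLocation=$base/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0

NodeName=$host CPUs=1 State=UNKNOWN
PartitionName=debug Nodes=$host Default=YES MaxTime=INFINITE State=UP
EOF
}

# Usage: render_slurm_conf > "$HOME/slurm/etc/slurm.conf"
```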

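Note: Slurm’s default ports (6817 for slurmctld and 6818 for slurmd) are already unprivileged, so running without sudo does not by itself force a change. Choosing other high-numbered ports such as 50000 and 50001 is still sensible, because it avoids conflicts with a system-wide Slurm installation or other services on the same machine.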
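
On a typical Linux system, only root can bind ports below 1024, so any port chosen for an unprivileged Slurm daemon has to lie above that range. The short sketch below checks a candidate port number. The name is_unprivileged_port is hypothetical, and the function only validates the range; it does not test whether another process already uses the port.

```shell
# is_unprivileged_port PORT: succeed if PORT is a number in 1024..65535,
# the range a non-root process can normally bind on Linux.
# (is_unprivileged_port is a hypothetical helper name.)
is_unprivileged_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -ge 1024 ] && [ "$1" -le 65535 ]
}

for port in 50000 50001; do
    if is_unprivileged_port "$port"; then
        echo "port $port: usable by a non-root daemon"
    else
        echo "port $port: privileged or invalid"
    fi
done
```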

Create Slurm directories:

mkdir -p $HOME/slurm/var/run $HOME/slurm/var/spool/slurmctld $HOME/slurm/var/spool/slurmd $HOME/slurm/var/log

Starting Slurm Services

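Start the Slurm controller (slurmctld) and the compute node daemon (slurmd). Both binaries are installed in the sbin directory:

$HOME/slurm/sbin/slurmctld -D -c -f $HOME/slurm/etc/slurm.conf &
$HOME/slurm/sbin/slurmd -D -c -f $HOME/slurm/etc/slurm.conf &

The -D flag keeps each daemon from detaching, so its log messages appear in the terminal, which is useful for debugging; the trailing & puts the process in the shell’s background. The -c flag starts each daemon without state saved by a previous run. To let the daemons detach on their own, drop both -D and the trailing &.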
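
To confirm that both daemons stayed up after launch, you can look at the PID files configured in slurm.conf. The sketch below assumes the daemons have written those files. The name pid_alive is hypothetical.

```shell
# pid_alive PIDFILE: succeed if PIDFILE holds the PID of a running process
# that the current user is allowed to signal.
# (pid_alive is a hypothetical helper name for this sketch.)
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # file is empty or not a plain number
    esac
    # Signal 0 performs the permission and existence check without
    # actually sending a signal.
    kill -0 "$pid" 2>/dev/null
}

# Usage:
#   pid_alive "$HOME/slurm/var/run/slurmctld.pid" && echo "slurmctld is running"
#   pid_alive "$HOME/slurm/var/run/slurmd.pid"    && echo "slurmd is running"
```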

Verifying Your Slurm Installation

To check if Slurm is running correctly, use the sinfo command:

sinfo

This command should display information about the Slurm nodes and partitions, indicating a successful local installation.
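
Beyond sinfo, submitting a trivial job exercises the whole path from controller to node. The sketch below writes a small batch script and submits it if sbatch is on the PATH. The partition name debug matches the slurm.conf in this guide, while the job name and output file name are arbitrary choices for illustration.

```shell
# Write a tiny batch script to a temporary file. The #SBATCH lines are
# directives read by sbatch; a plain shell treats them as comments.
job_script="$(mktemp "${TMPDIR:-/tmp}/hello-slurm.XXXXXX")"
cat > "$job_script" <<'EOF'
#!/bin/sh
#SBATCH --job-name=hello
#SBATCH --partition=debug
#SBATCH --ntasks=1
#SBATCH --output=hello-%j.out
echo "Running on $(uname -n) as job ${SLURM_JOB_ID:-unknown}"
EOF

# Submit only if the Slurm client tools are available.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$job_script" || echo "submission failed; check that slurmctld is running"
else
    echo "sbatch not found on PATH; script left at $job_script"
fi
```

After submission, squeue lists the job while it is pending or running, and the output lands in a file named after the job ID in the directory you submitted from.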

Conclusion

While Ubuntu Server is a preferred and excellent operating system for deploying Slurm in production environments due to its robustness and support, it is not strictly necessary, especially for development or personal use. You can successfully install and run Slurm on Ubuntu, even without sudo privileges, by compiling from source and configuring services locally. This method allows you to explore Slurm’s capabilities and experiment with job scheduling in a contained environment. For larger, production-grade clusters, a dedicated server environment like Ubuntu Server, with proper administrative setup, remains the recommended approach for optimal performance and manageability.
