
Snake Cluster Box: Portable Computation Cluster

We introduce a Snakemake- and Conda-enabled, multi-node Slurm computation cluster that can be launched from a single USB stick and operates entirely in RAM. This setup lets us temporarily borrow computational resources from a lab. The software enables any researcher to establish a robust computation cluster from a single, straightforward configuration file, eliminating the need for specialized systems-engineering or administration knowledge. Moreover, it reduces reliance on centrally administered resources.

The source and more instructions can be found in the GitHub repository.

Introduction

We developed a solution using a customized Linux distribution that boots from a USB stick. After booting, the system runs entirely from a volatile RAM disk. Because no changes are made to the existing system, i.e., the hard disk is never mounted, a simple reboot returns the machine to its original state. This allows us to borrow computational resources temporarily.

To create our custom Linux system, we employed NixOS, a Linux distribution based on the Nix package manager that allows a fully declarative system configuration. Leveraging a NixOS feature known as specializations, we defined multiple system configuration variants, each tailored to a potential compute node. These specializations are nearly identical, differing only in authentication information and network configuration. Users choose the desired variant in the bootloader menu.

Upon booting, the entire operating system is copied into the PC’s memory, so the USB stick can be removed once the boot sequence completes. A single USB stick can thus be used to start multiple compute nodes in sequence. While keeping the operating system in memory reduces available memory by approximately 700 MB, this limitation is inconsequential for operations research problems, which are typically compute-bound rather than memory-bound.
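
To confirm that a booted node really runs from volatile memory, the mount table can be inspected on the node itself; a small sketch using standard tools (the exact tmpfs/overlay layout depends on the live-image internals):

# The root filesystem should be backed by RAM (tmpfs/overlay),
# not by the local hard disk.
findmnt /

# The local disks should show no mounted partitions.
lsblk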

At the core of the USB stick builder is a configuration file in TOML format. We presume an Ethernet network in which administrators assign static IP addresses. The configuration file therefore contains the static network gateway, authentication information, and a roster of hosts along with their assigned static IP addresses.

[network]
gateway = "10.152.41.254"
wireguard_net_prefix = "10.0.2."
dns = ...

[access]
main_user = "gu53rab2"
root_pw = ...
sshkeys = [ "ecdsa-sha2-nistp521 AAAAE2Vj.." ]


[[hosts]]
# there must be a single host called master
name = "master"
worker = true
static_ip = "10.152.41.80"

[[hosts]]
name = "worker1"
worker = true
static_ip = "10.152.41.81"

The builder script uses the information in the configuration file to construct a customized Linux image suitable for burning onto a CD-ROM or USB stick. An NFS file server is started on the primary node so that a shared home folder can be mounted on all worker nodes. Furthermore, a Slurm control server is launched on the primary node, accompanied by Slurm compute nodes on each worker node. For network security, a WireGuard Virtual Private Network (VPN) spanning all hosts is established, and all other connections are restricted through the system firewall. During the image build, we automatically generate WireGuard private keys and configure a mesh network.

Leveraging NixOS’s FHS user environments (buildFHSUserEnv), we create an isolated software environment for the Slurm compute service that contains only the Conda package manager. This approach ensures that only minimal system software is available to jobs submitted to the Slurm cluster, compelling users to define a reproducible environment through the Conda environment manager. In our research workflows, we employ Snakemake to declaratively specify preprocessing and computation pipelines. To streamline this process, we preconfigure Snakemake on the primary node to automatically submit all jobs to the local Slurm cluster. Given Snakemake’s native support for the Conda package manager, it automatically recreates an identical software environment on each compute node. As NixOS is fully reproducible, we ensure an identical software environment down to the Linux kernel.
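
Once a node has booted, these components can be inspected with standard tools; a minimal sketch (the wg0 interface name and the /home mount point are assumptions based on the setup described above):

# List the WireGuard peers of the mesh network (assumes interface wg0).
sudo wg show wg0

# Verify that the shared home folder is mounted from the master’s NFS server.
findmnt -t nfs,nfs4 /home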

Utilizing the cluster is straightforward. We first identify a set of available machines, preferably with identical hardware configurations. We then adjust the TOML configuration and build the custom USB stick image. For each machine, we connect the USB stick, reboot into our customized Linux distribution, select the machine’s name in the boot menu, briefly wait for the boot process to complete, remove the USB stick, and proceed to the next machine. The nodes seamlessly join the WireGuard mesh network and register with the Slurm control node. Once all machines are running, we use SSH to connect to the primary node, retrieve our experiments from a network repository, execute the Snakemake run command, and wait for all jobs to complete. Afterwards, we retrieve the job data from the primary node and reboot the machines, which then start in their original configuration. Notably, the Slurm architecture permits adding and removing worker nodes at any time, automatically redistributing outstanding jobs.
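
Once all nodes are up, a short sanity check from the primary node might look as follows (standard Slurm commands; adjust the node count to your configuration):

# Every booted worker should be listed, ideally in the idle state.
sinfo

# Run a trivial job across two nodes to confirm that scheduling works.
srun --nodes=2 hostname

# The queue should be empty again afterwards.
squeue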

Build Prerequisites

Building the image requires a clone of the repository and a working installation of the Nix package manager on the build machine.

Preparing the USB Stick

The GitHub repository provides the tools to build a custom ISO that can be burned onto a USB stick. The network configuration and authentication information are embedded in the image; no job- or experiment-specific data is included. Consequently, the USB stick can be reused for various projects within the same lab.

  • Clone the repository and adapt the configuration file to suit your requirements. All fields except root_pw are mandatory.

  • Execute ./buildIso.sh to initiate the build process. This downloads and builds a Linux image from scratch, which may take some time. Once completed, the ISO image is placed in the result folder.

  • Burn the ISO onto a USB stick or CD (see the dd sketch after this list).

  • This stick can be used to boot any node of the cluster. As the cluster runs entirely in RAM, the stick can be removed after boot and used to start multiple machines in sequence. The appropriate profile can be chosen in the boot menu. The master node must always be booted first.
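
On Linux, the image can be written to a stick with dd; a minimal sketch, assuming the ISO ends up under result/ and /dev/sdX is the stick’s device node (adapt both; dd irrevocably overwrites the target device):

# Identify the USB stick’s device node first; dd destroys all data on it.
lsblk

# Write the built image (the path assumes the Nix build output layout).
sudo dd if=result/iso/*.iso of=/dev/sdX bs=4M status=progress conv=fsync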

Submitting Jobs & Software Environments

The cluster behaves like a normal Slurm cluster, but we have added some extensions for Snakemake workflows, which we generally prefer to run. The master node provides a slurmmake binary that behaves like the standard snakemake binary but automatically submits all jobs to the Slurm cluster. Additionally, it enables the Conda package manager.

An example workflow could therefore look as follows:

ssh -A master_node
git clone https://example.com/research
cd research
slurmmake
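
Since the cluster behaves like any other Slurm installation, plain batch submission works as well; a generic sketch with an illustrative job script:

# Create a minimal Slurm batch script and submit it.
cat > job.sh <<'EOF'
#!/usr/bin/env bash
#SBATCH --job-name=smoke-test
#SBATCH --ntasks=1
srun hostname
EOF
sbatch job.sh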

Note that the entire cluster runs in RAM. Any results must be extracted from the master node before reboot!
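
For example, the results directory can be pulled to a permanent machine with rsync over SSH before the nodes are rebooted (the paths are illustrative):

# Everything in RAM is lost on reboot; copy results to local disk first.
rsync -avz master_node:research/results/ ./results/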

Trust and Security Model

We adhere to the following trust model:

  1. The system is reasonably secure against other users on the network.
    • The cluster’s internal communications are secured using a WireGuard VPN mesh network.
    • Outside access is restricted to SSH with public key authentication; all other ports are closed by a firewall.
  2. We assume that any authorized user (SSH, local root) is trustworthy.
  3. The system boots into a login screen, so the machine can be left unattended provided a strong root_pw is set.
  4. The USB stick contains all secrets; if lost, the system must be considered compromised.
  5. This software is experimental and has not undergone a security audit.