Getting started with Charmed HPC¶
This tutorial takes you through multiple aspects of Charmed HPC, such as:
Building a small Charmed HPC cluster with a shared filesystem
Preparing and submitting a multi-node batch job to your Charmed HPC cluster’s workload scheduler
Creating and using a container image to provide the runtime environment for a submitted batch job
By the end of this tutorial, you will have worked with a variety of open source projects, such as:
Multipass
Juju
Charms
Apptainer
Ceph
Slurm
This tutorial assumes that you have had some exposure to high-performance computing concepts such as batch scheduling, but does not assume prior experience building HPC clusters. This tutorial also does not expect you to have any prior experience with Multipass, Juju, Apptainer, Ceph, or Slurm.
Using Charmed HPC in production
The Charmed HPC cluster built in this tutorial is for learning purposes and should not be used as the basis for a production HPC cluster. For more in-depth steps on how to deploy a fully operational Charmed HPC cluster, see Charmed HPC’s How-to guides.
Prerequisites¶
To successfully complete this tutorial, you will need:
At least 8 CPU cores, 16GB RAM, and 40GB storage available
An active internet connection
Create a virtual machine with Multipass¶
First, download a copy of the cloud initialization (cloud-init) file, charmed-hpc-tutorial-cloud-init.yml, that defines the underlying cloud infrastructure for the virtual machine. For this tutorial, the file includes instructions for creating and configuring your LXD machine cloud localhost with the charmed-hpc-controller Juju controller, and for creating workload and submit scripts for the example jobs. The cloud-init step is completed as part of the virtual machine launch and is not something you need to set up manually. You can expand the dropdown below to view the full cloud-init file before downloading it onto your local system:
charmed-hpc-tutorial-cloud-init.yml
#cloud-config

# Ensure VM is fully up-to-date; multipass does not support reboots.
# See: https://github.com/canonical/multipass/issues/4199
# Package management
package_reboot_if_required: false
package_update: true
package_upgrade: true

# Install prerequisites
snap:
  commands:
    00: snap install juju --channel=3/stable
    01: snap install lxd --channel=6/stable

# Configure and initialize prerequisites
lxd:
  init:
    storage_backend: dir

# Commands to run at the end of the cloud-init process
runcmd:
  - lxc network set lxdbr0 ipv6.address none
  - su ubuntu -c 'juju bootstrap localhost charmed-hpc-controller'

# Write files to the Multipass instance
write_files:
  # MPI workload dependencies
  - path: /home/ubuntu/mpi_hello_world.c
    owner: ubuntu:ubuntu
    permissions: !!str "0664"
    defer: true
    content: |
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char** argv) {
          // Initialize the MPI environment
          MPI_Init(NULL, NULL);

          // Get the number of nodes
          int size;
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          // Get the rank of the process
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          // Get the name of the node
          char node_name[MPI_MAX_PROCESSOR_NAME];
          int name_len;
          MPI_Get_processor_name(node_name, &name_len);

          // Print hello world message
          printf("Hello world from node %s, rank %d out of %d nodes\n",
                 node_name, rank, size);

          // Finalize the MPI environment.
          MPI_Finalize();
      }
  - path: /home/ubuntu/submit_hello.sh
    owner: ubuntu:ubuntu
    permissions: !!str "0664"
    defer: true
    content: |
      #!/usr/bin/env bash
      #SBATCH --job-name=hello_world
      #SBATCH --partition=tutorial-partition
      #SBATCH --nodes=2
      #SBATCH --error=error.txt
      #SBATCH --output=output.txt

      mpirun ./mpi_hello_world
  # Container workload dependencies
  - path: /home/ubuntu/generate.py
    owner: ubuntu:ubuntu
    permissions: !!str "0664"
    defer: true
    content: |
      #!/usr/bin/env python3

      """Generate example dataset for workload."""

      import argparse

      from faker import Faker
      from faker.providers import DynamicProvider
      from pandas import DataFrame


      faker = Faker()
      favorite_lts_mascot = DynamicProvider(
          provider_name="favorite_lts_mascot",
          elements=[
              "Dapper Drake",
              "Hardy Heron",
              "Lucid Lynx",
              "Precise Pangolin",
              "Trusty Tahr",
              "Xenial Xerus",
              "Bionic Beaver",
              "Focal Fossa",
              "Jammy Jellyfish",
              "Noble Numbat",
          ],
      )
      faker.add_provider(favorite_lts_mascot)


      def main(rows: int) -> None:
          df = DataFrame(
              [
                  [faker.email(), faker.country(), faker.favorite_lts_mascot()]
                  for _ in range(rows)
              ],
              columns=["email", "country", "favorite_lts_mascot"],
          )
          df.to_csv("favorite_lts_mascot.csv")


      if __name__ == "__main__":
          parser = argparse.ArgumentParser()
          parser.add_argument(
              "--rows", type=int, default=1, help="Rows of fake data to generate"
          )
          args = parser.parse_args()

          main(rows=args.rows)
  - path: /home/ubuntu/workload.py
    owner: ubuntu:ubuntu
    permissions: !!str "0664"
    defer: true
    content: |
      #!/usr/bin/env python3

      """Plot the most popular Ubuntu LTS mascot."""

      import argparse
      import os

      import pandas as pd
      import plotext as plt

      def main(dataset: str | os.PathLike, file: str | os.PathLike) -> None:
          df = pd.read_csv(dataset)
          mascots = df["favorite_lts_mascot"].value_counts().sort_index()

          plt.simple_bar(
              mascots.index,
              mascots.values,
              title="Favorite LTS mascot",
              color="orange",
              width=150,
          )

          if file:
              plt.save_fig(
                  file if os.path.isabs(file) else f"{os.getcwd()}/{file}",
                  keep_colors=True
              )
          else:
              plt.show()

      if __name__ == "__main__":
          parser = argparse.ArgumentParser()
          parser.add_argument("dataset", type=str, help="Path to CSV dataset to plot")
          parser.add_argument(
              "-o",
              "--output",
              type=str,
              default="",
              help="Output file to save plotted graph",
          )
          args = parser.parse_args()

          main(args.dataset, args.output)
  - path: /home/ubuntu/workload.def
    owner: ubuntu:ubuntu
    permissions: !!str "0664"
    defer: true
    content: |
      bootstrap: docker
      from: ubuntu:24.04

      %files
          generate.py /usr/bin/generate
          workload.py /usr/bin/workload

      %environment
          export PATH=/usr/bin/venv/bin:${PATH}
          export PYTHONPATH=/usr/bin/venv:${PYTHONPATH}

      %post
          export DEBIAN_FRONTEND=noninteractive
          apt-get update -y
          apt-get install -y python3-dev python3-venv

          python3 -m venv /usr/bin/venv
          alias python3=/usr/bin/venv/bin/python3
          alias pip=/usr/bin/venv/bin/pip

          pip install -U faker
          pip install -U pandas
          pip install -U plotext

          chmod 755 /usr/bin/generate
          chmod 755 /usr/bin/workload

      %runscript
          exec workload "$@"
  - path: /home/ubuntu/submit_apptainer_mascot.sh
    owner: ubuntu:ubuntu
    permissions: !!str "0664"
    defer: true
    content: |
      #!/usr/bin/env bash
      #SBATCH --job-name=favorite-lts-mascot
      #SBATCH --partition=tutorial-partition
      #SBATCH --nodes=2
      #SBATCH --error=mascot_error.txt
      #SBATCH --output=mascot_output.txt

      apptainer exec workload.sif generate --rows 1000000
      apptainer run workload.sif favorite_lts_mascot.csv --output graph.out
From the local directory holding the cloud-init file, launch a virtual machine using Multipass:
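A launch command along the following lines works; the instance name charmed-hpc-tutorial and the resource flags (sized to match the prerequisites above) are assumptions for this sketch:

```shell
# Launch an Ubuntu 24.04 VM and apply the downloaded cloud-init file.
multipass launch 24.04 --name charmed-hpc-tutorial \
    --cpus 8 --memory 16G --disk 40G \
    --cloud-init charmed-hpc-tutorial-cloud-init.yml
```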
The virtual machine launch process should take five minutes or less to complete, but may take longer depending on network speed. Upon completion of the launch process, check the status of cloud-init to confirm that all processes completed successfully.
Enter the virtual machine:
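Assuming the virtual machine was launched with the name charmed-hpc-tutorial:

```shell
# Open an interactive shell inside the Multipass instance.
multipass shell charmed-hpc-tutorial
```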
Then check the cloud-init status:
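Inside the virtual machine, the cloud-init command-line tool reports the overall status:

```shell
# Block until cloud-init finishes, then print the final status.
cloud-init status --wait
```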
If the status shows done and there are no errors, then you are ready to move on to deploying the cluster charms.
Get compute nodes ready for jobs¶
Now that Slurm and the filesystem have been successfully deployed, the next step is to set up the compute nodes themselves. The compute nodes must be moved from the down state to the idle state so that jobs can run on them. First, check that the compute nodes are still down, which will show something similar to:
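One way to check the node state is to run sinfo on the login node (the sackd/0 unit seen later in this tutorial) through Juju:

```shell
# List partitions and node states; newly enrolled nodes report "down".
juju ssh sackd/0 -- sinfo
```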
Then, bring up the compute nodes:
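A sketch of one way to do this with scontrol; here <nodelist> is a placeholder for the node names reported by sinfo, and the command must run with Slurm administrator privileges:

```shell
# Return the listed down nodes to service so they can accept jobs.
scontrol update nodename=<nodelist> state=resume
```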
And verify that the STATE is now set to idle, which should now show:
Copy files onto cluster¶
The workload files that were created during the cloud initialization step now need to be copied onto the cluster filesystem from the virtual machine filesystem. First you will make the new example directories, then set appropriate permissions, and finally copy the files over:
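A sketch of these steps, run from inside the tutorial virtual machine; the subdirectory names under /scratch and the wide-open permissions are assumptions for this learning environment:

```shell
# Create the example directories on the shared filesystem via the login node.
juju ssh sackd/0 -- sudo mkdir -p /scratch/mpi_example /scratch/apptainer_example
juju ssh sackd/0 -- sudo chmod -R 777 /scratch

# Copy the workload files from the VM's home directory onto the cluster.
juju scp ~/mpi_hello_world.c sackd/0:/scratch/mpi_example/
juju scp ~/submit_hello.sh sackd/0:/scratch/mpi_example/
juju scp ~/generate.py sackd/0:/scratch/apptainer_example/
juju scp ~/workload.py sackd/0:/scratch/apptainer_example/
juju scp ~/workload.def sackd/0:/scratch/apptainer_example/
juju scp ~/submit_apptainer_mascot.sh sackd/0:/scratch/apptainer_example/
```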
The /scratch directory is mounted on the compute nodes and will be read from and written to during the batch jobs.
Run a batch job¶
In the following steps, you will compile a small Hello World MPI script and run it by submitting a batch job to Slurm.
Compile¶
First, SSH into the login node, sackd/0:
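Juju provides SSH access to the unit directly:

```shell
juju ssh sackd/0
```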
This will place you in your home directory, /home/ubuntu. Next, move to the /scratch/mpi_example directory, install the Open MPI libraries needed for compiling, and then compile the mpi_hello_world.c file by running the mpicc command:
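A sketch of the compile steps, assuming the Ubuntu archive's Open MPI packages:

```shell
cd /scratch/mpi_example

# Install the Open MPI compiler wrapper and runtime.
sudo apt-get update && sudo apt-get install -y openmpi-bin libopenmpi-dev

# Compile the example into an mpi_hello_world binary.
mpicc mpi_hello_world.c -o mpi_hello_world
```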
For quick referencing, the two files for the MPI Hello World example are provided in dropdowns here:
mpi_hello_world.c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of nodes
    int size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Get the rank of the process
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Get the name of the node
    char node_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(node_name, &name_len);

    // Print hello world message
    printf("Hello world from node %s, rank %d out of %d nodes\n",
           node_name, rank, size);

    // Finalize the MPI environment.
    MPI_Finalize();
}
submit_hello.sh
#!/usr/bin/env bash
#SBATCH --job-name=hello_world
#SBATCH --partition=tutorial-partition
#SBATCH --nodes=2
#SBATCH --error=error.txt
#SBATCH --output=output.txt

mpirun ./mpi_hello_world
Submit batch job¶
Now, submit your batch job to the queue using sbatch:
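From the /scratch/mpi_example directory on the login node:

```shell
sbatch submit_hello.sh
```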
Your job will complete after a few seconds. The generated output.txt file will look similar to the following:
The batch job successfully spread the MPI job across two nodes that were able to report back their MPI rank to a shared output file.
Run a container job¶
Next, you will go through the steps to generate a random sample of Ubuntu mascot votes and plot the results. The process requires Python and a few specific libraries, so you will use Apptainer to build a container image and run the job on the cluster.
Set up Apptainer¶
Apptainer must be deployed and integrated with the existing Slurm deployment using Juju. These steps need to be completed from the charmed-hpc-tutorial environment; to return to that environment from within sackd/0, use the exit command.
Deploy and integrate Apptainer:
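A sketch of the Juju commands involved; the apptainer charm name and the integration target are assumptions here, so check the charm's documentation for the exact channel and relations:

```shell
# Deploy the apptainer charm and relate it to the compute nodes.
juju deploy apptainer
juju integrate apptainer slurmd
```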
After a few minutes, juju status should look similar to the following:
Build the container image using apptainer¶
Before you can submit your container workload to your Charmed HPC cluster, you must build the container image from the build recipe. The build recipe file workload.def defines the environment and libraries that will be in the container image.
To build the image, return to the cluster login node, move to the example directory, and call apptainer build:
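Assuming the workload files were copied to /scratch/apptainer_example earlier in the tutorial:

```shell
cd /scratch/apptainer_example

# Build the workload.sif image from the workload.def build recipe.
apptainer build workload.sif workload.def
```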
The files for the Apptainer Mascot Vote example are provided here for reference.
generate.py
#!/usr/bin/env python3

"""Generate example dataset for workload."""

import argparse

from faker import Faker
from faker.providers import DynamicProvider
from pandas import DataFrame


faker = Faker()
favorite_lts_mascot = DynamicProvider(
    provider_name="favorite_lts_mascot",
    elements=[
        "Dapper Drake",
        "Hardy Heron",
        "Lucid Lynx",
        "Precise Pangolin",
        "Trusty Tahr",
        "Xenial Xerus",
        "Bionic Beaver",
        "Focal Fossa",
        "Jammy Jellyfish",
        "Noble Numbat",
    ],
)
faker.add_provider(favorite_lts_mascot)


def main(rows: int) -> None:
    df = DataFrame(
        [
            [faker.email(), faker.country(), faker.favorite_lts_mascot()]
            for _ in range(rows)
        ],
        columns=["email", "country", "favorite_lts_mascot"],
    )
    df.to_csv("favorite_lts_mascot.csv")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--rows", type=int, default=1, help="Rows of fake data to generate"
    )
    args = parser.parse_args()

    main(rows=args.rows)
workload.py
#!/usr/bin/env python3

"""Plot the most popular Ubuntu LTS mascot."""

import argparse
import os

import pandas as pd
import plotext as plt

def main(dataset: str | os.PathLike, file: str | os.PathLike) -> None:
    df = pd.read_csv(dataset)
    mascots = df["favorite_lts_mascot"].value_counts().sort_index()

    plt.simple_bar(
        mascots.index,
        mascots.values,
        title="Favorite LTS mascot",
        color="orange",
        width=150,
    )

    if file:
        plt.save_fig(
            file if os.path.isabs(file) else f"{os.getcwd()}/{file}",
            keep_colors=True
        )
    else:
        plt.show()

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("dataset", type=str, help="Path to CSV dataset to plot")
    parser.add_argument(
        "-o",
        "--output",
        type=str,
        default="",
        help="Output file to save plotted graph",
    )
    args = parser.parse_args()

    main(args.dataset, args.output)
workload.def
bootstrap: docker
from: ubuntu:24.04

%files
    generate.py /usr/bin/generate
    workload.py /usr/bin/workload

%environment
    export PATH=/usr/bin/venv/bin:${PATH}
    export PYTHONPATH=/usr/bin/venv:${PYTHONPATH}

%post
    export DEBIAN_FRONTEND=noninteractive
    apt-get update -y
    apt-get install -y python3-dev python3-venv

    python3 -m venv /usr/bin/venv
    alias python3=/usr/bin/venv/bin/python3
    alias pip=/usr/bin/venv/bin/pip

    pip install -U faker
    pip install -U pandas
    pip install -U plotext

    chmod 755 /usr/bin/generate
    chmod 755 /usr/bin/workload

%runscript
    exec workload "$@"
submit_apptainer_mascot.sh
#!/usr/bin/env bash
#SBATCH --job-name=favorite-lts-mascot
#SBATCH --partition=tutorial-partition
#SBATCH --nodes=2
#SBATCH --error=mascot_error.txt
#SBATCH --output=mascot_output.txt

apptainer exec workload.sif generate --rows 1000000
apptainer run workload.sif favorite_lts_mascot.csv --output graph.out
Use the image to run jobs¶
Now that you have built the container image, you can submit a job to the cluster that uses the new workload.sif image to generate one million lines in a table and then uses the resulting favorite_lts_mascot.csv to build the bar plot:
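From the /scratch/apptainer_example directory on the login node:

```shell
sbatch submit_apptainer_mascot.sh
```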
To view the status of the job while it is running, run squeue.
Once the job has completed, view the generated bar plot, which will look similar to the following:
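The plot can be printed in the terminal; assuming the job ran from /scratch/apptainer_example, graph.out is saved next to the job outputs:

```shell
cat /scratch/apptainer_example/graph.out
```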
Summary and clean up¶
In this tutorial, you:
Deployed and integrated Slurm and a shared filesystem
Launched an MPI batch job and saw results communicated across nodes
Built a container image with Apptainer and used it to run a batch job and generate a bar plot
Now that you have completed the tutorial, if you would like to completely remove the virtual machine, return to your local terminal and multipass delete the virtual machine as follows:
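Assuming the instance name charmed-hpc-tutorial used when launching the virtual machine:

```shell
# Delete the instance, then permanently remove all deleted instances.
multipass delete charmed-hpc-tutorial
multipass purge
```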
Next steps¶
Now that you have gotten started with Charmed HPC, check out the Explanation section for details on important concepts and the How-to guides for how to use more of Charmed HPC’s features.