How to deploy a shared filesystem

Charmed HPC supports automatic integration with shared filesystems through the filesystem-client charm. This how-to guide shows you how to deploy filesystem-client and integrate it with externally managed shared filesystems.

Note

If you plan on using Terraform to handle your deployment, the charmed-hpc-terraform repository also provides Terraform modules to set up an NFS server managed by Azure, along with an example of how to deploy the module.
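
As a rough sketch of that workflow, assuming the repository is published under the charmed-hpc GitHub organization, you would clone it, change into the example's directory (see the repository for its exact path and required variables), and run the usual Terraform commands:

git clone https://github.com/charmed-hpc/charmed-hpc-terraform
cd charmed-hpc-terraform/<path to the NFS example>
terraform init
terraform apply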

Prerequisites

Deploy an external filesystem server

External servers that provide a shared filesystem cannot be integrated directly. Instead, a proxy charm is used to expose the required connection information to applications managed by Juju.

To integrate with an external NFS server, you will require:

  • An externally managed NFS server.

  • The server’s hostname.

  • The exported path.

  • (Optional) The server's port.

Each public cloud has its own procedure for deploying an NFS server. Links to the setup procedures for a few well-known public clouds are provided below.

  • Amazon Web Services: setup information at docs.aws.amazon.com

  • Microsoft Azure: setup information at learn.microsoft.com

However, if only a minimal server for testing is necessary, a small NFS server can be set up with LXD.

Deploy an NFS server on LXD

First, launch a virtual machine using LXD:

snap install lxd
lxd init --auto
lxc launch ubuntu:24.04 nfs-server --vm
lxc shell nfs-server

Inside the LXD virtual machine, set up an NFS kernel server that exports a /data directory:

apt update && apt -y upgrade
apt -y install nfs-kernel-server
mkdir -p /data
cat << 'EOF' > /etc/exports
/srv     *(ro,sync,subtree_check)
/data    *(rw,sync,no_subtree_check,no_root_squash)
EOF
exportfs -a
systemctl restart nfs-kernel-server

Note

You can verify if the NFS server is exporting the desired directories by using the command showmount -e localhost while inside the LXD virtual machine.
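
For the exports configured above, the output should look roughly like the following:

Export list for localhost:
/data *
/srv  *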

Grab the network address of the LXD virtual machine and exit the current shell session:

hostname -I
exit

After gathering all the required information, you can deploy the nfs-server-proxy charm to expose the externally managed server inside a Juju model.

juju deploy nfs-server-proxy \
    --config hostname=<server hostname> \
    --config path=<exported path> \
    --config port=<server port>
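
Once the deployment settles, you can confirm that the proxy application is active with:

juju status nfs-server-proxy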

To integrate with an external CephFS share, you will require:

  • The unique identifier of the cluster (commonly known as fsid).

  • The name of the filesystem within the Ceph cluster.

  • The exported path of the filesystem.

  • The list of hostnames for MON nodes of the Ceph cluster.

  • The username with permissions to access the filesystem.

  • The cephx key for the username.

Deploy a CephFS server on LXD

Here, a Ceph cluster will be set up using MicroCeph.

First, launch a virtual machine using LXD:

snap install lxd
lxd init --auto
lxc launch ubuntu:24.04 cephfs-server --vm
lxc shell cephfs-server

Inside the LXD virtual machine, set up MicroCeph to export a Ceph filesystem:

# Set up the environment
ln -s /bin/true /usr/local/bin/udevadm
apt-get -y update
apt-get -y install ceph-common jq
snap install microceph

# Bootstrap MicroCeph
microceph cluster bootstrap

# Add three 2G loop disks to MicroCeph for storage
microceph disk add loop,2G,3
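
Before creating the filesystem, it is worth confirming that the cluster is healthy and that the loop disks were added as OSDs. The standard Ceph status command is available through the microceph.ceph alias:

microceph.ceph -s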

We will create two new pools, one for data and one for metadata, then assign them to a new filesystem named cephfs.

# Create a new data pool for our filesystem...
microceph.ceph osd pool create cephfs_data

# ... and a metadata pool for the same filesystem.
microceph.ceph osd pool create cephfs_metadata

# Create a new filesystem that uses the two created data pools.
microceph.ceph fs new cephfs cephfs_metadata cephfs_data
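
To confirm that the filesystem was created with the expected pools, you can list the filesystems:

microceph.ceph fs ls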

We will also use fs-client as the username for the clients and authorize it with read-write (rw) access to the whole directory tree (/):

microceph.ceph fs authorize cephfs client.fs-client / rw

Note

You can verify if the CephFS server is working correctly by using the command microceph.ceph fs status cephfs while inside the LXD virtual machine.

To gather the required information for proxying the externally managed Ceph filesystem:

export HOST=$(hostname -I | tr -d '[:space:]'):6789
export FSID=$(microceph.ceph -s -f json | jq -r '.fsid')
export CLIENT_KEY=$(microceph.ceph auth print-key client.fs-client)

Print the required information for reference and then exit the current shell session:

echo $HOST
echo $FSID
echo $CLIENT_KEY
exit

Having collected all the required information, you can deploy the cephfs-server-proxy charm to expose the externally managed Ceph filesystem inside a Juju model.

juju deploy cephfs-server-proxy \
    --config fsid=<value of $FSID> \
    --config sharepoint=cephfs:/ \
    --config monitor-hosts="<value of $HOST>" \
    --config auth-info=fs-client:<value of $CLIENT_KEY>
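
As with the NFS proxy, you can check that the application deployed correctly with:

juju status cephfs-server-proxy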

Deploy the filesystem-client

To deploy the filesystem-client charm, which mounts a shared filesystem on the cluster nodes:

juju deploy filesystem-client --channel latest/edge \
    --config mountpoint='/scratch' \
    --config noexec=true

The mountpoint configuration option sets the path where the filesystem will be mounted, while noexec=true requests that the filesystem be mounted with the noexec option, which prevents binaries from being executed directly from the share.
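
If you need to check or adjust these options after deployment, the standard juju config command works, for example:

# Show the currently configured mount point
juju config filesystem-client mountpoint

# Allow executing binaries from the mounted filesystem again
juju config filesystem-client noexec=false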

filesystem-client is a subordinate charm that automatically mounts any shared filesystems for the application it is related to. In this case, we will relate it to the slurmd application to provide shared storage across all the compute nodes in the cluster:

juju integrate slurmd:juju-info filesystem-client:juju-info

Relate the filesystem client with the filesystem provider

Every filesystem provider can be integrated with the filesystem client using the filesystem endpoint.

juju integrate filesystem-client:filesystem <filesystem-provider>:filesystem
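
For example, to relate the client to the nfs-server-proxy application deployed earlier:

juju integrate filesystem-client:filesystem nfs-server-proxy:filesystem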

Afterwards, verify that the filesystem can be read and written from the slurmd application machines:

juju ssh slurmd/0 -- touch /scratch/script.py
juju ssh slurmd/1 -- stat /scratch/script.py