This guide focuses on setting up a federation of Slurm clusters and Slurm multi-cluster.
Federation is a superset of multi-cluster. By setting up federation, you are also setting up multi-cluster.
If you are using the slurm_cluster Terraform module, refer to the multiple-slurmdbd section.
NOTE: slurmdbd and the database (e.g. mariadb, mysql, etc..).
Slurm supports creating a federation of clusters and scheduling jobs between them in a peer-to-peer fashion. Each job submitted to a federation receives a job ID that is unique among all clusters in the federation. A job is submitted to the local cluster (the cluster defined in slurm.conf) and is then replicated across the clusters in the federation. Each cluster then independently attempts to schedule the job according to its own scheduling policies. The clusters coordinate with the "origin" cluster (the cluster the job was submitted to) to schedule the job.
Each cluster in the federation independently attempts to schedule each job, except that it coordinates with the origin cluster to allocate resources to a federated job. When a cluster determines that it can attempt to allocate resources for a job, it communicates with the origin cluster to verify that no other cluster is attempting to allocate resources for that job at the same time.
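To illustrate how a federated job ID can identify its origin cluster, here is a minimal sketch. It assumes the layout described in Slurm's federation documentation, where the job ID is a 32-bit integer whose bits 0-25 hold the local job ID and bits 26-31 hold the origin cluster ID. The function names `fed_origin_id` and `fed_local_id` are hypothetical helpers, not Slurm commands.

```shell
#!/bin/sh
# Sketch only: assumes bits 0-25 = local job ID, bits 26-31 = origin cluster ID.

# Print the origin cluster ID encoded in a federated job ID.
fed_origin_id() {
    echo $(( $1 >> 26 ))
}

# Print the local job ID encoded in a federated job ID.
# 67108863 is 2^26 - 1, a mask for the low 26 bits.
fed_local_id() {
    echo $(( $1 & 67108863 ))
}
```

For example, `fed_origin_id 134217770` prints `2` and `fed_local_id 134217770` prints `42`, because 134217770 = 2 * 2^26 + 42.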
Slurm offers the ability to target commands to other clusters instead of, or in addition to, the local cluster on which the command is invoked. When this behavior is enabled, users can submit jobs to one or many clusters and receive status from those remote clusters.
When sbatch, salloc or srun is invoked with a cluster list, Slurm immediately submits the job to the cluster that offers the earliest start time, subject to its queue of pending and running jobs. Slurm makes no subsequent effort to migrate the job to a different cluster (from the list) whose resources become available when running jobs finish before their scheduled end times.
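As a rough sketch of multi-cluster usage, the helpers below wrap `sbatch` and `squeue` with the `--clusters` option. The cluster names `cluster1` and `cluster2`, the `run` dry-run wrapper, and the function names are assumptions for illustration; replace the names with clusters registered in your slurmdbd.

```shell
#!/bin/sh
# Hypothetical cluster names; use the names known to your slurmdbd.
CLUSTERS="cluster1,cluster2"

# Print the command instead of running it when DRY_RUN=1, so the sketch
# can be inspected on a machine without Slurm installed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Submit a batch script to whichever listed cluster offers the earliest start.
submit_multi() {
    run sbatch --clusters="$CLUSTERS" "$1"
}

# Show the job queues of all listed clusters.
queue_multi() {
    run squeue --clusters="$CLUSTERS"
}
```

Running `submit_multi job.sh` with `DRY_RUN=1` set only prints the `sbatch` command line, which can help confirm the cluster list before submitting for real.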
- Use slurmdbd.
- All clusters must be able to communicate with each slurmdbd and slurmctld.
- slurmdbd to database forms a one-to-one relationship.
- Each cluster must be able to communicate with slurmdbd.
  - Either all clusters and slurmdbd use the same MUNGE key.
  - Or all clusters have different MUNGE keys and use an alternative authentication method for slurmdbd.
- (Optional) Login nodes must be able to directly communicate with compute nodes (otherwise srun and salloc will fail).
- Deploy slurmdbd and a database (e.g. mariadb or mysql).
- Deploy Slurm clusters by any chosen method (e.g. cloud or hybrid).
  WARNING: This type of configuration is not supported by the slurm_cluster terraform module; see the multiple-slurmdbd section instead.
- Update slurm.conf with the accounting storage options:

      # slurm.conf
      AccountingStorageHost=<HOSTNAME/IP>
      AccountingStoragePort=<HOST_PORT>
      AccountingStorageUser=<USERNAME>
      AccountingStoragePass=<PASSWORD>

- Add the clusters to the federation:

      sacctmgr add federation <federation_name> [clusters=<list_of_clusters>]

- User UIDs and GIDs must be consistent across all federated clusters.
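One way to check the UID/GID requirement is to compare user database dumps from two clusters. The sketch below assumes each dump is in passwd format (for example, produced with `getent passwd > cluster_a.passwd` on each cluster); the function name `uid_gid_mismatches` and the file names are hypothetical.

```shell
#!/bin/sh
# Print the names of users that exist in both passwd-format files
# but have a different UID or GID in the second file.
uid_gid_mismatches() {
    awk -F: '
        # First file: remember "UID:GID" for each user name.
        NR == FNR { ids[$1] = $3 ":" $4; next }
        # Second file: report users whose "UID:GID" differs.
        ($1 in ids) && ids[$1] != ($3 ":" $4) { print $1 }
    ' "$1" "$2"
}
```

Calling `uid_gid_mismatches cluster_a.passwd cluster_b.passwd` prints nothing when every shared user has matching IDs. Users present on only one cluster are not reported by this sketch.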
- Deploy slurmdbds and databases (e.g. mariadb or mysql).
  NOTE: The slurm_cluster terraform module conflates the controller instance and the database instance.
- Deploy Slurm clusters by any chosen method (e.g. cloud or hybrid).
  WARNING: If using the slurm_cluster terraform module, do not use the cloudsql input, as this does not work with a federation setup.
- Update each slurm.conf with:

      # slurm.conf
      AccountingStorageExternalHost=<host/ip>[:port][,<host/ip>[:port]]

- Add the clusters to the federation:

      sacctmgr add federation <federation_name> [clusters=<list_of_clusters>]

- All clusters must know where each slurmdbd is.
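Because every cluster needs the address of each external slurmdbd, it can help to generate the AccountingStorageExternalHost line from a single host list. The following is a minimal sketch; the function name `external_host_line` and the host names `dbd-a` and `dbd-b` are hypothetical, and 6819 is used only as an example port.

```shell
#!/bin/sh
# Build an AccountingStorageExternalHost line from host[:port] arguments.
external_host_line() {
    list=""
    for host in "$@"; do
        # Add a comma separator before every entry except the first.
        list="${list:+$list,}$host"
    done
    echo "AccountingStorageExternalHost=$list"
}
```

For example, `external_host_line dbd-a:6819 dbd-b` prints `AccountingStorageExternalHost=dbd-a:6819,dbd-b`, which can then be placed in each cluster's slurm.conf.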