Here are a few Slurm examples to aid in getting started with different configurations.
Recall that the VACC clusters use the following partition names:

- general: The default partition for Slurm, used if no partition is specified. It assigns jobs to most of the nodes in the cluster.
- short: The debugging or "short task" partition. It has a default time of 30 minutes and a maximum time of three hours.
- nvpu: The Nvidia GPU partition. It has a maximum time of 48 hours and no memory limit. GPU-focused jobs that use the Nvidia GPUs should use this partition. Do not use this partition if your job does not use a GPU.
- hgnodes: The Nvidia H100 partition, which has restricted access. Please contact vacchelp@uvm.edu if you wish to see whether you qualify to use this partition.
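For example, the partition can be set inside a batch script with an "#SBATCH" directive, or overridden at submission time on the sbatch command line. A minimal sketch (the script name my-job.sh is only a placeholder):

```shell
# Inside a batch script, select the partition with a directive:
#SBATCH --partition=short

# ...or override whatever the script requests when you submit it:
sbatch --partition=short my-job.sh

# List the partitions visible to you, with time limits and node states:
sinfo --summarize
```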
Slurm examples:¶
Each example contains comments explaining the individual lines, and a short description above each script explains what it does.
Simple single threaded example.¶
This example launches a single-threaded program, that is, a program whose execution uses only one core no matter how many it could have had access to. Here we limit the job to one CPU with the "--ntasks" switch. Use a batch file like this one for running single-threaded applications on one core.
#!/bin/bash
#
# Simple single threaded job
#SBATCH --job-name=simple-single-threaded-job
#
# You can have it email you at the end, or if it fails:
#SBATCH --mail-type=END,FAIL
#
# Define who gets emailed: Ensure you spell your NetID correctly!
#SBATCH --mail-user=yourNetID@uvm.edu
#
# We have to define which partition we run in:
#SBATCH --partition=short
#
# Run all processes on a single node:
#SBATCH --nodes=1
#
# Run on a single CPU
#SBATCH --ntasks=1
#
# Request 1GB of memory (Kilobytes=K, Megabytes=M, Gigabytes=G, Terabytes=T)
#SBATCH --mem=1G
#
# Time limit is in hrs:min:sec (this is wall time). Here we request 5 minutes.
#SBATCH --time=00:05:00
#
# Standard output and error log:
#SBATCH --output=simple-single-threaded-job_%j.log
#
#
# Give some output to know we are running:
# Note, it is often useful to add these to see fail/success along the way.
echo "Running the simple single threaded job script"
# Run your code:
python ~/single-threaded.py
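The script above runs a program at ~/single-threaded.py. As a stand-in, hypothetical contents for that file might look like the following; the computation itself is only illustrative:

```python
# Hypothetical contents of ~/single-threaded.py: a purely sequential
# computation that only ever occupies one core.
total = sum(range(1_000_000))
print(f"Sum of 0..999999 is {total}")
```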
Simple multi-threaded example.¶
In this example, we run a job that uses multiple cores or threads, but still on a single node in the cluster. This is a good fit for multi-core R, Python, or MATLAB jobs. You can use the SLURM_CPUS_PER_TASK environment variable to see how many cores are available to your program. This is most likely what you will do after you confirm that your job runs and produces correct output: estimate the resources you want to allocate to the job and request them here.
#!/bin/bash
#
# Simple multi-core job
#SBATCH --job-name=simple-multi-core-job
#
# You can have it email you at the end, or if it fails:
#SBATCH --mail-type=END,FAIL
#
# You can have it email you. Ensure you spell your NetID correctly!
#SBATCH --mail-user=yourNetID@uvm.edu
#
# We have to define which partition we run in:
#SBATCH --partition=general
#
# Run all processes on a single node:
#SBATCH --nodes=1
#
# Run a single task:
#SBATCH --ntasks=1
#
# Request multiple CPU cores *per task*
#SBATCH --cpus-per-task=4
#
# Request 8GB of memory
#SBATCH --mem=8G
#
# Time limit is in hrs:min:sec. Here we request 1 hour 30 minutes.
#SBATCH --time=01:30:00
#
# Standard output and error log:
#SBATCH --output=simple-multi-core-job_%j.log
#
#
# Give some output to know we are running:
echo "Running the simple multi-core job script"
# Run your code:
~/multi-threaded-application
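Inside the job, a program can read SLURM_CPUS_PER_TASK to size its worker pool to match the allocation. A minimal Python sketch, assuming a multiprocessing workload (the squaring computation is only an illustration):

```python
import os
from multiprocessing import Pool

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is requested;
# fall back to 1 when running outside a Slurm job.
n_workers = int(os.environ.get("SLURM_CPUS_PER_TASK", "1"))

def square(x):
    return x * x

if __name__ == "__main__":
    # Spread the work across the cores Slurm allocated to this task.
    with Pool(processes=n_workers) as pool:
        results = pool.map(square, range(8))
    print(results)
```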