Slurm environment variables are a critical aspect for users of HPC clusters to understand, as they provide useful information about job execution and allow customization of Slurm behavior. They fall into three main categories:
1. Output Environment Variables:
These are set by Slurm for each job and include details about the job's run-time environment. Some common output environment variables include:
- `SLURM_JOB_ID`: The unique identifier for the current job allocation.
- `SLURM_JOB_NAME`: The name of the job.
- `SLURM_JOB_NODELIST`: List of nodes allocated to the job.
- `SLURM_JOB_PARTITION`: The partition the job is running on.
- `SLURM_JOB_NUM_NODES`: The number of nodes allocated to the job.
- `SLURM_PROCID`: The ID of the task (process) within the job.
- `SLURM_NTASKS`: The total number of tasks (processes) in the job.
To see the complete list of output environment variables, you can consult the Slurm documentation or run `man sbatch`, `man salloc`, or `man srun` and look under the "OUTPUT ENVIRONMENT VARIABLES" section.
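For instance, here is a minimal sketch of a batch script (the job name, task count, and printed labels are illustrative) that reports several of these variables:
```bash
#!/bin/bash
#SBATCH --job-name=env-demo
#SBATCH --ntasks=4

# Variables set once for the whole allocation
echo "Job ID:      ${SLURM_JOB_ID}"
echo "Job name:    ${SLURM_JOB_NAME}"
echo "Partition:   ${SLURM_JOB_PARTITION}"
echo "Node list:   ${SLURM_JOB_NODELIST}"
echo "Nodes:       ${SLURM_JOB_NUM_NODES}"
echo "Total tasks: ${SLURM_NTASKS}"

# SLURM_PROCID differs per task, so print it from within srun
srun bash -c 'echo "Task ${SLURM_PROCID} running on $(hostname)"'
```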
2. Input Environment Variables:
These can be set by users to specify default Slurm options for their jobs. They override options set inside a batch script but are themselves overridden by command-line options. Examples include:
- `SBATCH_ACCOUNT`: Specifies the account to charge for job execution.
- `SBATCH_PARTITION`: Default partition for the job.
- `SBATCH_TIMELIMIT`: Sets the wall-clock time limit (same as `--time`).
- `SBATCH_QOS`: Specifies the Quality of Service for the job.
Again, you can find the complete list by looking at the Slurm man pages as mentioned above.
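As a sketch (the account, partition, and QOS names below are placeholders; substitute values valid on your cluster), defaults could be set for subsequent submissions like this:
```bash
# Defaults applied to any sbatch submission made from this shell session;
# the account, partition, and QOS names are placeholders.
export SBATCH_ACCOUNT=myproject
export SBATCH_PARTITION=compute
export SBATCH_TIMELIMIT=01:00:00
export SBATCH_QOS=normal

# This job inherits the defaults unless the script or command line overrides them
sbatch myjob.sh
```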
3. Command Customization Variables:
Slurm allows users to customize the behavior of commands and their outputs by setting certain environment variables. For example:
- `SQUEUE_FORMAT`: This environment variable can be set to define a custom format for `squeue` command output. It must be set in the environment from which `squeue` is invoked. For example, to display job ID, partition, name, user, state, time, and nodes, you might set it like this:
```bash
export SQUEUE_FORMAT="%.18i %.9P %.8j %.8u %.2t %.9M %.9l %.6D"
```
- `SLURM_CONF`: Points to the location of the Slurm configuration file.
- `SBATCH_EXPORT`: Controls which environment variables are exported to the job’s environment.
Remember to export these variables in your shell configuration file (such as `.bashrc` or `.bash_profile`) if you want them to persist across sessions, or set them inline before a Slurm command to apply them to a single invocation, as in the sketch below.
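A minimal sketch of both approaches, reusing the `squeue` format string from above:
```bash
# Persist across sessions by appending to your shell configuration file
echo 'export SQUEUE_FORMAT="%.18i %.9P %.8j %.8u %.2t %.9M %.9l %.6D"' >> ~/.bashrc

# Or set the variable for a single command invocation only
SQUEUE_FORMAT="%.18i %.9P %.8j %.8u %.2t %.9M %.9l %.6D" squeue -u "$USER"
```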
COMMONLY USED ENVIRONMENT VARIABLES
| Info | Slurm | Notes |
|---|---|---|
| Job name | `$SLURM_JOB_NAME` | |
| Job ID | `$SLURM_JOB_ID` | |
| Submit directory | `$SLURM_SUBMIT_DIR` | Slurm jobs start from the submit directory by default. |
| Submit host | `$SLURM_SUBMIT_HOST` | |
| Node list | `$SLURM_JOB_NODELIST` | The Slurm variable has a different format to the PBS one. To get a list of nodes use: `scontrol show hostnames $SLURM_JOB_NODELIST` |
| Job array index | `$SLURM_ARRAY_TASK_ID` | |
| Queue name | `$SLURM_JOB_PARTITION` | |
| Number of nodes allocated | `$SLURM_JOB_NUM_NODES` | |
| Number of processes | `$SLURM_NTASKS` | |
| Number of processes per node | `$SLURM_TASKS_PER_NODE` | |
| Requested tasks per node | `$SLURM_NTASKS_PER_NODE` | |
| Requested CPUs per task | `$SLURM_CPUS_PER_TASK` | |
| Scheduling priority | `$SLURM_PRIO_PROCESS` | |
| Job user | `$SLURM_JOB_USER` | |
| Hostname | `$HOSTNAME` == `$SLURM_SUBMIT_HOST` | Unless a shell is invoked on an allocated resource, the `HOSTNAME` variable is propagated (copied) from the submit machine, so it will be the same on all allocated nodes. |
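As a brief sketch of how a couple of these variables might be used inside a job script (the file name `hostfile.txt` is arbitrary):
```bash
# Expand the compact node list (e.g. node[01-04]) into one hostname per line
scontrol show hostnames "$SLURM_JOB_NODELIST" > hostfile.txt

# Jobs already start in the submit directory, but it can be referenced explicitly
cd "$SLURM_SUBMIT_DIR"
```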
GPUs
Jobs can request GPUs with the job submission option `--partition=gpu` and a count option from the table below. All counts can be given as `gputype:number` or just a number (the default GPU type will be used). Available GPU types can be found with the command `sinfo -O gres -p <partition>`. GPUs can be requested in both batch and interactive jobs.
| Description | Slurm directive (`#SBATCH` or `srun` option) | Example |
|---|---|---|
| GPUs per node | `--gpus-per-node=<gputype:number>` | `--gpus-per-node=2` or `--gpus-per-node=v100:2` |
| GPUs per job | `--gpus=<gputype:number>` | `--gpus=2` or `--gpus=v100:2` |
| GPUs per socket | `--gpus-per-socket=<gputype:number>` | `--gpus-per-socket=2` or `--gpus-per-socket=v100:2` |
| GPUs per task | `--gpus-per-task=<gputype:number>` | `--gpus-per-task=2` or `--gpus-per-task=v100:2` |
| CPUs required per GPU | `--cpus-per-gpu=<number>` | `--cpus-per-gpu=4` |
| Memory per GPU | `--mem-per-gpu=<number>` | `--mem-per-gpu=1000m` |
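Putting these together, a minimal GPU batch script might look like the following sketch (the partition name, resource counts, and application path are examples only; check `sinfo -O gres -p <partition>` for the GPU types available on your cluster):
```bash
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gpus-per-node=2
#SBATCH --cpus-per-gpu=4
#SBATCH --mem-per-gpu=1000m
#SBATCH --time=01:00:00

# ./my_gpu_application is a placeholder for your own executable
srun ./my_gpu_application
```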