
This guide provides an introduction to the Slurm job scheduler and its use on the Ganesha c2b2 cluster.

...

The multi-node script is similar to the single-node one, with the key addition of #SBATCH --ntasks-per-node=m to reserve cores and enable MPI parallel processing.
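As a rough sketch, a multi-node submission script might look like the following; the node and task counts, module name, and program name are placeholders and should be adjusted for your job.

Code Block
#!/bin/bash

##Resource Request

#SBATCH --job-name MultiNodeJob
#SBATCH --output result-%j.out      ## output filename; %j expands to the job ID
#SBATCH --nodes=2                   ## number of nodes to reserve
#SBATCH --ntasks-per-node=4         ## tasks (cores) to reserve on each node for MPI
#SBATCH --mem=4G                    ## memory per node
#SBATCH --time=0-00:10:00           ## time for analysis (day-hour:min:sec)

##Load an MPI module (placeholder name; check `module avail` for the exact module)
module load openmpi

## Launch one MPI rank per reserved task across the allocated nodes
srun ./my_mpi_program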

Interactive Jobs

An interactive job provides a command-line session, allowing users to interact with the application or debug issues in real time rather than simply running a script. To submit an interactive job, use the salloc command in Slurm. Once the job begins, you'll gain access to a command-line prompt on one of the assigned compute nodes, enabling you to execute commands and use the allocated resources directly on that node.
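For example, the following sketch requests an interactive session with the default resources and runs a command on the assigned compute node:

Code Block
## Request an interactive allocation with the default resources
salloc

## Once the prompt appears on the compute node, run commands as usual
hostname

## Release the allocation when finished
exit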

...

By default, jobs submitted through salloc will be allocated 1 CPU and 4GB of memory, unless specified otherwise. If your job requires more resources, you can request them using additional options with the salloc command. For instance, the following example demonstrates how to allocate 2 nodes, each with 4 CPUs and 4GB of memory.
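One way to express such a request is sketched below; whether memory is given per node or per CPU (--mem vs. --mem-per-cpu) depends on your workload, so adjust the options as needed.

Code Block
## Sketch: request 2 nodes, 4 CPUs (tasks) per node, and 4GB of memory per node
salloc --nodes=2 --ntasks-per-node=4 --mem=4G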

...

GPU Jobs

GPUs will not be assigned to jobs unless explicitly requested using specific options with sbatch or srun during the resource allocation process.

| Option | Explanation |
| --- | --- |
| --gres | Generic resources required per node |
| --gpus | GPUs required per job |
| --gpus-per-node | GPUs required per node. Equal to the --gres option for GPUs. |
| --gpus-per-socket | GPUs required per socket. Requires the job to specify a sockets-per-node count. |
| --gpus-per-task | GPUs required per task. Requires the job to specify a task count. This is the recommended option for GPU jobs. |
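For example, a quick interactive check of an allocated GPU could be run with srun (a sketch; the gpu partition name is taken from the batch example below):

Code Block
## Request one GPU for a single task and print the GPU information with nvidia-smi
srun --partition=gpu --ntasks=1 --gpus-per-task=1 nvidia-smi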

A simple example that uses a GPU and prints GPU information is shown below. You can download the source file stats.cu attached to this page.

Code Block
#!/bin/bash

##Resource Request

#SBATCH --job-name CudaJob
#SBATCH --output result.out      ## output filename; %j in a filename expands to the job ID; default is slurm-[jobID].out
#SBATCH --partition=gpu          ## the partition(s) to run in (comma separated)
#SBATCH --ntasks=1               ## number of tasks (analyses) to run
#SBATCH --gpus-per-task=1        ## number of GPUs per task
#SBATCH --mem-per-gpu=100M       ## memory allocated per GPU
#SBATCH --time=0-00:10:00        ## time for analysis (day-hour:min:sec)

##Load the CUDA module
module load cuda

##Compile the cuda script using the nvcc compiler
nvcc -o stats stats.cu

## Run the compiled program on the allocated GPU
srun stats
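
The contents of the attached stats.cu are not reproduced above; as a rough sketch (not necessarily identical to the attached file), a CUDA program that prints basic GPU information could look like this:

Code Block
// stats.cu -- minimal sketch: query and print properties of the visible GPUs
// (illustrative only; the attached file may differ)
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < deviceCount; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        std::printf("Device %d: %s\n", i, prop.name);
        std::printf("  Compute capability: %d.%d\n", prop.major, prop.minor);
        std::printf("  Global memory:      %.1f GB\n",
                    prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
        std::printf("  Multiprocessors:    %d\n", prop.multiProcessorCount);
    }
    return 0;
}

Such a file would be compiled and run exactly as in the script above (nvcc -o stats stats.cu, then srun stats).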