For example, suppose we want to run a job that:

  • Uses 1 node

  • Runs a single-process application

  • Has a maximum runtime of 100 hours

  • Is named "MyHelloBatch"

  • Sends email notifications to the user when the job starts, stops, or aborts

    Example 1: job running on a single node

    Code Block
    #!/bin/bash
    #MyHelloBatch.slurm
    #
    #SBATCH -J test                           # Job name, any string
    #SBATCH -o job.%j.out                     # Name of stdout output file (%j=jobId)
    #SBATCH -N 1                              # Total number of nodes requested
    #SBATCH -n 8                              # Total number of cpu requested
    #SBATCH -t 01:30:00                       # Run time (hh:mm:ss) - 1.5 hours
    #SBATCH --mail-user=UNI@cumc.columbia.edu # use only Columbia address
    #SBATCH --mail-type=ALL                   # send email alert on all events
     
    module load anaconda/3.0                  # load the appropriate module(s) needed by
    python hello.py                           # your program

A submission script begins with #!/bin/bash, indicating it's a Linux bash script. Comments start with #, while #SBATCH lines specify job scheduling resources for SLURM. Note that #SBATCH directives must be placed at the top of the script, before any other commands. The script requests resources, such as:

#SBATCH -N n or #SBATCH --nodes=n : specifies the number of compute nodes (only 1 in this case)

#SBATCH -n n or #SBATCH --ntasks=n : specifies the total number of tasks (CPU cores) requested

#SBATCH -t T or #SBATCH --time=T: sets the maximum walltime (hh:mm:ss format)

#SBATCH -J "name" or #SBATCH --job-name="name": assigns a job name

#SBATCH --mail-user=<email_address>: sets the email address that receives job notifications

#SBATCH --mail-type=<type>: selects which events trigger notifications (BEGIN, END, FAIL, REQUEUE, or ALL; see the sketch below)
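For instance, to be notified only when a job ends or fails, the two mail directives can be combined as follows (--mail-type accepts a comma-separated list; the address shown is the placeholder used in the examples above):

Code Block
#SBATCH --mail-user=UNI@cumc.columbia.edu # use only Columbia address
#SBATCH --mail-type=END,FAIL              # email on job completion or failure only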

The script's final section is a standard Linux bash script that carries out the job's operations. By default, the job starts in the directory from which it was submitted, with the same environment variables as the user. In this example, the script simply runs python hello.py.
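As a minimal sketch of the submission workflow (assuming the script above is saved as MyHelloBatch.slurm with hello.py in the same directory; sbatch and squeue are standard SLURM commands):

Code Block
# create a minimal hello.py for the job to run (illustrative only)
echo 'print("Hello from the cluster")' > hello.py

# submit the script; sbatch prints the assigned job ID
sbatch MyHelloBatch.slurm

# list your queued and running jobs
squeue -u $USER

# after the job finishes, stdout is in job.<jobId>.out (set by #SBATCH -o)
cat job.*.out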

Example 2: job running on multiple nodes

To execute an MPI application across multiple nodes, we need to modify the submission script to request additional resources and specify the MPI execution command:

Code Block
#!/bin/bash
#MyHelloBatch.slurm
#
#SBATCH -J test                           # Job name, any string
#SBATCH -o job.%j.out                     # Name of stdout output file (%j=jobId)
#SBATCH -N 2                              # Total number of nodes requested
#SBATCH --ntasks-per-node=16              # set the number of tasks (processes) per node
#SBATCH -t 01:30:00                       # Run time (hh:mm:ss) - 1.5 hours
#SBATCH -p highmem                        # Queue name. Specify gpu for the GPU node.
#SBATCH --mail-user=UNI@cumc.columbia.edu # use only Columbia address
#SBATCH --mail-type=ALL                   # send email alert on all events
 
module load openmpi4/4.1.1                # load the appropriate module(s) needed by
mpirun myMPICode                          # your program

The multi-node script is similar to the single-node one; the key addition is #SBATCH --ntasks-per-node=m, which reserves m cores on each of the N requested nodes, giving N × m MPI processes in total (2 × 16 = 32 here).
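As a rough sketch of how the directives determine the MPI process count (assuming, as the mpirun line above implies, that the loaded Open MPI module is SLURM-aware):

Code Block
# 2 nodes x 16 tasks per node = 32 MPI ranks in total
#SBATCH -N 2
#SBATCH --ntasks-per-node=16

# SLURM-aware Open MPI reads the allocation, so no process count is
# needed here; this behaves like: mpirun -np 32 myMPICode
mpirun myMPICode

# if the MPI library supports SLURM's PMI, srun can launch the ranks directly:
# srun myMPICode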