In order for the scripts in these examples to work, you will need to replace <ACCOUNT> with your group's account name.
Hello World
This script will print "Hello World", sleep for 10 seconds, and then print the time and date. The output will be written to a file in your current directory.
Code Block |
---|
#!/bin/sh
#
# Simple "Hello World" submit script for Slurm.
#
# Replace <ACCOUNT> with your account name before submitting.
#
#SBATCH --account=<ACCOUNT> # The account name for the job.
#SBATCH --job-name=HelloWorld # The job name.
#SBATCH -c 1 # The number of cpu cores to use.
#SBATCH --time=1:00 # The time the job will take to run.
#SBATCH --mem-per-cpu=1gb # The memory the job will use per cpu core.
echo "Hello World"
sleep 10
date
# End of script
|
C/C++/Fortran
To submit a precompiled binary to run on Terremoto, the script will look just as it does in the Hello World example. The difference is that you will call your executable file instead of the shell commands "echo", "sleep", and "date".
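For example, a minimal submit script along these lines might look like the sketch below, where myprogram is a placeholder for your own compiled executable and the resource requests should be adjusted to your job:
Code Block |
---|
#!/bin/sh
#
# Example submit script for a precompiled binary (myprogram is a placeholder).
#
#SBATCH --account=<ACCOUNT> # The account name for the job.
#SBATCH --job-name=MyProgram # The job name.
#SBATCH -c 1 # The number of cpu cores to use.
#SBATCH --time=5:00 # The time the job will take to run.
#SBATCH --mem-per-cpu=1gb # The memory the job will use per cpu core.
./myprogram
# End of script
|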
C/C++/Fortran MPI
Intel Parallel Studio
Terremoto supports Intel Parallel Studio, which provides a version of MPI derived from MPICH2. We encourage users to use Intel MPI because it is faster and more modern than other versions. In addition, all nodes on the cluster have InfiniBand transport, which is the fabric MPI jobs use and another reason for a substantial efficiency boost on the cluster.
To use Intel MPI, you must load the Intel module first:
Code Block |
---|
module load intel-parallel-studio/2017
mpiexec ./myprogram
|
To take advantage of the Terremoto architecture, your program should be (re)compiled on the cluster, even if you compiled it with Intel compilers on another cluster. It is important to compile with the compiler provided by the module mentioned above. Note that you may have to set additional environment variables in order to compile your program successfully.
These are the locations of the C and Fortran compilers for Intel Studio:
Code Block |
---|
$ module load intel-parallel-studio/2017
(...)
$ which mpiicc
/moto/opt/parallel_studio_xe_2017/compilers_and_libraries_2017.0.098/linux/mpi/intel64/bin/mpiicc
$ which ifort
/moto/opt/parallel_studio_xe_2017/compilers_and_libraries_2017.0.098/linux/bin/intel64/ifort
|
For programs written in C, compile with mpiicc:
Code Block |
---|
$ mpiicc -o <MPI_OUTFILE> <MPI_INFILE.c>
|
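For Fortran MPI programs, the corresponding Intel wrapper is mpiifort (it drives the ifort compiler shown above); a minimal sketch, with the input file name as a placeholder:
Code Block |
---|
$ mpiifort -o <MPI_OUTFILE> <MPI_INFILE.f90>
|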
The submit script below, named pi_mpi.sh, assumes that you have compiled a simple MPI program used to compute pi (see mpi_test.c) and created a binary called pi_mpi:
Code Block |
---|
#!/bin/sh
#SBATCH -A <ACCOUNT>
#SBATCH --time=30
#SBATCH -N 2
#SBATCH --exclusive
module load intel-parallel-studio/2017
mpiexec -bootstrap slurm ./pi_mpi
# End of script
|
The --exclusive flag ensures that full nodes are used for the runs (which is why no memory specification is given). Each available core will run one MPI task. Without the flag, you can specify the number of tasks, or tasks per node, to limit how many tasks are created. For example, you can replace the --exclusive directive with:
Code Block |
---|
#SBATCH -N 2
#SBATCH --ntasks-per-node=4
|
Your MPI code will then run with 8 tasks, 4 on each of the 2 nodes requested.
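Putting the pieces together, a sketch of the modified pi_mpi.sh with this per-node task limit would be (the --mem-per-cpu value is an assumption; since --exclusive is no longer used, you may want an explicit memory request):
Code Block |
---|
#!/bin/sh
#SBATCH -A <ACCOUNT>
#SBATCH --time=30
#SBATCH -N 2
#SBATCH --ntasks-per-node=4
#SBATCH --mem-per-cpu=1gb # assumed value; adjust to your program's needs
module load intel-parallel-studio/2017
mpiexec -bootstrap slurm ./pi_mpi
# End of script
|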
Job Submission
Code Block |
---|
$ sbatch pi_mpi.sh
|
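Once the job is submitted, you can check its state with squeue, and the job's output will by default appear in a file named slurm-<jobid>.out in the submit directory (unless you override this with the --output directive). For example:
Code Block |
---|
$ squeue -u $USER
$ cat slurm-<jobid>.out
|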
OpenMPI
Terremoto also supports OpenMPI from the GNU family.
To use OpenMPI, you must load the following module instead:
Code Block |
---|
module load openmpi/gcc/64
mpiexec myprogram
|
Your program must be compiled on the cluster. You can use the module command as explained above to set your path so that the corresponding mpicc will be found. Note that you may have to set additional environment variables in order to compile your program successfully.
Code Block |
---|
$ module load openmpi/gcc/64
$ which mpicc
/moto/opt/openmpi-2.0.1/bin/mpicc
|
Compile your program using mpicc. For programs written in C:
Code Block |
---|
$ mpicc -o <MPI_OUTFILE> <MPI_INFILE.c>
|
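A submit script for an OpenMPI binary follows the same pattern as the Intel MPI example above; a minimal sketch, assuming an executable named myprogram built with the mpicc shown here:
Code Block |
---|
#!/bin/sh
#SBATCH -A <ACCOUNT>
#SBATCH --time=30
#SBATCH -N 2
#SBATCH --exclusive
module load openmpi/gcc/64
mpiexec ./myprogram
# End of script
|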
GPU (CUDA C/C++)
The cluster includes 8 Nvidia V100 GPU servers, each with 2 GPU modules.
To use a GPU server, you must specify the --gres=gpu option in your submit request, followed by a colon and the number of GPU modules you require (with a maximum of 2 per server).
To request a V100 GPU, specify this in your submit script:
Code Block |
---|
#SBATCH --gres=gpu:1
|
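To request both GPU modules on a server, for example, the directive becomes:
Code Block |
---|
#SBATCH --gres=gpu:2
|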
Not all applications have GPU support, but some, such as MATLAB, have built-in GPU support and can be configured to use GPUs.
To build your CUDA code and run it on the GPU modules, you must first set your paths so that the Nvidia compiler can be found. Please note that you must be logged into a GPU node to access these commands. To log in interactively to a GPU node, run the following command, replacing <ACCOUNT> with your account:
Code Block |
---|
$ srun --pty -t 0-01:00 --gres=gpu:1 -A <ACCOUNT> /bin/bash
|
The cuda environment module adds CUDA to your PATH and sets related environment variables. Note that CUDA 8.0 does not support gcc 6, so gcc 5 or earlier must be accessible in your environment when running nvcc. First load a compatible gcc module:
Code Block |
---|
$ module load gcc/4.8.5
|
Load the cuda module.
Code Block |
---|
$ module load cuda92/toolkit
|
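As with the MPI compilers earlier, you can confirm that the Nvidia compiler is now on your PATH (the exact path and version reported will depend on the toolkit installed):
Code Block |
---|
$ which nvcc
$ nvcc --version
|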
Then compile your program using nvcc:
Code Block |
---|
$ nvcc -o <EXECUTABLE_NAME> <FILE_NAME.cu>
|
You can build the hello_world.cu sample code with the following command:
Code Block |
---|
$ nvcc -o hello_world hello_world.cu
|
For non-trivial code samples, refer to Nvidia's CUDA Toolkit Documentation.
A Slurm script template, gpu.sh, that can be used to submit this job is shown below:
Code Block |
---|
#!/bin/sh
#
#SBATCH --account=<ACCOUNT> # The account name for the job.
#SBATCH --job-name=HelloWorld # The job name.
#SBATCH --gres=gpu:1 # Request 1 GPU module (up to 2 per V100 server).
#SBATCH -c 1 # The number of cpu cores to use.
#SBATCH --time=1:00 # The time the job will take to run.
#SBATCH --mem-per-cpu=1gb # The memory the job will use per cpu core.
module load cuda92/toolkit
./hello_world
# End of script
|
Job submission
Code Block |
---|
$ sbatch gpu.sh
|
This program will print "Hello World!" when run on a GPU server, or "Hello Hello" when no GPU module is found.
Singularity Overview
Singularity is a software tool that brings Docker-like containers and reproducibility to scientific computing and HPC. Singularity has Docker container support and enables users to easily run different flavors of Linux with different software stacks. These containers provide a single universal on-ramp from the laptop, to HPC, to cloud.
Users can run Singularity containers just as they run any other program on our HPC clusters. Example usage of Singularity is listed below. For additional details on how to use Singularity, please contact us or refer to the Singularity User Guide.
Downloading Pre-Built Containers
Singularity makes it easy to quickly deploy and use software stacks or new versions of software. Since Singularity has Docker support, users can simply pull existing Docker images from Docker Hub or download Docker images directly from software repositories, which increasingly support the Docker format. The Singularity Container Library also provides a number of additional containers.
You can use the pull command to download pre-built images from an external resource into your current working directory. The docker:// URI can be used to pull Docker images; pulled Docker images are automatically converted to the Singularity container format.
$ singularity pull docker://godlovedc/lolcow
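The pulled image can then be executed with singularity run; for instance, assuming the download above was saved as lolcow.simg (the exact file name depends on your Singularity version):
$ singularity run lolcow.simg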
Running Singularity Containers
Here's an example of pulling the latest stable release of the Tensorflow Docker image and running it with Singularity. (Note: these pre-built versions may not be optimized for use with our CPUs.)
First, load the Singularity software into your environment with:
$ module load singularity
Then pull the Docker image. This also converts the downloaded image to the Singularity format and saves it in your current working directory:
$ singularity pull docker://tensorflow/tensorflow
Done. Container is at: ./tensorflow.simg
Once you have downloaded a container, you can run it interactively in a shell or in batch mode.
Singularity - Interactive Shell
The shell command allows you to spawn a new shell within your container and interact with it as though it were a small virtual machine:
$ singularity shell tensorflow.simg
Singularity: Invoking an interactive shell within container...
From within the Singularity shell, you will see the Singularity prompt and can run the downloaded software. In this example, python is launched and tensorflow is loaded.
Singularity tensorflow.simg:~> python
>>> import tensorflow as tf
>>> print(tf.__version__)
1.13.1
>>> exit()
When done, you may exit the Singularity interactive shell with the "exit" command.
Singularity tensorflow.simg:~> exit
Singularity: Executing Commands
The exec command allows you to execute a custom command within a container by specifying the image file. This is the way to invoke commands in your job submission script.
$ module load singularity
$ singularity exec tensorflow.simg [command]
For example, to run the python example above using the exec command:
$ singularity exec tensorflow.simg python -c 'import tensorflow as tf; print(tf.__version__)'
Singularity: Running a Batch Job
Below is an example of a job submission script, named submit.sh, that runs Singularity. Note that you may need to specify the full path to the Singularity image you wish to run.
Code Block |
---|
#!/bin/bash
# Singularity example submit script for Slurm.
#
# Replace <ACCOUNT> with your account name before submitting.
#
#SBATCH -A <ACCOUNT> # Set Account name
#SBATCH --job-name=tensorflow # The job name
#SBATCH -c 1 # Number of cores
#SBATCH -t 0-0:30 # Runtime in D-HH:MM
#SBATCH --mem-per-cpu=4gb # Memory per cpu core
module load singularity
singularity exec tensorflow.simg python -c 'import tensorflow as tf; print(tf.__version__)'
|
Job submission
Then submit the job to the scheduler. This example prints out the tensorflow version.
Code Block |
---|
$ sbatch submit.sh
|
Example of R run
For this example, the R code below is used to generate a graph, Rplot.pdf, of a discrete delta-hedging of a call. It hedges along a path and repeats over many paths. There are two R files required:
...