In order for the scripts in these examples to work, you will need to replace <ACCOUNT> with your group's account name.

...

For more discussion of SBATCH directives in submission scripts, their syntax, and options, see the section about submitting jobs.

Examples of some more advanced scripts follow below.

...

If --partition=ocp_gpu is omitted, the scheduler will by default request any available GPU across the cluster.
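For example, to request that partition explicitly, the directives in your submit script might look like the following (a minimal sketch; combine it with the GPU request flags described in the GPU/CUDA section):

Code Block
#SBATCH --partition=ocp_gpu   # pin the job to the ocp_gpu partition
#SBATCH --gres=gpu:1          # request one GPU (assumption: a single GPU is sufficient)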

Singularity 

Singularity is a software tool that brings Docker-like containers and reproducibility to scientific computing and HPC. Singularity has Docker container support and enables users to easily run different flavors of Linux with different software stacks. These containers provide a single universal on-ramp from the laptop, to HPC, to cloud.

Users can run Singularity containers just as they run any other program on our HPC clusters. Example usage of Singularity is listed below. For additional details on how to use Singularity, please contact us or refer to the Singularity User Guide.

Downloading Pre-Built Containers

Singularity makes it easy to quickly deploy and use software stacks or new versions of software. Since Singularity has Docker support, users can simply pull existing Docker images from Docker Hub or download Docker images directly from software repositories that increasingly support the Docker format. The Singularity Container Library also provides a number of additional containers.


You can use the pull command to download pre-built images from an external resource into your current working directory. The docker:// URI can be used to pull Docker images. Pulled Docker images will be automatically converted to the Singularity container format.

...

Here's an example of pulling the latest stable release of the Tensorflow Docker image and running it with Singularity. (Note: these pre-built versions may not be optimized for use with our CPUs.)
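As a minimal sketch (the exact image tag and resulting filename may differ), the pull and run steps look like this:

Code Block
$ singularity pull docker://tensorflow/tensorflow:latest   # produces tensorflow_latest.sif
$ singularity run tensorflow_latest.sif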

...

Singularity - Interactive Shell 

The shell command allows you to spawn a new shell within your container and interact with it as though it were a small virtual machine:

...
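For example, assuming the tensorflow.sif image pulled above (a minimal sketch):

Code Block
$ singularity shell tensorflow.sif
Singularity>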

Code Block
Singularity> python
>>> import tensorflow as tf
>>> print(tf.__version__)
2.4.1
>>> exit()


When done, you may exit the Singularity interactive shell with the "exit" command.


Singularity> exit

Singularity: Executing Commands

The exec command allows you to execute a custom command within a container by specifying the image file. This is the way to invoke commands in your job submission script.
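For example, assuming the tensorflow.sif image from above:

Code Block
$ singularity exec tensorflow.sif python -c 'import tensorflow as tf; print(tf.__version__)'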

...

Singularity: Running a Batch Job

Below is an example of a job submission script named submit.sh that runs Singularity. Note that you may need to specify the full path to the Singularity image you wish to run.


Code Block
#!/bin/bash
# Singularity example submit script for Slurm.
#
# Replace <ACCOUNT> with your account name before submitting.
#
#SBATCH -A <ACCOUNT>           # Set Account name
#SBATCH --job-name=tensorflow  # The job name
#SBATCH -c 1                   # Number of cores
#SBATCH -t 0-0:30              # Runtime in D-HH:MM
#SBATCH --mem-per-cpu=5gb      # Memory per cpu core

module load singularity
singularity exec tensorflow.sif python -c 'import tensorflow as tf; print(tf.__version__)'


Then submit the job to the scheduler. This example prints out the tensorflow version.

$ sbatch submit.sh

To run a similar job accessing a GPU, you would need to make the following changes. We will call this script "submit-GPU.sh".

(NOTE: read the section about GPU/CUDA jobs for additional information about GPU prerequisites. This is only a sample template.)

...
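A minimal sketch of what submit-GPU.sh might look like (the GPU count is an assumption; adjust it and the other directives according to the GPU/CUDA section):

Code Block
#!/bin/bash
# Singularity GPU example submit script for Slurm (sketch only).
#
# Replace <ACCOUNT> with your account name before submitting.
#
#SBATCH -A <ACCOUNT>               # Set Account name
#SBATCH --job-name=tensorflow-gpu  # The job name
#SBATCH -c 1                       # Number of cores
#SBATCH -t 0-0:30                  # Runtime in D-HH:MM
#SBATCH --mem-per-cpu=5gb          # Memory per cpu core
#SBATCH --gres=gpu:1               # Request one GPU (assumption)

module load singularity
# --nv exposes the host NVIDIA driver and GPU devices inside the container
singularity exec --nv tensorflow.sif python -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'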

Note that without --nv on the singularity command line, the container will not have GPU access.

Using MAKER in a Singularity container

MAKER is an easy-to-use genome annotation pipeline designed for small research groups with little bioinformatics experience. It has many dependencies, especially for Perl, and a container is a convenient way to have all the requirements in one place. As of Dec. 2023, the BioContainers website maintains a Singularity container for version 3.01.03. Here is a sample tutorial.

...

To use MAKER with OpenMPI, e.g., requesting 8 CPU ntasks (processes that a job executes in parallel on one or more nodes), you can use the following suggested options, which also help reduce warnings. Start an interactive session using the salloc command and increase the requested memory as needed:

salloc --ntasks=8 --account=test --mem=50GB srun -n1 -N1 --mem-per-cpu=0 --gres=NONE --pty --preserve-env --mpi=none $SHELL
module load openmpi/gcc/64/4.1.5a1 singularity
mpirun -np 2 --mca btl '^openib' --mca orte_base_help_aggregate 0 singularity run https://depot.galaxyproject.org/singularity/maker:3.01.03--pl5262h8f1cd36_2 bash -c "export LIBDIR=/usr/local/share/RepeatMasker && maker"

Additionally, samtools (used for reading/writing/editing/indexing/viewing the SAM/BAM/CRAM formats) is available in the container:

Singularity> samtools --version
samtools 1.7
Using htslib 1.7-2
Copyright (C) 2018 Genome Research Ltd.

Note: if you are testing maker and kill jobs/processes, look out for .NFSLock files, which you will likely need to delete before subsequent runs of maker. You will need to use the -a option with ls, as files that start with a dot/period are hidden from the ls command by default.
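A short sketch of the cleanup, run in the directory where maker was started (lock file names may vary):

Code Block
$ ls -a            # dot-files such as .NFSLock* are hidden without -a
$ rm -f .NFSLock*  # remove stale lock files before re-running maker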

Using GATK in a Singularity container

GATK, the Genome Analysis Toolkit, has several dependencies and can run inside a container. Here is a sample tutorial:
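As a minimal sketch, assuming the broadinstitute/gatk image on Docker Hub (the image used in the tutorial may differ):

Code Block
$ singularity pull docker://broadinstitute/gatk      # produces gatk_latest.sif
$ singularity exec gatk_latest.sif gatk --version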

...

-rw-r--r-- 1 rk3199 user 128 Nov 28 16:48 output.bai
-rw-r--r-- 1 rk3199 user 62571 Nov 28 16:48 output.bam

Using GeoChemFoam in a Singularity container

GeoChemFoam is open-source code developed at the Institute of GeoEnergy Engineering, Heriot-Watt University, based on the OpenFOAM CFD toolbox.

Choose one of the available Docker containers and use Singularity/Apptainer to 'pull' it down into .sif format.

singularity pull docker://jcmaes/geochemfoam-5.1

Some additional steps/tweaks are needed to get all of the features working. For this tutorial we assume GeoChemFoam version 5.01, and use the Test Case 01 Species transport in a Ketton Micro-CT image tutorial. You can choose your version of Anaconda Python, but note that Python 3.10 returns the following error when running the first script, createMesh.sh:

ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'

Note there are some changes in the tutorial from earlier versions of GeoChemFoam, e.g., 4.8. For this tutorial we assume the name of the Singularity container is geochemfoam-5.01_latest.sif, and we'll use an interactive job with salloc (instead of srun; see below) to request 8 --ntasks for use with OpenMPI in the Ketton tutorial.

Note that using multiple nodes with -N/--nodes=, or adding -c/--cpus-per-task in your SBATCH script, will not work and will result in an error: "An ORTE daemon has unexpectedly failed after launch and before communicating back to mpirun." Also note that an earlier version of this tutorial used srun, but the newer version of Slurm requires using salloc.

salloc --pty -t 0-08:00  --ntasks 8  --mem=10gb

...

 -A <your-

...

account> 
module load singularity/3.7.

...

Singularity offers a few ways to set/pass environment variables; we'll use SINGULARITYENV_PREPEND_PATH, as the container needs to know where the salloc command is.

export SINGULARITYENV_PREPEND_PATH=$PATH
singularity shell --bind /path/to/your-directory/runs:/home/gcfoam/works/GeoChemFoam-5.01/runs:rw geochemfoam-5.01_latest.sif

--bind connects whatever is to the left of the colon, in this case a new directory called 'runs', to the path on the right side of the colon, which exists in the container. rw means read/write.

...

export HOME=/home/gcfoam

Copy the multiSpeciesTransportFoam directory from /home/gcfoam/works/GeoChemFoam-5.01/tutorials/transport to /path/to/your-directory/runs/, e.g.:

cp -a /home/gcfoam/works/GeoChemFoam-5.01/tutorials/transport /path/to/your-directory/runs/

...

/usr/bin/pip3 install -t /home/gcfoam/works/GeoChemFoam-5.01/runs matplotlib numpy scikit-image numpy-stl h5py

...

export PYTHONPATH=/home/gcfoam/works/GeoChemFoam-5.01/runs
export MPLCONFIGDIR=/home/gcfoam/works/GeoChemFoam-5.01/runs

Source the .bashrc file in the container:

source $HOME/works/GeoChemFoam-5.01/etc/bashrc

Run the scripts in the tutorial in your ~/runs directory.

...

To avoid a couple of warnings from OpenMPI, e.g., "WARNING: There was an error initializing an OpenFabrics device.", you can edit the scripts that call mpirun (using vi) and add the following options to the mpirun command: --mca btl '^openib' --mca orte_base_help_aggregate 0

...

The next two scripts also use mpirun, to which you can likewise add --mca btl '^openib' --mca orte_base_help_aggregate 0

...

For additional details on how to use Singularity, please contact us or refer to the Singularity User Guide.

Using Couenne in a Singularity container

Couenne (Convex Over and Under ENvelopes for Nonlinear Estimation) is a branch-and-bound algorithm to solve Mixed-Integer Nonlinear Programming (MINLP) problems. It includes a suite of programs with several dependencies. Fortunately, there is a Docker container that can be used to access these programs, e.g., bonmin, couenne, Ipopt, Cgl, and Cbc, via Singularity. You can use these sample .nl files to test with Couenne.


singularity pull docker://coinor/coin-or-optimization-suite 
singularity shell coin-or-optimization-suite_latest.sif
Singularity> couenne hs015.nl 
Couenne 0.5 -- an Open-Source solver for Mixed Integer Nonlinear Optimization
Mailing list: couenne@list.coin-or.org
Instructions: http://www.coin-or.org/Couenne

NLP0012I 
              Num      Status      Obj             It       time                 Location
NLP0014I             1         OPT 306.49998       22 0.004007
Couenne: new cutoff value 3.0649997900e+02 (0.009883 seconds)
Loaded instance "hs015.nl"
Constraints:            2
Variables:              2 (0 integer)
Auxiliaries:            8 (0 integer)

Coin0506I Presolve 29 (-1) rows, 9 (-1) columns and 64 (-2) elements
Clp0006I 0  Obj 0.25 Primal inf 473.75936 (14)
Clp0006I 13  Obj 0.31728151
Clp0000I Optimal - objective value 0.31728151
Clp0032I Optimal objective 0.3172815065 - 13 iterations time 0.002, Presolve 0.00
Clp0000I Optimal - objective value 0.31728151
Cbc0012I Integer solution of 306.49998 found by Couenne Rounding NLP after 0 iterations and 0 nodes (0.00 seconds)
NLP Heuristic: NLP0014I             2         OPT 306.49998        5 0.001228
solution found, obj. 306.5
Clp0000I Optimal - objective value 0.31728151
Optimality Based BT: 3 improved bounds
Probing: 2 improved bounds
Cbc0031I 1 added rows had average density of 2
Cbc0013I At root node, 4 cuts changed objective from 0.31728151 to 306.49998 in 1 passes
Cbc0014I Cut generator 0 (Couenne convexifier cuts) - 4 row cuts average 2.0 elements, 3 column cuts (3 active)
Cbc0001I Search completed - best objective 306.4999790004336, took 0 iterations and 0 nodes (0.00 seconds)
Cbc0035I Maximum depth 0, 0 variables fixed on reduced cost

couenne: Optimal

     "Finished"

Linearization cuts added at root node:         30
Linearization cuts added in total:             30  (separation time: 2.4e-05s)
Total solve time:                        0.003242s (0.003242s in branch-and-bound)
Lower bound:                                306.5
Upper bound:                                306.5  (gap: 0.00%)
Branch-and-bound nodes:                         0
Performance of                           FBBT:        2.8e-05s,        4 runs. fix:          0 shrnk: 0.000103838 ubd:       2.75 2ubd:        0.5 infeas:          0
Performance of                           OBBT:       0.000742s,        1 runs. fix:          0 shrnk:    6.70203 ubd:          0 2ubd:          0 infeas:          0

...

Anaconda Python makes it easy to install Tensorflow, enabling your data science, machine learning, and artificial intelligence workflows.

https://docs.anaconda.com/anaconda/user-guide/tasks/tensorflow/

Tensorflow 

First, load the Anaconda Python module.

$ module load anaconda

You may need to run "conda init bash" to initialize your conda shell.
$ conda init bash
==> For changes to take effect, close and re-open your current shell. <==

To install the current release of CPU-only TensorFlow:

$ conda create -n tf tensorflow
$ conda activate tf

...

$ python
>>> import tensorflow as tf
>>> print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))


NetCDF

NetCDF (Network Common Data Form) is an interface for array-oriented data access and a library that provides an implementation of the interface. The NetCDF library also defines a machine-independent format for representing scientific data. Together, the interface, library, and format support the creation, access, and sharing of scientific data. 

To load the NetCDF Fortran Intel module:

...

(For more about using GPUs, see the GPU section of the documentation)


8. Get rid of the XDG_RUNTIME_DIR environment variable.
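One way to do this in your shell on the compute node (either form works):

Code Block
$ unset XDG_RUNTIME_DIR      # or: export XDG_RUNTIME_DIR=""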

...

13. From your local system, open a second connection to Ginsburg that forwards a local port to the remote node and port. Replace UNI below with your uni.

Code Block
$ ssh -f -L 8080:10.43.4.206:8888 -N UNI@burg.rcs.columbia.edu

(This is not for Windows users. Windows users, see step 13B, below.)

...

13B. Windows users generally use PuTTY rather than a native command line, so the step 13 instructions, which use port forwarding, may be particularly hard to replicate. To accomplish step 13 while using PuTTY, do the following:

I.   Open PuTTY.
II.  In the "Session" category on the left side, enter the hostname or IP address of the remote server in the "Host Name (or IP address)" field. (In this case - burg.rcs.columbia.edu).



III.  Make sure the connection type is set to SSH.



IV.  In the "Connection" category, expand the "SSH" tab and select "Tunnels".
V.  In the "Source port" field, enter 8080.
VI. In the "Destination" field, enter 10.43.4.206:8888 (Remember, this is only an example IP; the one you use will be different.)
VII. Make sure the "Local" radio button is selected.
VIII. Click the "Add" button to add the port forwarding rule to the list.
IX.  Now, return to the "Session" category on the left side.
X.  Optionally, enter a name for this configuration in the "Saved Sessions" field, then
XI.  Click "Save" to save these settings for future use.



XII. Click "Open" to start the SSH connection with the port forwarding configured.


14. Open a browser session on your desktop and enter the URL 'localhost:8080' (i.e. the string within the single quotes) into its search field. You should now see the notebook.

...

JAX is a Just-In-Time (JIT) compiler focused on harnessing the maximum number of FLOPs to generate optimized code while keeping the simplicity of pure Python. It is frequently updated, with strict minimum version requirements for CUDA and cuDNN. The following combination of modules and libraries will work; make sure to request a GPU node:

Code Block
module load anaconda/3-2022.05 cuda11.8/toolkit/11.8.0 cudnn8.6-cuda11.8/8.6.0.163
pip install -U jax[cuda112]==0.4.7 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html https://storage.googleapis.com/jax-releases/cuda11/jaxlib-0.4.7+cuda11.cudnn86-cp39-cp39-manylinux2014_x86_64.whl
python
Python 3.9.12 (main, Apr 5 2022, 06:56:58) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from jax.lib import xla_bridge
>>> print(xla_bridge.get_backend().platform)
gpu

For a newer version of Python, use the following:

Code Block
$ ml anaconda/3-2023.09
$ pip install -U --user "jax[cuda12]"
$ python
Python 3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from jax.lib import xla_bridge
>>> print(xla_bridge.get_backend().platform)
gpu
>>> quit
Use quit() or Ctrl-D (i.e. EOF) to exit
>>> quit()
$ module list

Currently Loaded Modules:
  1) shared   2) DefaultModules   3) slurm/21.08.8   4) anaconda/3-2023.09   5) cuda12.0/toolkit/12.0.1  6) cudnn8.6-cuda11.8/8.6.0.163

 

ef2758@g097:~$ 

Best Practice for running LS-Dyna with MPI

By default, an MPI process migrates between cores as the OS manages resources and attempts to get the best load balance on the system. But because LS-DYNA is a memory-intensive application, such migration can significantly degrade performance, since memory access takes longer if the process is moved to a core farther from the memory it is using. To avoid this performance degradation, it is important to bind each MPI process to a core. Each MPI implementation has its own way of binding processes to cores, and furthermore, threaded MPP (HYBRID) employs a different strategy from pure MPP.

To bind processes to cores, include the following MPI execution line directives according to the type of MPI used (an Open MPI example follows the list).

HP-MPI, Platform MPI, and IBM Platform MPI:

-cpu_bind or -cpu_bind=rank
-cpu_bind=MAP_CPU:0,1,2,... <<<< not recommended unless user really needs to bind MPI processes to specific cores

IBM Platform MPI 9.1.4 and later:

-affcycle=numa

Intel MPI:

-genv I_MPI_PIN_DOMAIN=core

Open MPI:

--bind-to numa
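For example, with Open MPI the flag goes directly on the mpirun line in your submit script (a sketch; the executable name and input deck are placeholders):

Code Block
# Bind each MPI rank to a NUMA domain; mpp_dyna and input.k are placeholder names
mpirun --bind-to numa -np $SLURM_NTASKS mpp_dyna i=input.k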

Running ACOLITE, atmospheric correction algorithms for aquatic applications of various satellite missions developed at RBINS

Here is a tutorial on how to get ACOLITE to run within a Python session. Note the need for XQuartz on a Mac or an X-Windows program like MobaXterm on Windows. Also note that changing the versions of GDAL and Anaconda Python will likely cause errors and the GUI will not open.

Code Block
git clone https://github.com/acolite/acolite
cd acolite
module load anaconda/3-2022.05 gdal/3.3.0 libRadtran/2.0.5

...