About HPC Cluster
The primary, general-purpose compute cluster at C2B2 is now called "Ganesha." This HPC cluster is a Linux-based (Rocky 9.4) compute cluster consisting of 62 Dell servers, 2 head nodes, a virtualized pool of login (submit) nodes, and 8 Weka nodes. The nodes fit in a dense configuration in 9 high-density racks and are cooled by dedicated rack refrigeration systems.
The cluster comprises:
20 compute nodes, each with 192-core processors and 768 GB of memory
2 nodes with 192 cores and 1.5 TB of memory
40 GPU nodes, each featuring 2 NVIDIA L40S GPU cards, 192-core processors, and 768 GB of memory
1 GPU node with a GH200 Superchip (ARM architecture), 1 GPU, and 570 GB of memory
Each node has a 25 Gbps Ethernet connection and 100 Gbps HDR InfiniBand. Additionally, a set of login nodes running on Proxmox virtualization provides a pool of virtual login nodes for user access to this and other systems.
Like our previous clusters, this system is controlled by SLURM. Storage for the cluster is provided exclusively by our Weka parallel filesystem with over 1 PB of total capacity.
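To see which partitions and node types are available to your account, you can run standard SLURM query commands from a login node. The output format string and node name below are only a sketch; the actual partition names and limits depend on the site configuration:
$ sinfo -o "%P %D %c %m %G"        # partitions, node counts, cores per node, memory, and GPUs (GRES)
$ scontrol show node <node-name>   # detailed hardware and state for one node (replace <node-name>)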
If you're experiencing issues with the cluster, please reach out to dsbit-help@cumc.columbia.edu for support. To facilitate a quick and precise response, be sure to include the following in your email:
Your Columbia University ID (UNI)
Job ID numbers (if your issue is related to a specific job)
Getting Access
In order to get access to this HPC cluster, every research group needs to establish a PI account using an MoU-SLA agreement, which can be downloaded here: DSBIT-MOU-SLA.pdf. This document provides further details about modalities, rights & responsibilities, charges, etc.
Logging In
You will need to use SSH in order to access the cluster. Windows users can use PuTTY, Cygwin, or MobaXterm. macOS users can use the built-in Terminal application.
Users log in to the cluster's login node at hpc.c2b2.columbia.edu:
$ ssh <UNI>@hpc.c2b2.columbia.edu
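Optionally, you can add a host entry to your local ~/.ssh/config so you don't have to type the full hostname and UNI each time. The alias "ganesha" below is just an example name, not an official one:
# ~/.ssh/config -- replace <UNI> with your Columbia UNI
Host ganesha
    HostName hpc.c2b2.columbia.edu
    User <UNI>
After adding this entry, "ssh ganesha" is equivalent to the full command above.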
Interactive login to Compute Node
Upon initial login to Ganesha, you'll find yourself on a login node. These nodes are meant for basic tasks like editing files or creating new directories, but not for heavy workloads. To perform more complex tasks, it's essential to transition from the login node to an interactive session on a compute node.
srun --pty -t 1:00:00 /bin/bash
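If the default allocation is not enough, for instance when you need more cores, more memory, or one of the GPU nodes, the same srun command accepts additional resource options. The partition name below is a placeholder; check sinfo for the actual partition and GRES names used on Ganesha:
srun --pty -t 1:00:00 -c 4 --mem=16G /bin/bash                    # interactive shell with 4 cores and 16 GB of memory
srun --pty -t 1:00:00 -p <gpu-partition> --gres=gpu:1 /bin/bash   # interactive shell on a GPU node (placeholder partition name)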