Overview
In response to ever-increasing demands for lower-latency, higher-throughput file services to support modern AI/ML workloads, we have added an all-NVMe WekaFS tier to our storage portfolio. This highest-performance parallel filesystem is directly attached to our new HPC cluster and serves I/O over HDR200/100 InfiniBand. The existing Isilon NAS now serves exclusively as the campus-wide SMB service for labs and desktops.
All storage systems, lab instruments, and collaborating institutions are connected via Globus, the de facto standard data-sharing platform among premier universities, research labs, and national data repositories. Globus provides secure, efficient data movement between any Globus-connected endpoints, whether in the same room or across the globe, from the convenience of a web browser. This allows users to plan and move their data between the Weka and Isilon systems, or anywhere else, in preparation for running their HPC jobs.
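For scripted or recurring transfers, the same moves can also be driven from the Globus Python SDK (globus_sdk) instead of the web browser. The sketch below is illustrative only: the client ID, the endpoint UUIDs for the Weka and Isilon collections, and the paths are placeholders to be replaced with the values shown in the Globus web app and your group's directory layout.

```python
# Minimal sketch: stage a dataset from Isilon to the Weka groups tier via Globus.
# CLIENT_ID, endpoint UUIDs, and paths are placeholders, not real values.
import globus_sdk

CLIENT_ID = "YOUR-NATIVE-APP-CLIENT-ID"                   # register at developers.globus.org
ISILON_ENDPOINT = "11111111-2222-3333-4444-555555555555"  # placeholder UUID
WEKA_ENDPOINT = "66666666-7777-8888-9999-000000000000"    # placeholder UUID

# One-time interactive login to obtain a transfer token.
auth_client = globus_sdk.NativeAppAuthClient(CLIENT_ID)
auth_client.oauth2_start_flow()
print("Please log in at:", auth_client.oauth2_get_authorize_url())
code = input("Paste the authorization code here: ").strip()
tokens = auth_client.oauth2_exchange_code_for_tokens(code)
transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token)
)

# Describe the transfer: source endpoint, destination endpoint, and paths.
tdata = globus_sdk.TransferData(
    tc, ISILON_ENDPOINT, WEKA_ENDPOINT, label="stage inputs for HPC job"
)
tdata.add_item("/archive/PI_gp/memberUNI/dataset/",
               "/groups/PI_gp/memberUNI/dataset/",
               recursive=True)

task = tc.submit_transfer(tdata)
print("Submitted Globus transfer, task ID:", task["task_id"])
```

Once submitted, the transfer runs server-side; its progress can be monitored in the Globus web app or with the returned task ID.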
What’s NEW in storage?
Storage Tiers | C2B2 Cluster | New Cluster
---|---|---
home | home/PI_gp/memberUNI | users/memberUNI
data | data/PI_gp/memberUNI | groups/PI_gp/memberUNI
archive | archive/PI_gp/memberUNI | archive/PI_gp/memberUNI
Users (Previously home)
Exclusive storage space of 20 GB assigned to each C2B2 Cluster user.
Used for small input files, source code, executables, software builds, etc.
POSIX/NFS shared
Backed up nightly.
Groups (Previously data)
Shared space of 1 TB allocated by default to each PI group.
General space for group software, common data, etc.
POSIX/NFS shared
Backed up weekly.
Archive
Shared space for each group.
PIs can purchase spinning-disk storage to keep their data long term.
SMB-mounted only; also accessible via Globus.
Backed up monthly to a separate S3 object storage system (in the same building, on a different floor).
NOTE:
A scratch tier is not offered on Weka. Instead, 100 TB of NVMe scratch space is available in total, distributed as 1.6 TB of node-local storage per compute node. This is the fastest storage on the cluster; it is local to the node, shared by all users of that node, has no quotas enforced, is cleaned up when the running job terminates, and is free of cost (see the staging sketch after these notes).
Groups and Archive tier storage are available in capacity increments of 1 TB.
Pricing details for each of the tiers will be shared once the New Cluster is launched.
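The node-local scratch space is best used as a fast working area within a job. The sketch below shows one way to stage data through it; the scratch location (assumed here to be exposed via the TMPDIR environment variable, falling back to /tmp), the group-tier paths, and the analysis command are all placeholders, so substitute whatever the new cluster's documentation specifies.

```python
# Minimal sketch: stage inputs to node-local NVMe scratch, compute, copy results back.
# TMPDIR, paths, and the analysis command are assumptions for illustration only.
import os
import shutil
import subprocess
from pathlib import Path

group_data = Path("/groups/PI_gp/memberUNI/dataset")        # placeholder group-tier input
scratch = Path(os.environ.get("TMPDIR", "/tmp")) / "myjob"  # assumed node-local scratch root

# Stage inputs onto the fast node-local NVMe.
scratch.mkdir(parents=True, exist_ok=True)
shutil.copytree(group_data, scratch / "inputs", dirs_exist_ok=True)

# Run the compute step against the local copy (command is a placeholder).
(scratch / "results").mkdir(exist_ok=True)
subprocess.run(
    ["./my_analysis",
     "--input", str(scratch / "inputs"),
     "--output", str(scratch / "results")],
    check=True,
)

# Copy results back to the group tier before the job ends;
# node-local scratch is purged when the job terminates.
shutil.copytree(scratch / "results", group_data.parent / "results", dirs_exist_ok=True)
```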
For any questions or comments, please email dsbit_help@cumc.columbia.edu.