Overview

To keep up with the high data usage and demand in our environment, DSBIT is revamping its storage with ultra-fast, NVMe-based Weka storage.

WekaFS is a fully distributed parallel filesystem leveraging NVMe flash for file services. Integrated tiering seamlessly expands the namespace to and from HDD object storage, simplifying data management. WekaFS stands out with its unique architecture, overcoming the scaling and file-sharing limitations of legacy systems. It supports POSIX, NFS, SMB, S3, and GPUDirect Storage.

...


In response to ever-increasing requirements for lower-latency, higher-throughput file services to support modern AI/ML workloads, we have added an all-NVMe WekaFS tier to our storage profile. This high-performance parallel filesystem is directly attached to our new HPC cluster and serves I/O over HDR200/100 InfiniBand. The existing Isilon NAS now serves exclusively as the campus-wide SMB service for labs and desktops.

All storage systems, lab instruments, and collaborating institutions are connected via GLOBUS, the de-facto standard data-sharing platform among premier universities, research labs, and national data repositories. GLOBUS provides secure and efficient data movement between any two GLOBUS-connected endpoints, whether in the same room or across the globe, from the convenience of a web browser. This allows users to plan and move their data between the Weka and Isilon systems, or anywhere else, in preparation for running their HPC jobs.
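For users who prefer to script transfers rather than use the web browser, the sketch below uses the Globus Python SDK (globus-sdk). The client ID, endpoint UUIDs, and paths are placeholders, not the actual C2B2 collection IDs, and the native-app login flow shown is only one of several ways to authenticate; treat this as a minimal illustration, not an official recipe.

    import globus_sdk

    # Placeholders (assumptions for this example): substitute your own Globus
    # native-app client ID and the endpoint/collection UUIDs shown in the
    # Globus web app for the Weka and Isilon systems.
    CLIENT_ID = "YOUR-NATIVE-APP-CLIENT-ID"
    ISILON_ENDPOINT_ID = "00000000-0000-0000-0000-000000000000"
    WEKA_ENDPOINT_ID = "11111111-1111-1111-1111-111111111111"

    # Interactive native-app login to obtain a transfer token.
    auth = globus_sdk.NativeAppAuthClient(CLIENT_ID)
    auth.oauth2_start_flow(
        requested_scopes="urn:globus:auth:scope:transfer.api.globus.org:all"
    )
    print("Log in at:", auth.oauth2_get_authorize_url())
    tokens = auth.oauth2_exchange_code_for_tokens(input("Paste authorization code: "))
    transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

    tc = globus_sdk.TransferClient(
        authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token)
    )

    # Stage a dataset from the Isilon side to Weka group space ahead of a job.
    tdata = globus_sdk.TransferData(
        tc,
        source_endpoint=ISILON_ENDPOINT_ID,
        destination_endpoint=WEKA_ENDPOINT_ID,
        label="stage inputs for HPC job",
        sync_level="checksum",
    )
    tdata.add_item(
        "/archive/PI_gp/memberUNI/dataset/",   # example source path
        "/groups/PI_gp/memberUNI/dataset/",    # example destination path
        recursive=True,
    )

    task = tc.submit_transfer(tdata)
    print("Submitted Globus transfer, task id:", task["task_id"])

The same pattern, with source and destination swapped, can move results back from Weka to the Isilon/Archive side after a job finishes.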

What’s new with the new C2B2 storage setup?

Storage Tiers | C2B2 Cluster                     | New Cluster
home          | home/PI_gp/memberUNI, userUNI    | users/memberUNI, userUNI
data          | data/PI_gp/memberUNI, userUNI    | groups/PI_gp/memberUNI
scratch       | scratch/PI_gp/memberUNI          | groups/PI_gp/scratch/memberUNI, userUNI
archive       | archive/PI_gp/memberUNI, userUNI | archive/PI_gp/memberUNI, userUNI

Users (Previously home)

  • Exclusive free storage space of 50GB assigned to each C2B2 Cluster user.

  • Used for small input files, source code files, executables, software builds, etc.

  • POSIX/NFS shared

  • Backed up daily (multiple times)

Groups (Previously data)

  • Shared space of 1TB allocated by default to each PI_group.

  • General space for group software, common data, etc.

  • POSIX/NFS shared

  • Backed up weekly

Scratch

  • Shared space under Groups.

  • Temporary storage space that can be used while running jobs.

  • NFS mounted on HPC cluster only

  • Never backed up, but you can archive your files to the Archive tier.

NOTE: Files will be periodically deleted as per our autodelete policy.


Archive

  • Shared cold storage space of 1TB allocated by default for each PI_group.

  • PIs can purchase spinning-disk storage to store their data long term.

  • SMB mounted only; can also be accessed via Globus

  • Backed up monthly to a separate S3 object storage system (in the same building, on a different floor)

  • Additional storage can be requested in increments of 1TB.

Localscratch

Since all tiers on the Weka system are equally fast (faster than traditional scratch space), there is no dedicated "Scratch" tier provided on the Weka system.

  • 1.5 TB of local scratch space is provided on an NVMe drive in each node, mounted at /localscratch.

  • As the name indicates, this storage space is local to the node, is shared by all users, has no quotas enforced, is cleaned up upon termination of running jobs, and is provided free of cost.

  • Users who want to utilize this local scratch space during job execution should create /localscratch/user_uni/ at job start, pull data from the global /groups space, process the data, copy the processed data back to permanent /groups space, and delete /localscratch/user_uni/ before the job terminates (see the sketch after this list).
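A minimal Python sketch of that staging workflow is shown below. The PI_gp and user_uni path components and the dataset/results directory names are placeholders for your own group and UNI, and process() merely stands in for whatever computation the job actually runs.

    import shutil
    from pathlib import Path

    # Placeholder paths: substitute your own group (PI_gp) and UNI (user_uni).
    GROUP_DATA = Path("/groups/PI_gp/user_uni/dataset")    # permanent /groups space
    GROUP_RESULTS = Path("/groups/PI_gp/user_uni/results")  # permanent /groups space
    LOCAL = Path("/localscratch/user_uni")                   # node-local NVMe scratch

    def process(in_dir: Path, out_dir: Path) -> None:
        """Stand-in for the real computation: here it just copies inputs to outputs."""
        out_dir.mkdir(parents=True, exist_ok=True)
        for f in in_dir.iterdir():
            shutil.copy2(f, out_dir / f.name)

    try:
        # 1. Create the per-user local scratch directory at job start.
        work_in = LOCAL / "input"
        work_out = LOCAL / "output"
        LOCAL.mkdir(parents=True, exist_ok=True)

        # 2. Pull input data from the global /groups space to local scratch.
        shutil.copytree(GROUP_DATA, work_in, dirs_exist_ok=True)

        # 3. Process the data on the fast node-local NVMe.
        process(work_in, work_out)

        # 4. Copy processed data back to permanent /groups space.
        GROUP_RESULTS.mkdir(parents=True, exist_ok=True)
        shutil.copytree(work_out, GROUP_RESULTS, dirs_exist_ok=True)
    finally:
        # 5. Delete the local scratch directory before the job terminates.
        shutil.rmtree(LOCAL, ignore_errors=True)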

NOTE (Groups and Scratch):

No matter how fast your filesystem is, there is very high overhead in opening a file for read or write operations. This adds up quickly if you have a large number of small files, and you will be surprised how bad it can get. We cannot emphasize enough that you should avoid creating large numbers of small files at all costs. In addition, use only plain-English characters, without spaces or special characters, to name files and folders (see the sketch below).
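As one way to act on that naming guidance, the short Python check below flags file and folder names containing spaces or special characters before data is copied into /groups. The allowed character set and the /localscratch/user_uni path are assumptions for illustration, not an official DSBIT policy.

    import re
    from pathlib import Path

    # Plain English letters, digits, and a few safe separators only --
    # no spaces, no special characters (assumed rule for this example).
    SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")

    def report_unsafe_names(top: Path) -> None:
        """Print every file or folder under `top` whose name breaks the rule."""
        for p in top.rglob("*"):
            if not SAFE_NAME.match(p.name):
                print(f"rename before copying to /groups: {p}")

    # Example: check a staging directory before moving it to /groups.
    report_unsafe_names(Path("/localscratch/user_uni"))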

Groups and Archive tier storage are available in capacity increments of 1TB.

Pricing details for each of the tiers will be available on the DSBIT website once the New Cluster is launched.

For any questions/comments, please send an email to dsbit_help@cumc.columbia.edu