The Research Computing team has launched a dedicated, robust, large-capacity scientific data storage system called Engram to support research and collaboration for Zuckerman researchers. The system has a capacity of roughly 3.6 petabytes (3,600 terabytes) and is designed to accommodate growth of tens of thousands of terabytes (TB) over the coming years if required.

Engram can be accessed remotely by using the CUIT Virtual Private Network (VPN) service. (More VPN setup details provided here; limited access, UNI login required.)

About Engram

Engram is a highly versatile scale-out network-attached storage (NAS) platform (EMC Isilon) that can provide fast access to massive amounts of unstructured data. Each of the 24 Engram storage nodes can respond to client requests, and all the nodes are connected via a redundant 10 gigabit per second (Gbps) Ethernet fiber optic network. Your data on Engram is backed up to tape, and a copy is stored off-site in a safe and secure location for disaster recovery.

...

We can help you find the best way to use Engram for your particular research - please contact us at rc@zi.columbia.edu.

How to Connect to Engram

Connect to Engram from Mac
Connect to Engram from Windows
Connect to Engram from Linux
Connect to Engram from Synology
Set Up a Proxy in Windows

...

Tiers of Storage and Rates

Engram has a single storage tier called Locker.

  • The Engram rate is $87 per terabyte (TB) per year.
  • Each lab receives 10 TB of free storage.
  • Billing occurs monthly via chart string.
  • Additional storage can be purchased in 1 TB increments.
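
As a rough illustration of how the figures above combine, here is a minimal Python sketch. It uses the $87 per TB per year rate and the 10 TB free allocation stated above. Two things are assumptions and not confirmed billing policy: that the free allocation is subtracted from a lab's total before the rate is applied, and that each monthly bill is one twelfth of the yearly charge. The function names are hypothetical.

```python
RATE_PER_TB_YEAR = 87   # USD per TB per year, from the rate stated above
FREE_TB_PER_LAB = 10    # free allocation per lab, from the list above


def annual_cost(total_tb: int) -> int:
    """Return the estimated yearly charge in USD for a lab using total_tb terabytes.

    Assumption: the free allocation is subtracted from the lab's total
    before the rate is applied.
    """
    if total_tb < 0:
        raise ValueError("total_tb must be non-negative")
    billable_tb = max(0, total_tb - FREE_TB_PER_LAB)
    return billable_tb * RATE_PER_TB_YEAR


def monthly_charge(total_tb: int) -> float:
    """Return the estimated monthly charge in USD.

    Assumption: each monthly bill is one twelfth of the yearly charge.
    """
    return annual_cost(total_tb) / 12


if __name__ == "__main__":
    for tb in (8, 10, 25):
        print(f"{tb} TB: ${annual_cost(tb)} per year, ${monthly_charge(tb):.2f} per month")
```

For an authoritative quote, contact rc@zi.columbia.edu; the sketch is only meant for quick budgeting estimates.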


To help you get started, please contact us at rc@zi.columbia.edu.

Data Life Cycle

You control the data stored on Engram. Your data will never be deleted. Only you can delete your data.

Accessing Your Data

Engram is network-attached storage (NAS).  It is just like a USB drive you plug into your computer - but instead of plugging the USB drive into your computer with a USB cable, Engram is connected to your server or computer using the network.

...

  • From anywhere on Columbia University downtown campus or Manhattanville
    • By connecting your computer to the network with a physical network cable (In this case we recommend you disable Wi-Fi)
    • By using "Columbia U Secure" Wi-Fi
    • Note: "Columbia University" Wi-Fi is open and insecure and cannot be used to reach Engram
  • Remotely from outside the University by using the CUIT Virtual Private Network (VPN) (more VPN setup details provided here; limited access, UNI login required)
  • From CUIMC: under special circumstances we will work with CUIMC IT to open their firewall to allow access to Engram from CUIMC locations. Please contact us at rc@zi.columbia.edu

Backing Up Your Data

Data on Engram labshare and locker levels is backed up to tape. We can back up your data on staging storage level if requested.

Tapes are encrypted and periodically moved to a secure offsite location.

Tape backup is the only data protection technology that protects your data against ransomware. Replication (syncing) solutions do not protect against ransomware.

Restoring Your Data

To request a data restore, please send an email to rc@zi.columbia.edu and tell us:

  1. The lab you are from
  2. The directory from which you would like the files to be restored
  3. A date range (for example, please restore these files from September 15, 2017)

Requesting Access to Existing Storage

If your lab already has a network drive and you need access to it, please send the following information to rc@zi.columbia.edu:

  1. Your lab name
  2. Your full name and UNI
  3. Name of the network drive you need access to

Requesting New Storage

To request new storage please send the following information to rc@zi.columbia.edu:

  1. How large a network drive you need
  2. Engram Storage Level (Locker)
  3. List of UNIs that should have access to this network drive.

Storage is automatically backed up to tape.

Storage Protocols

Engram supports three different types of network drives:

  • NFS Exports are chiefly used in Unix/Linux-based server environments.  Access to these network drives is based on the domain name or IP address of the server or workstation that mounts them.  You will need to contact Research Computing at rc@zi.columbia.edu if you need to mount an NFS export on a host that does not already mount it.  For each NFS network drive, Research Computing maintains a whitelist of servers and IP addresses that can mount the network drive.  NFS exports are useful if you know that you will only be using your network drive on Linux-based servers and you want to grant access to your network drive by server rather than by user account.
  • SMB Shares are chiefly used on Mac and Windows workstations, although they can also be attached to Linux servers.  For this reason, they are a good option if you expect to mount your network share on both workstations and servers.  Access to these network drives is based on your user credentials and the groups that your UNI belongs to in Columbia's user directory.  If you need access to an SMB share, contact Research Computing to request that your UNI be added to the group associated with that SMB share.  SMB shares are not ideal for long-term storage in shared computing environments, since the connection is brokered via an individual user account.
  • Hybrid NFS/SMB Network Drives are a more complex arrangement that could make sense under certain circumstances.  With a hybrid approach, the same set of files and directories could be made available as an NFS export for a Unix/Linux-based shared server environment and an SMB share for Windows/Mac workstation environments.  This approach makes sense if you expect to have your storage connected to both local workstations and a remote server for long periods of time.  If you anticipate that you will only need to mount storage on servers on an ad-hoc basis, and that the majority of the time storage will be plugged into workstations primarily, it may make sense to stick with SMB shares alone.
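
The guidance in the three bullets above can be condensed into a small decision helper. The following Python sketch is only one reading of that guidance; the function name, the enum, and the two yes/no questions are hypothetical simplifications, and Research Computing may recommend something different for a particular lab.

```python
from enum import Enum


class DriveType(Enum):
    NFS = "NFS export"
    SMB = "SMB share"
    HYBRID = "Hybrid NFS/SMB network drive"


def suggest_drive_type(uses_workstations: bool, servers_long_term: bool) -> DriveType:
    """Suggest a network drive type based on where the drive will be mounted.

    uses_workstations: the drive will be mounted on Mac or Windows workstations.
    servers_long_term: the drive will stay mounted on Linux servers for long
        periods (as opposed to occasional, ad-hoc server mounts).
    """
    if not uses_workstations:
        # Only Linux-based servers: access is granted per server, so NFS fits.
        return DriveType.NFS
    if servers_long_term:
        # Workstations and long-lived server mounts: expose the same files both ways.
        return DriveType.HYBRID
    # Mostly workstations, with servers only on an ad-hoc basis.
    return DriveType.SMB


if __name__ == "__main__":
    print(suggest_drive_type(uses_workstations=True, servers_long_term=False).value)
```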

...

We can help you decide on the right Engram size for your lab or project. Please contact us at rc@zi.columbia.edu.

Requesting More Storage

To add additional storage capacity to your network drive please send the following information to rc@zi.columbia.edu:

  1. Your lab name
  2. Your full name and UNI
  3. Name of the network drive you would like more storage capacity added to
  4. How much additional drive capacity you would like to have, in 1 TB increments
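
To estimate the figure for item 4, a short Python sketch like the one below may help. It reads usage from an already mounted drive with `shutil.disk_usage` and rounds the shortfall up to whole terabytes, since storage is added in 1 TB increments. The 80% target fill level, the decimal definition of a terabyte, and the function names are assumptions made for illustration. The numbers a mounted network drive reports may also differ from what Research Computing sees on the storage system.

```python
import math
import shutil

TB = 10**12  # assumption: decimal terabytes (1 TB = 10^12 bytes)


def additional_tb_needed(used_bytes: int, capacity_bytes: int,
                         target_fill: float = 0.8) -> int:
    """Return the whole terabytes to add so that used/capacity <= target_fill.

    target_fill is a hypothetical comfort level, not a policy of the service.
    """
    if not 0 < target_fill <= 1:
        raise ValueError("target_fill must be in (0, 1]")
    required_capacity = used_bytes / target_fill
    shortfall = required_capacity - capacity_bytes
    # Round up because capacity is added in 1 TB increments.
    return max(0, math.ceil(shortfall / TB))


def additional_tb_for_mount(mount_point: str, target_fill: float = 0.8) -> int:
    """Same calculation, using the usage reported for a mounted drive."""
    usage = shutil.disk_usage(mount_point)
    return additional_tb_needed(usage.used, usage.total, target_fill)
```

For example, `additional_tb_for_mount("/mnt/YourLab-locker")` would report the estimate for a drive mounted at that (hypothetical) path.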

Problems Connecting to Engram

If you have problems connecting to Engram, please follow these troubleshooting steps, in order:

...

  • Name of your lab
  • Your full name and UNI
  • Engram drive full path (example: locker-nfs.engram.rc.zi.columbia.edu\YourLab-locker)
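
Before writing in, it can be useful to know whether your computer can reach the storage host at all. The Python sketch below attempts a plain TCP connection to the standard SMB port (445) and NFS port (2049). It is a rough first check under the assumption that the Engram host accepts connections on these standard ports; it does not test credentials, group membership, or NFS whitelisting. The function names are hypothetical.

```python
import socket

SMB_PORT = 445    # standard TCP port for SMB
NFS_PORT = 2049   # standard TCP port for NFS


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_host(host: str) -> dict:
    """Return a reachability result for each protocol's standard port."""
    return {
        "SMB": can_connect(host, SMB_PORT),
        "NFS": can_connect(host, NFS_PORT),
    }
```

Calling `check_host("locker-nfs.engram.rc.zi.columbia.edu")` (the host from the example path above) returns a dictionary of results. If every entry is `False`, check that you are on a wired connection, on "Columbia U Secure" Wi-Fi, or on the VPN, and include the result in your email.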

What does Engram look like?

Engram is a 24-node storage cluster. All 24 nodes serve data. Data is backed up to tape. A copy of each tape is taken off-site to a secure facility.

Would you like to see what Engram looks like?  Email us at rc@zi.columbia.edu and we will give you a tour.

Set up Columbia University VPN (CUIT Service)
Change your UNI password (CUIT Service)

...