Conda

Application Version: 4.8.2
License: 3-clause BSD
Website: https://docs.conda.io/en/latest/

Description

conda is a user-level package and environment manager.  Like yum or apt-get, it lets you download and use software, but unlike those tools it does not require administrative privileges.  It also facilitates scientific reproducibility: you can export a list of the exact versions of the software packages used in an analysis and share that list with colleagues.  Going one step further, conda also allows you to generate an archive of the actual application binaries used in your analysis, which can then be unzipped by anyone who wants to rerun or remix your analysis.
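As a minimal sketch of the export-and-share workflow described above, the two shell functions below wrap the conda env export and conda env create subcommands.  The function names, the environment name, and the file name are hypothetical and chosen only for illustration.

```shell
# Hypothetical helpers; these functions are not part of conda or Cortex.

# Write the exact package list of an environment to a file.
#   $1: environment name   $2: output file
export_env() {
    conda env export --name "$1" > "$2"
}

# Recreate an environment from a previously exported file.
#   $1: exported file
recreate_env() {
    conda env create --file "$1"
}
```

For example, export_env analysis environment.yml writes the package list for an environment named analysis, and a colleague could then run recreate_env environment.yml on their own machine.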

There are a few idiosyncrasies to the particular installation of conda that is included with Cortex, which are described on this page.  We'll also go over a few common tasks you may want to perform with conda.

Configuration Details

Default Anaconda Distribution

The version of conda on Cortex VMs comes from the Miniconda distribution, a minimalist version of the full Anaconda distribution developed by Anaconda, Inc. (formerly known as Continuum Analytics).  Miniconda is relatively lightweight and takes up little disk space (see here for a detailed comparison).  If you need the full Anaconda distribution, you can install it with conda by running the following command:

(base) -bash-4.2$ sudo -u conda /bin/bash -c "/usr/local/anaconda/bin/conda install -y anaconda"

Multi-User Installation and Service User

By default, conda is configured for a multi-user installation.  There is one universal base environment located at /usr/local/anaconda, which is owned and managed by a dedicated user, also named conda.  This dedicated user should be used instead of the root user to make changes to the universal base environment.  If you need to make changes to the universal base environment, you can do so by typing the following command, which runs the conda command as the dedicated user:

(base) -bash-4.2$ sudo -u conda /bin/bash -c "/usr/local/anaconda/bin/conda [PLACE YOUR CONDA SUBCOMMAND HERE]"

Using a dedicated user to manage an application is a best practice for two reasons.  For security, it confines the damage an attacker can do to conda itself if the conda installation is compromised in some way.  It also makes debugging easier (see here).
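If you run the service-user command often, it can be wrapped in a small shell function.  The sketch below is only a convenience under the paths given above; the name conda_admin is hypothetical and is not provided by Cortex.

```shell
# Hypothetical helper: run a conda subcommand against the universal base
# environment as the dedicated "conda" user.
conda_admin() {
    # "$*" joins all arguments into a single string for the inner bash -c call.
    # Arguments that contain spaces or quotes would need additional escaping.
    sudo -u conda /bin/bash -c "/usr/local/anaconda/bin/conda $*"
}
```

With this function defined, conda_admin install -y anaconda runs the same command as the full Anaconda installation line shown earlier.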

Isolation of the universal base environment from user-level environments is also enforced by default.  This means that when you log in using your UNI, you can only install or remove software if you are doing so in an environment that you yourself have created using the conda create command.  This is because the conda command doesn't support multiple people installing or removing software simultaneously (see here for technical details).  This differs from yum and apt-get, which have locking mechanisms in place to ensure that multiple people don't trample over each other when adding/removing packages.
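Creating your own environment with conda create might look like the sketch below.  The function name create_my_env and the example environment and package names are placeholders, not Cortex defaults.

```shell
# Hypothetical helper: create a personal environment and install packages
# into it in one step.
#   $1: environment name   remaining arguments: package specs
create_my_env() {
    env_name="$1"
    shift
    # --yes skips the interactive confirmation prompt
    conda create --yes --name "$env_name" "$@"
}
```

For example, create_my_env analysis python=3.8 numpy creates an environment named analysis, which you can then enter with conda activate analysis and modify with conda install or conda remove.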

The dedicated service user also performs several common housekeeping tasks.  It removes unused temporary files that were downloaded by conda, and it globally caches packages that have been installed in user-level environments, so that a package doesn't need to be downloaded twice when multiple people use it in their environments.

Environment Storage and History

All environments are stored in your personal conda directory at /srv/conda/USERNAME (where USERNAME is your username).  This directory contains three subdirectories.  Some of them may not exist at first and only appear after certain actions, such as installing software or creating an environment:

  • envs contains the files and binaries used when you activate a conda environment.
  • pkgs contains temporary files that are downloaded when you install a package.  This directory is cleaned out regularly by the dedicated service user.
  • exports contains a Git repository with a number of text files that record the exact packages and package versions you are using in your environments.  This Git repository can be used to determine what your environment looked like at a particular point in time.  It is most valuable if you occasionally install packages using pip instead of conda, since conda has native revision history functionality for conda packages (but not for pip packages).
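The two kinds of history mentioned above can be inspected roughly as follows.  This is a sketch: the function names are hypothetical, and the default path assumes that USERNAME in /srv/conda/USERNAME matches the output of id -un.

```shell
# Hypothetical helpers; neither function ships with Cortex.

# Show the commit log of your exports repository.
#   $1 (optional): path to the repository
exports_history() {
    exports_dir="${1:-/srv/conda/$(id -un)/exports}"
    # The subshell keeps the caller's working directory unchanged
    (cd "$exports_dir" && git log --oneline)
}

# Show conda's native revision history for the currently active environment.
env_revisions() {
    conda list --revisions
}
```

To return an environment to an earlier state listed by env_revisions, conda provides conda install --revision N, where N is the revision number.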

Version History

Date          Version
2020/07/27    4.8.2


Further Reading