Revision 13 as of 2024-03-21 10:58:19
Size: 8278
Editor: jbjohnston

Conda

Overview

Conda is an open source package and environment management system for installing multiple versions of software packages and their dependencies and switching easily between them. Unlike the Matilda HPC modules system, Conda lets each user build personalized environments. Packages that are incompatible with one another can be installed in separate environments that coexist side-by-side, and the user activates whichever environment is needed.

To make Conda available to users, we have installed miniconda3 as part of the modules system. To use Conda, load the module:

module load miniconda3

Software that is released as Conda-only distributions cannot be readily ported to the HPC modules system. In these cases users should use the miniconda3 modulefile to prepare and build their own custom environments. This document covers some of the highlights of using Conda.

There are multiple versions of miniconda available. You can see the versions available using:

module av miniconda3

------------------------------------------------------------------------------ /cm/shared/modulefiles -------------------------------------------------------------------------------
   miniconda3/current    miniconda3/4.9.2-py385 (D)    miniconda3/4.10.3-py385    miniconda3/24.1.2-py311

To load the default (D):

module load miniconda3

Or you may load any version you like by including the version number. For example:

module load miniconda3/24.1.2-py311

Newer versions may have more features, bug fixes, or additional commands or options.

If you would like to ensure you always use the latest version of Conda installed on Matilda, you should use:

module load miniconda3/current

The "current" modulefile will always point to the most recent version of Conda available on Matilda. If you run "conda init" (discussed later) with this module loaded, your .bashrc file will be modified to use the "current" path, so your shell picks up new versions as they are installed.

If you have already initialized Conda using an older version and wish to change to the latest, current version, open your .bashrc file with a text editor, and replace the version in the appropriate paths with "current". For example:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/cm/shared/apps/miniconda3/4.9.2-py385/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/cm/shared/apps/miniconda3/4.9.2-py385/etc/profile.d/conda.sh" ]; then
        . "/cm/shared/apps/miniconda3/4.9.2-py385/etc/profile.d/conda.sh"
    else
        export PATH="/cm/shared/apps/miniconda3/4.9.2-py385/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

Would be changed to:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/cm/shared/apps/miniconda3/current/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/cm/shared/apps/miniconda3/current/etc/profile.d/conda.sh" ]; then
        . "/cm/shared/apps/miniconda3/current/etc/profile.d/conda.sh"
    else
        export PATH="/cm/shared/apps/miniconda3/current/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
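The manual edit shown above can also be scripted. The following is a minimal sketch, assuming the install prefix "/cm/shared/apps/miniconda3" that appears in the block above; the function name "conda_use_current" is hypothetical and is not part of Conda or the modules system. It only touches lines between the "conda initialize" markers and keeps a backup copy of the file.

```shell
#!/usr/bin/env bash
# Sketch only: point the "conda init" block of a bashrc file at miniconda3/current.
# Assumption: Conda is installed under /cm/shared/apps/miniconda3/<version>.

# conda_use_current [FILE]
# Rewrites <root>/<version>/ to <root>/current/ inside the conda-init block
# of FILE (default: ~/.bashrc). The original is saved as FILE.bak.
conda_use_current() {
    local rc="${1:-$HOME/.bashrc}"
    local root="/cm/shared/apps/miniconda3"

    # Keep a backup before changing anything.
    cp -p "$rc" "$rc.bak" || return 1

    # Restrict the substitution to the lines managed by "conda init".
    sed "/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</ s#${root}/[^/\"']*/#${root}/current/#g" \
        "$rc.bak" > "$rc"
}
```

Calling "conda_use_current" with no argument would operate on ~/.bashrc; pass a copy of the file first if you would like to inspect the result before committing to it. Log out and back in (or source the file) for the change to take effect.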

Conda environments created with older versions of miniconda should work with newer versions in most cases.

Creating a Conda Environment

Use the "conda create" command to create an environment. Environments are generally stored under the ".conda/envs" folder inside the user home directory. Conda environments can be located elsewhere by using the "--prefix=/path/to/environment" flag.

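For example, to create an environment called "test_env" under the "/projects/myproject" directory we might use something like the following:

module load miniconda3
conda create --prefix=/projects/myproject/test_env python=3.8

The above command will create the environment as a Python 3.8 distribution in the directory "/projects/myproject/test_env". The "--name" and "--prefix" flags cannot be combined: with "--prefix", the last component of the path serves as the environment name, and the environment must be activated by its full path. To create "test_env" in the default location instead, so that it can be activated by name as in the examples that follow, use "conda create --name test_env python=3.8".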
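When creating environments in a shared project space, it can help to check the target path before handing it to Conda. The following is a minimal sketch: the function name "create_env_at" and the "DRY_RUN" variable are hypothetical helpers introduced here for illustration, and the only Conda call used is "conda create --prefix". The path passed in is treated as the environment directory itself.

```shell
#!/usr/bin/env bash
# Sketch only: create a Conda environment at an explicit path, with a safety check.

# create_env_at PREFIX [PACKAGE...]
# PREFIX is the full path of the environment directory to create.
# Set DRY_RUN to any non-empty value to print the command instead of running it.
create_env_at() {
    if [ -z "${1:-}" ]; then
        echo "usage: create_env_at PREFIX [PACKAGE...]" >&2
        return 2
    fi
    local prefix="$1"
    shift

    # Refuse to reuse a path that already exists.
    if [ -e "$prefix" ]; then
        echo "refusing: $prefix already exists" >&2
        return 1
    fi

    if [ -n "${DRY_RUN:-}" ]; then
        echo conda create --prefix "$prefix" "$@"
        return 0
    fi
    conda create --prefix "$prefix" "$@"
}
```

For example, after "module load miniconda3", running "create_env_at /projects/myproject/test_env python=3.8" would create the environment only if that directory does not exist yet.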

Activating the Environment

Source Method

To activate our Conda environment, we could use the following method (which is now considered deprecated):

module load miniconda3
source activate test_env

This will place us inside the virtual environment where we can now install packages if we wish:

conda install numpy scipy matplotlib

To deactivate the environment:

source deactivate

Conda Activate Method

Alternatively, we could use the newer "conda activate" method, but this requires Conda initialization to be performed at least once:

module load miniconda3
conda init
conda activate test_env

Issuing the "conda init" command alters our ~/.bashrc file so that it initializes Conda. This initialization then runs each time you log in.

An alternative to changing the ~/.bashrc file is to move the lines added by "conda init" out of ~/.bashrc and into a separate file, for example "conda.sh". Then, to initialize Conda after logging in:

source ~/conda.sh
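Moving the "conda init" lines into a separate file can be done by hand in a text editor, or scripted. The following is a minimal sketch that relies on the marker comments "conda init" writes around its block; the function names "conda_extract_init" and "conda_remove_init" are hypothetical helpers, not Conda commands.

```shell
#!/usr/bin/env bash
# Sketch only: move the block written by "conda init" into its own file.

# conda_extract_init [SRC] [DEST]
# Copies the conda-init block from SRC (default ~/.bashrc) to DEST (default ~/conda.sh).
conda_extract_init() {
    local src="${1:-$HOME/.bashrc}"
    local dest="${2:-$HOME/conda.sh}"
    sed -n '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</p' "$src" > "$dest"
    # Return failure if no block was found, so an empty file is not mistaken for success.
    [ -s "$dest" ]
}

# conda_remove_init [SRC]
# Deletes the conda-init block from SRC, leaving a backup at SRC.bak.
conda_remove_init() {
    local src="${1:-$HOME/.bashrc}"
    sed -i.bak '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' "$src"
}
```

A cautious order of operations would be to run "conda_extract_init" first, confirm that the new file contains the whole block, and only then run "conda_remove_init".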

To deactivate the environment:

conda deactivate

Managing Available Environments

We can get a list of our available Conda environments using the following:

conda info --envs

Similarly, we can remove an environment permanently by using:

conda remove --name test_env --all

Software developers will sometimes provide a "YAML" file for creating a custom environment for their application. These files usually have the file extension .yml or .yaml. They can be used to create a Conda environment as follows:

conda env create -f myapp-linux.yml

YAML environment files contain information about Conda software channels to use and packages that should be installed to the environment.
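To make the layout of such a file concrete, the following sketch writes a small environment file. The top-level keys "name", "channels" and "dependencies" are the standard ones; the environment name "myapp" and the listed packages are placeholders chosen for illustration and do not come from any real application. The function name "write_example_env" is likewise hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: write an illustrative Conda environment file.

# write_example_env [FILE]
# Writes the example to FILE (default: myapp-linux.yml in the current directory).
write_example_env() {
    local out="${1:-myapp-linux.yml}"
    cat > "$out" <<'EOF'
# Placeholder environment definition; names and versions are examples only.
name: myapp
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.11
  - numpy
  - samtools
EOF
}
```

A file like this would then be passed to "conda env create -f" as shown above.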

Conda Channels

Community supported software channels can be imported into Conda and used to install a wide variety of packages. Examples include bioconda and conda-forge. Users can specify channels to use during environment creation:

conda create -n test_env --channel conda-forge --channel bioconda <pkgs to install>

We can follow the channel specification with a list of packages to install.

Managing Conda Environment Locations

By default, Conda will install your environments under the "~/.conda/envs" directory. Downloaded packages are cached under "~/.conda/pkgs" and can take up quite a bit of space. You can install a Conda environment to a different location using the "--prefix" flag:

conda env create --prefix /projects/myprojspace/someuser/myapp -f myapp-linux.yml

Please be aware, however, that installing an environment in a non-default location means you will have to specify the complete path when activating the environment. For example:

conda activate /projects/myprojspace/someuser/myapp
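Before deciding whether to relocate anything, it may be useful to see how much space Conda is already using. The following is a minimal sketch using only "du"; the function name "conda_disk_usage" is hypothetical, and the default location "~/.conda" is taken from the paths mentioned in this section.

```shell
#!/usr/bin/env bash
# Sketch only: report how much space Conda environments and packages occupy.

# conda_disk_usage [BASE]
# Prints the size of BASE/envs and BASE/pkgs (BASE defaults to ~/.conda).
conda_disk_usage() {
    local base="${1:-$HOME/.conda}"
    local sub
    for sub in envs pkgs; do
        if [ -d "$base/$sub" ]; then
            # -s: one total per directory, -h: human-readable sizes
            du -sh "$base/$sub"
        else
            echo "no $sub directory under $base"
        fi
    done
}
```

Running "conda_disk_usage" with no argument inspects the default location in your home directory.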

You can alter the default location for Conda packages by creating a ".condarc" file. This has the advantage of not filling up your home directory space with packages - which is a somewhat common occurrence. A .condarc file might look something like the following:

pkgs_dirs:
  - /projects/projspace/someuser/conda/pkgs
channels:
  - bioconda
  - conda-forge
  - defaults

Place the .condarc file in the root of your home directory. If you have already used and initialized Conda, a .condarc file may already be present; if so, use a text editor to modify it as desired.
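If no .condarc file exists yet, one can be generated rather than typed by hand. The following is a minimal sketch that writes the same settings as the example above; the function name "write_condarc" is hypothetical, and the package directory is whatever path you pass in. It deliberately refuses to overwrite an existing file.

```shell
#!/usr/bin/env bash
# Sketch only: create a .condarc that stores Conda packages outside the home directory.

# write_condarc PKGS_DIR [FILE]
# Writes FILE (default ~/.condarc) with PKGS_DIR as the package cache location.
write_condarc() {
    local pkgs_dir="${1:-}"
    local rc="${2:-$HOME/.condarc}"

    if [ -z "$pkgs_dir" ]; then
        echo "usage: write_condarc PKGS_DIR [FILE]" >&2
        return 2
    fi

    # Never overwrite an existing configuration.
    if [ -e "$rc" ]; then
        echo "$rc already exists; edit it by hand instead" >&2
        return 1
    fi

    # Make sure the package directory exists before Conda needs it.
    mkdir -p "$pkgs_dir" || return 1

    cat > "$rc" <<EOF
pkgs_dirs:
  - $pkgs_dir
channels:
  - bioconda
  - conda-forge
  - defaults
EOF
}
```

For example, "write_condarc /projects/projspace/someuser/conda/pkgs" would reproduce the file shown above.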

More Information

This document only provides basic guidance on a few of Conda's features. For more information, you can refer to some of these additional resources:

