Loading Software
Key Change with the Transition to RHEL9
With the change from TCL-based modules to the Lmod module system, we
are now using a hierarchical module structure. Modules become
available only when all of the modules that they depend on (e.g. a
compiler) have already been loaded. That means that module available
(or the abbreviated command, module avail) will not display packages
whose dependencies have not all been loaded. If you know the
package name, use module spider <name> instead.
Lmod's hierarchical structure provides better organization and reduces
module conflicts by requiring that, for some modules, another must be
loaded before it becomes available. For example, if we need a
common parallel file library, HDF5, built at the same version with two
different compilers, there will be two modules for hdf5. You would use
module spider hdf5 to find the version, say 1.12.1. Then you would
enter module spider hdf5/1.12.1 to determine which compiler (or
compilers) you must load to make that particular version of the hdf5
module available. See below for more detail.
What are Modules?
Modules are software packages that have been pre-installed and configured on the cluster. The Lmod system organizes these packages so that they can be loaded into your environment with a single command. This approach ensures that the right dependencies, paths, and environment variables are set for your software to run correctly. Modules also enable the VACC to provide multiple versions of the same software and to provide packages that may have conflicting configurations.
Key Lmod Commands
module avail: Lists the modules that can be loaded right now, given the modules that are already loaded.
module load <module_name>: Loads a specific module into your environment.
module list: Displays the modules currently loaded in your session.
module unload <module_name>: Unloads a specific module from your environment.
module purge: Unloads all currently loaded modules.
module spider: Reports all modules, including those in the hierarchy.
module spider <module_name>: Reports all versions of the modules that match <module_name>.
module spider <module_name>/<version>: Provides a detailed report on a module, including which modules, if any, must be loaded first.
module keyword <keyword>: Searches module "help" text and "whatis" descriptions for the given word(s).
Basic use of modules
The most common use of the module command is to load a module,
which simply means configuring it for your use. To load a software
module, you need to know its name. If you want to use CUDA, you
would use module load followed by the module name,
which in this case is also the software name (lowercase).
$ module load cuda
To see which modules are currently loaded, use module list.
$ module list
Currently Loaded Modules:
1) cuda/12.6.2
Note that module list shows both a name and a version number, in
this case 12.6.2. When you module load a package, you can also
specify the version, as in:
$ module load cuda/12.6.2
When you log out, all modules are unloaded. Therefore, when you log in again, you will need to reload any needed modules.
To reverse the load, you would enter:
$ module unload cuda
Only one version of a module can be loaded at a time, so you do not have to give the version number when unloading. If you wish to unload all modules, you can use the single command
$ module purge
which clears all modules.
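Because modules do not persist across logins, it can help to keep the loads you need in one place. The following is a minimal sketch, not a site-provided tool: the function name setup_modules is hypothetical, the version is the cuda/12.6.2 shown above, and it assumes that module load reports failure through its exit status.

```shell
# Hypothetical helper: reset the environment, then load what this work needs.
setup_modules() {
    # Start from a clean slate so leftovers from earlier work cannot interfere.
    module purge
    # Pin the version so the same environment is set up every time.
    module load cuda/12.6.2 || return 1
    # Report what ended up loaded.
    module list
}
```

After logging in, or in any script where the module command is available, you would call setup_modules once before running your software.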
Finding installed software
Loading software is easy once you know the module name; finding out
what software is installed can be more of a challenge. The module
command offers several ways to locate software packages.
The most general is to search by keyword. This searches for words
that appear in either the module's help message or its whatis details,
which include a general package description, keywords, etc. For
example, some software has the keyword compiler. An edited result
from a keyword search, showing the cuda example above, would appear
like this:
$ module keyword compiler
----------------------------------------------------------------------------------
The following modules match your search criteria: "compiler"
----------------------------------------------------------------------------------
cuda: cuda/11.8.0, cuda/12.2.2, cuda/12.3.2, cuda/12.4.1, cuda/12.6.2
CUDA compilers, libraries, and profiler tools.
gcc: gcc/10.5.0, gcc/13.3.0
GNU compiler suite
oneapi: oneapi/2024.2.1
Intel compiler suite, libraries, and MPI
----------------------------------------------------------------------------------
Note that the first line of each entry lists the module name; after the
colon come the exact versions for which there are modules, in a form that
can be used directly with module load. For some packages the version
list may take more than one line. On the line after the name and
versions, the description (when available) is displayed.
Once you know the name of the module you want, you can obtain
additional information about it using module spider. For many
modules, the output of module keyword and module spider differs
little. However, for some modules, especially those in the hierarchy,
like hdf5, the differences are greater. Below, we show the output of
module keyword hdf5 and module spider hdf5 for comparison.
$ module keyword hdf5
---------------------------------------------------------------------------------------
The following modules match your search criteria: "hdf5"
---------------------------------------------------------------------------------------
blast-plus: blast-plus/2.14.1-ca7iit2
hdf5: hdf5/1.12.1, hdf5/1.14.4-3
HDF5 data libraries and utilities.
. . .
$ module spider hdf5
----------------------------------------------------------------------------------
hdf5:
----------------------------------------------------------------------------------
Description:
HDF5 data libraries and utilities.
Versions:
hdf5/1.12.1
hdf5/1.14.4-3
----------------------------------------------------------------------------------
For detailed information about a specific "hdf5" package (including
how to load the modules) use the module's full name.
Note that names that have a trailing (E) are extensions provided by other
modules.
For example:
$ module spider hdf5/1.14.4-3
----------------------------------------------------------------------------------
The module spider command will tell you what you need to know to load a
module. For example,
$ module spider hdf5/1.14.4-3
----------------------------------------------------------------------------------
hdf5: hdf5/1.14.4-3
----------------------------------------------------------------------------------
Description:
HDF5 data libraries and utilities.
You will need to load all module(s) on any one of the lines below
before the "hdf5/1.14.4-3" module is available to load.
gcc/13.3.0
Help:
HDF5 is a data model, library, and file format for storing and managing
. . .
The most important information here is that another module, in this
case gcc/13.3.0, must be loaded before the hdf5/1.14.4-3 module is
available to load.
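Acting on that report means loading the prerequisite first and the dependent module second. A minimal sketch follows, assuming that module load returns a non-zero status on failure; the function name load_in_order is hypothetical and not part of Lmod.

```shell
# Hypothetical helper: load modules in the order given, stopping at the
# first failure so a dependent module is never tried without its prerequisite.
load_in_order() {
    for mod in "$@"; do
        if ! module load "$mod"; then
            echo "could not load $mod; try: module spider $mod" >&2
            return 1
        fi
    done
}

# Usage (commented out so that defining the function has no side effects):
# load_in_order gcc/13.3.0 hdf5/1.14.4-3
```

Listing the compiler before the library mirrors the order given in the module spider report above.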
Module Hierarchy
HDF5 is a library, meaning that it contains binary code used by other programs. The compiler used to build the library must match the compiler used to build any program that uses the library. When you load a compiler module, it adds modules to the list of what is available to load. To see what is available before and after loading a compiler, compare
$ module avail
------------------------------- /gpfs1/sw/rh9/modules/Core --------------------------
R/4.4.1 launcher/3.7
Rgeospatial/4.4.1-2024-10-02 llama.cpp/b3933-cuda
Rtidyverse/4.4.1 macs2/2.2.9.1
. . .
with
$ module load gcc
$ module avail
------------------------- Applications compiled with GCC 13.3.0 ---------------------
fftw/3.3.10 hdf5/1.12.1 netcdf-c/4.9.2 openmpi/5.0.5
gdal/3.9.2 hdf5/1.14.4-3 (D) netcdf-cxx/4.3.1 proj/9.5.0
geos/3.13.0 itk/5.4.0 netcdf-fortran/4.6.1 sqlite/3.46.1
gsl/2.8 lapack/3.12.0 openblas/0.3.28
. . .
As you can see, a number of modules that were not listed previously now appear ahead of the 'Core' applications that were at the top of the list. The Core applications are still available, but they appear below those compiled with GCC 13.3.0.
This is called a hierarchy, in this case a compiler hierarchy. There
is also a hierarchy for programs that use the Message Passing
Interface (MPI), and there may eventually be others. Using module
spider will always identify modules that are included in a hierarchy
and tell you which module must be loaded to make them available.
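The before-and-after comparison can also be done mechanically. The sketch below is one possible approach, not a site-provided tool. It assumes Lmod's terse listing option (-t) and that Lmod writes the listing to stderr, which is its default behavior; the function name newly_available is hypothetical.

```shell
# Hypothetical helper: print the entries that appear in "module avail" only
# after the module named in $1 (for example, gcc) has been loaded.
newly_available() {
    before="$(mktemp)" || return 1
    after="$(mktemp)" || return 1
    # -t requests terse output, one entry per line; 2>&1 captures the
    # listing, which Lmod sends to stderr by default.
    module -t avail 2>&1 | sort > "$before"
    module load "$1" || { rm -f "$before" "$after"; return 1; }
    module -t avail 2>&1 | sort > "$after"
    # comm -13 keeps only the lines that are unique to the second file.
    comm -13 "$before" "$after"
    rm -f "$before" "$after"
}

# Usage (commented out so that defining the function has no side effects):
# newly_available gcc
```

Note that the function leaves the named module loaded, and that the result may include the heading line of any newly added module directory alongside the module names.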
About module version numbers
If you enter module spider to see a complete list of available
software, you might notice that a number of packages have a hyphenated
version number whose latter part is a mix of 7 seemingly random letters
and numbers. The following are some examples of such packages:
bamtools: bamtools/2.5.2-twq7d2p
bcftools: bcftools/1.19-iq5mwek
bedtools2: bedtools2/2.31.1-xip5kr5
This version scheme indicates packages that have been installed using the Spack package manager. A 7-character hash is appended to the version number as a means of distinguishing between multiple variants of the same software. Previously, VACC users had to use special Spack commands to search for and load those packages. Now, all modules are integrated and the Spack commands are no longer in use.
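If a script needs to tell Spack-installed modules apart from the others, the naming pattern can be tested with plain shell pattern matching. This is a heuristic sketch only: the function names has_spack_hash and spack_version are hypothetical, and any version that happens to end in a hyphen plus seven lowercase letters or digits would also match.

```shell
# Hypothetical helper: succeed if the module name ends in a hyphen followed
# by exactly seven lowercase letters or digits.
has_spack_hash() {
    case "$1" in
        */*-[a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: strip the name and the hash, leaving the version.
# Intended only for names that has_spack_hash accepts.
spack_version() {
    ver="${1#*/}"       # drop the leading "name/"
    echo "${ver%-*}"    # drop the trailing "-hash"
}
```

With the examples above, has_spack_hash accepts bamtools/2.5.2-twq7d2p and rejects hdf5/1.14.4-3, whose hyphenated suffix is only one character long.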
Most Spack-installed packages are in a compiler hierarchy as described above. You can make those available by first loading the following compiler:
$ module load gcc/13.3.0-xp3epyt
$ module avail
------------ /gpfs1/sw/spack/0.22.2/modules/linux-rhel9-x86_64/gcc/13.3.0 ------------
bamtools/2.5.2-twq7d2p fastp/0.23.4-mjw7rak picard/3.1.1-otrgwkh
bcftools/1.19-iq5mwek fastqc/0.12.1-qxseug5 samtools/1.19.2-pfmpoam
bedtools2/2.31.1-xip5kr5 hisat2/2.2.1-x7h4grf sratoolkit/3.0.0-y2rspiu
blast-plus/2.14.1-ca7iit2 htslib/1.19.1-6ivqauw star/2.7.11a-cp575va
bowtie2/2.5.2-qd4omrm minimap2/2.28-qcu5ixf trimmomatic/0.39-vdnktze
bwa/0.7.17-iqv3cxl openmpi/4.1.6-67ovor6
. . .
Other compiler versions and types are likely to be used in the future.
The module spider command will always be the best way of making sure
you do not miss any available software.