When the installation of software is not supported by the Spack software manager, you will have to go through every step of the associated build process yourself.
This page provides guidance on how to build software, that is, how to generate an executable from its source code, when the build process is supported by Make, CMake or a configure script.
Building with Spack is faster and chances are the build will be optimised for the supercomputer hardware.
You should only attempt to build a software application manually if Spack does not support it, or if you have a good reason to do so, for example compiling with an uncommon option enabled. Check also the Compiling and Compiler Optimisation Levels pages for common options used for supercomputing applications.
To begin with, retrieve the source code and save it on the supercomputer.
Remember that the build process must happen on the type of compute node that the code will execute on, in order to take advantage of all the optimisations for that particular architecture.
The build of scientific software is the process of generating a functioning executable program from its source code. A build follows a standard sequence of steps:
- Environment configuration. The software you want to build and the build process itself almost always depend on libraries, tools and supporting files being present on the system. You must ensure that all required dependencies are available and discoverable through appropriate mechanisms such as environment variables.
- Compiling. This is the process of transforming the source code of the software into machine code, which is stored in object files. This task is performed by compilers such as
- Linking. Library dependencies and the object files produced by the previous steps are linked together to form an executable.
- Installation. Executables and other necessary artifacts, like shared or static libraries, are moved to the desired installation location on the /software filesystem, where they can be found for later use.
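As a rough illustration of the compiling, linking and installation steps above, the following sketch builds a one-file C program by hand. The function name demo_manual_build, the file names and the temporary directories are assumptions made for this example only, and a C compiler is assumed to be reachable as cc.

```shell
# Sketch only: file names and directories are hypothetical.
demo_manual_build() {
    workdir=$(mktemp -d)          # scratch area standing in for the source directory
    prefix="$workdir/install"     # stand-in for the real installation path

    # A one-line C program to build.
    printf '#include <stdio.h>\nint main(void) { puts("hello"); return 0; }\n' > "$workdir/hello.c"

    # Compiling: source code -> object file.
    cc -c "$workdir/hello.c" -o "$workdir/hello.o" || return 1

    # Linking: object file(s) and libraries -> executable.
    cc "$workdir/hello.o" -o "$workdir/hello" || return 1

    # Installation: copy the executable to the installation location.
    mkdir -p "$prefix/bin"
    cp "$workdir/hello" "$prefix/bin/hello"

    # Run the installed program to confirm the build worked.
    "$prefix/bin/hello"
}
```

Calling demo_manual_build runs the three steps in order. Real projects delegate exactly this sequence to the build tools described next.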
Major tools supporting the process are few and well established. Here is a list of them.
- GNU Make is the de facto standard build tool for software projects developed on and for Linux environments. It relies on a makefile, by default named Makefile, which contains the rules (including the sequence of commands, environment variables, options) that tell Make how to generate an executable from source code.
- A configure script is often used to automate the retrieval of system and user information before the compilation and linking steps are performed. It generates a tailored Makefile starting from a general template and the collected information.
- CMake, not to be confused with GNU Make, is a meta-build tool that uses system-independent and compiler-independent configuration files to generate specific build scripts for a range of system-specific build tools, including GNU Make.
Recommended location for manual software builds
To keep your software organised, we recommend the following locations for your manual software installations:
The process of building software starts with obtaining and unpacking its source code into a directory, which from now on is referred to as $ROOT_DIR.
Identify the build process. Usually, the software package provides a detailed description of the build process. Otherwise, you should look for one of the following files indicating which tools are used for the purpose.
- A CMakeLists.txt file in the $ROOT_DIR directory indicates a CMake project. Go to step 3.
- A configure.sh script in the $ROOT_DIR directory suggests that you must execute a script to configure the build process. Go to step 2.
- A Makefile file in the $ROOT_DIR directory signals that the project's build process is handled through GNU Make. Go to step 4.
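The checks above can be sketched as a small shell function. The name detect_build_system is hypothetical, and the extra file names it looks for (configure without the .sh suffix, lowercase makefile) are assumptions about common variants rather than something this page prescribes.

```shell
# Sketch: report which build tool a source tree appears to use.
# detect_build_system is a hypothetical helper, not part of any tool.
detect_build_system() {
    root_dir=$1
    if [ -f "$root_dir/CMakeLists.txt" ]; then
        echo "cmake"        # go to step 3
    elif [ -f "$root_dir/configure" ] || [ -f "$root_dir/configure.sh" ]; then
        echo "configure"    # go to step 2
    elif [ -f "$root_dir/Makefile" ] || [ -f "$root_dir/makefile" ]; then
        echo "make"         # go to step 4
    else
        echo "unknown"      # fall back to the package documentation
    fi
}
```

For example, detect_build_system "$ROOT_DIR" prints cmake for a tree that contains a CMakeLists.txt file. The package's own build instructions, when present, should still take precedence over this guess.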
Using configure scripts. A build process uses a configure script to collect information regarding the environment (operating system, compilers, libraries, etc.) you intend to build the software in. It is able to collect most of the needed information and decide on the best configuration automatically. However, there are a few options that you must usually set; for instance, the --prefix option is used to specify the absolute path, that is, the path starting from the root of the filesystem, to the installation directory. There may be options that are not required but desirable in a supercomputing environment, for example, options enabling vectorised instructions. Once the script has run, typically GNU Make must be executed next (step 4).
To see the list of all options and arguments, execute the configure script with the --help option:
$ ./configure --help
As an example, the following line shows how to run a configure script specifying the installation directory:
$ ./configure --prefix=/path/to/installation/dir
You can also set the value of the environment variables used by the configure script, like this:
$ ./configure VAR=VALUE
List of most common compiling and linking variables
Variable    Meaning
CFLAGS      C compiler flags
CXXFLAGS    C++ compiler flags
FFLAGS      Fortran compiler flags
CPPFLAGS    C/C++ preprocessor flags
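The --prefix option and the VAR=VALUE syntax can be combined in a single call. The sketch below assumes a hypothetical helper named run_configure, the conventional flag variables CFLAGS, CXXFLAGS and FFLAGS, and -O2 as a placeholder value; check the output of ./configure --help for the variables your package actually honours.

```shell
# Sketch: run a configure script with an installation prefix and compiler flags.
# run_configure is a hypothetical helper; -O2 is only an example flag value.
run_configure() {
    prefix=$1    # absolute path to the installation directory
    ./configure --prefix="$prefix" CFLAGS="-O2" CXXFLAGS="-O2" FFLAGS="-O2"
}
```

It would be called from the directory containing the configure script, for example run_configure /path/to/installation/dir.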
Notes and best practices. Always check the output produced by a configure script. It might contain warnings that call for a modification of the configure options or the shell environment. Some configure scripts compile and execute a test program to set some compilation options accordingly. This may not work in a cross-compilation environment such as the one on some Cray supercomputers. For instance, Cray XC40 login nodes do not have the Aries interconnect and so testing for MPI might fail. Another example is testing for GPU computing capability of a GPU cluster on login nodes, which might not have GPUs. If you encounter this, try running the build process on a compute node.
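One way to follow the advice on checking the configure output is to keep it in a log file and search it afterwards. This is only a sketch: the helper name show_configure_warnings and the log file name configure.log are assumptions, and a case-insensitive search for the word "warning" will not catch every problem a script may report.

```shell
# Sketch: list the lines of a saved configure log that mention a warning.
# show_configure_warnings and configure.log are hypothetical names.
show_configure_warnings() {
    log=$1
    # -i: match "warning" and "WARNING"; -n: prefix each match with its line number.
    grep -i -n 'warning' "$log"
}

# Typical use (assumes a configure script in the current directory):
#   ./configure --prefix=/path/to/installation/dir 2>&1 | tee configure.log
#   show_configure_warnings configure.log
```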
Building using CMake. Similarly to configure scripts, CMake generates one or more environment-dependent build files (for Linux-based systems they are Makefile files, covered in step 4) from a high-level, environment-independent definition of the build process that is contained in the CMakeLists.txt file. Terminal 1 shows the typical sequence of commands you should use. Once completed, move to step 4.
Terminal 1. Using CMake to generate build files
$ cd $ROOT_DIR
$ mkdir build
$ cd build
$ sg <projectcode> -c 'cmake ..'
Change the working directory of the terminal to $ROOT_DIR and create a directory named build (the name can vary, although the one suggested here is standard practice) within it.
Change the working directory again, this time to the newly created directory.
From the build directory, execute the command cmake, passing as an argument the path to the directory containing the CMakeLists.txt file. Typically the relative path .. is used.
As with a configure script, you can specify options to CMake. The most common one is the CMAKE_INSTALL_PREFIX option that dictates where binaries will be installed (the default location being /usr/local). The syntax for specifying an option to CMake is -DOPTION=Value. In this case, the command would look like this:
$ cmake -DCMAKE_INSTALL_PREFIX=/path/to/installation/directory ..
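The directory setup and the option syntax can be put together in one helper. This is a sketch under assumptions: cmake_configure is a hypothetical name, CMAKE_BUILD_TYPE=Release is shown only as a second example of the -DOPTION=Value syntax, and the sg <projectcode> -c wrapper used in Terminal 1 is left out for brevity.

```shell
# Sketch: out-of-source CMake configuration with an install prefix.
# cmake_configure is a hypothetical helper; Release is an example build type.
cmake_configure() {
    root_dir=$1   # directory containing CMakeLists.txt
    prefix=$2     # absolute path to the installation directory
    mkdir -p "$root_dir/build" || return 1
    # The subshell keeps the caller's working directory unchanged.
    (
        cd "$root_dir/build" || exit 1
        cmake -DCMAKE_INSTALL_PREFIX="$prefix" -DCMAKE_BUILD_TYPE=Release ..
    )
}
```

For example, cmake_configure "$ROOT_DIR" /path/to/installation/directory leaves the generated build files in $ROOT_DIR/build, ready for step 4.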
Building using GNU Make. Conceptually, GNU Make executes commands to compile and link a program specified in the Makefile file, using a dedicated syntax that allows declaring dependencies between the building steps. To launch the build process, change the working directory of the terminal to the one containing the Makefile file (for instance, $ROOT_DIR) and simply execute the make command. Next, execute the make install command to install the built executable or library.
$ sg <projectcode> -c 'make'
$ sg <projectcode> -c 'make install'
The install argument to make is called a target. A target represents a subset of the Makefile file that accomplishes a particular task in the larger context of the build process. In the commands above, the first make command executes the default target, which usually builds the software without installing it. The install target installs the binaries, that is, the produced executables or libraries.
Sometimes you must change the value of some variables defined in the Makefile file. Some variable names are standard across most Makefile files. In particular, CC, CXX and FC are used to define executable names for C, C++ and Fortran compilers, respectively, whereas CFLAGS, CXXFLAGS and FFLAGS are used for the corresponding compiling flags.
All the compiler modules in Pawsey HPC systems define the compiler variables CC, CXX and FC, which are then ready to be used by GNU Make.
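Variables defined in a Makefile can be changed without editing the file, because values given on the make command line take precedence over ordinary assignments inside it. The sketch below uses a made-up Makefile whose only target, show, prints the conventional CC and CFLAGS variables; the function name demo_make_override and the file contents are assumptions for illustration.

```shell
# Sketch: command-line variables override the values set in a Makefile.
# The Makefile below is a made-up example that only prints its settings.
demo_make_override() {
    workdir=$(mktemp -d)
    # printf writes the tab character (\t) that Make requires before a recipe line.
    printf 'CC = gcc\nCFLAGS = -O0\nshow:\n\t@echo $(CC) $(CFLAGS)\n' > "$workdir/Makefile"
    # Override both variables for this invocation only.
    make -f "$workdir/Makefile" show CC=cc CFLAGS=-O2
}
```

Calling demo_make_override prints cc -O2 rather than the gcc -O0 written in the file. The same syntax applies to a real build, for example make CC=cc CFLAGS=-O2.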
The software you have built is now located at the installation path. See the Next Steps section for what to do next in order to use it.
This example shows how to build gromacs/2021.4 on Setonix using CMake. Although the application is available through Spack, sometimes users need a custom build with particular patches or flags.
- Log in to Setonix, then move to your /software folder and download the source code of Gromacs. See Software Stack for more information on the organisation of software on Setonix.
$ cd /software/projects/<project-id>/<user-name>/manual/software
$ wget https://gitlab.com/gromacs/gromacs/-/archive/v2021.4/gromacs-v2021.4.tar.gz
- Request an interactive session on a compute node, with 64 CPU cores to enable a parallel build. Alternatively, you can write a build script and submit the job to the scheduler.
$ salloc -p work --ntasks=1 -c 64
- Extract the source code from the archive, then execute the build process.
Terminal 2. Building gromacs using CMake
$ tar -xf gromacs-v2021.4.tar.gz
$ mkdir gromacs-v2021.4/build
$ cd gromacs-v2021.4/build
$ module load cray-fftw
$ module load cray-mpich
$ sg <projectcode> -c 'cmake -DCMAKE_INSTALL_PREFIX=$MYSOFTWARE/gromacs_manual_build -DGMX_MPI=ON ..'
[ output ... ]
$ sg <projectcode> -c 'make -j 64'
[ output ... ]
$ sg <projectcode> -c 'make install'
[ output ... ]
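The alternative mentioned above, writing a build script and submitting it to the scheduler, might look like the following sketch. It writes a Slurm batch script that repeats the interactive commands. The file name build_gromacs.sh, the one-hour time limit and the --account line are assumptions, and <projectcode> remains a placeholder that must be replaced before submission.

```shell
# Sketch: write a Slurm batch script equivalent to the interactive build.
# File name, time limit and account line are assumptions; replace <projectcode>.
cat > build_gromacs.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=work
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=64
#SBATCH --time=01:00:00
#SBATCH --account=<projectcode>

module load cray-fftw
module load cray-mpich

tar -xf gromacs-v2021.4.tar.gz
mkdir -p gromacs-v2021.4/build
cd gromacs-v2021.4/build || exit 1

sg <projectcode> -c 'cmake -DCMAKE_INSTALL_PREFIX=$MYSOFTWARE/gromacs_manual_build -DGMX_MPI=ON ..'
sg <projectcode> -c 'make -j 64'
sg <projectcode> -c 'make install'
EOF
# Submit from the directory holding the downloaded archive with: sbatch build_gromacs.sh
```

The quoted 'EOF' delimiter keeps $MYSOFTWARE unexpanded while the file is written, so the variable is resolved only when the job runs.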
Once you have installed your software, you may need to set some environment variables so that the operating system can find the software and its dependencies. The environment variables are typically
You may want to create a module for your software to modify your environment easily. Check Modules for more information.
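As a minimal sketch of setting the environment by hand, the function below prepends an installation to the search paths. It assumes, without confirmation from this page, that the variables of interest are PATH and LD_LIBRARY_PATH and that the installation uses bin and lib subdirectories; use_manual_install is a hypothetical name.

```shell
# Sketch: make a manual installation visible to the shell and the dynamic linker.
# use_manual_install is a hypothetical helper; bin/ and lib/ are assumed subdirectories.
use_manual_install() {
    prefix=$1
    export PATH="$prefix/bin:$PATH"
    # Append the previous value only if it was set, to avoid a trailing colon.
    export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

For the Gromacs example, this could be called as use_manual_install "$MYSOFTWARE/gromacs_manual_build". A module file achieves the same effect in a way that is easier to undo.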
- Software Carpentry's introductory GNU Make tutorial
- Official CMake tutorial