Reproducible computational environments using containers: Introduction to Singularity: Glossary

Key Points

Introducing Containers
  • Almost all software depends on other software components to function, but these components have independent evolutionary paths.

  • Small environments that contain only the software that is needed for a given task are easier to replicate and maintain.

  • Critical systems that cannot be upgraded (due to cost, difficulty, etc.) need to be reproduced on newer systems in a maintainable and self-documented way.

  • Virtualization allows multiple environments to run on a single computer.

  • Containerization improves upon the virtualization of whole computers by allowing efficient management of the host computer’s memory and storage resources.

  • Containers are built from ‘recipes’ that define the required set of software components and the instructions necessary to build/install them within a container image.

  • Singularity and Docker are examples of software platforms that can create and manage containers and the resources they use.

Singularity: Getting started
  • Singularity is another container platform, often used in cluster, HPC, and research environments.

  • Singularity has a different security model to other container platforms, which is one of the key reasons that it is well suited to HPC and cluster environments.

  • Singularity has its own container image format (SIF).

  • The singularity command can be used to pull images from Sylabs Cloud Library and run a container from an image file.
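
As a minimal sketch of the last point, the commands below pull an image from the Sylabs Cloud Library and run a container from the resulting file. The image URI and the filename lolcow.sif are illustrative assumptions, and run is a hypothetical helper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command; set DRY_RUN=0 to really execute it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Assumed example image and local filename; substitute your own.
IMAGE_URI="library://sylabsed/examples/lolcow"
IMAGE_FILE="lolcow.sif"

# Download the image into a SIF file in the current directory.
run singularity pull "$IMAGE_FILE" "$IMAGE_URI"

# Start a container from the image file and execute its default runscript.
run singularity run "$IMAGE_FILE"
```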

Using Singularity containers to run commands
  • The singularity exec command is an alternative to singularity run that allows you to start a container running a specific command.

  • The singularity shell command can be used to start a container and run an interactive shell within it.
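
The two commands above might be used roughly as follows. The image file lolcow.sif is assumed to exist already (it is not named in this lesson summary), and run is a hypothetical helper that prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command; set DRY_RUN=0 to really execute it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Assumed image file, e.g. one created earlier with `singularity pull`.
IMAGE_FILE="lolcow.sif"

# Start a container, run one specific command inside it, then exit.
run singularity exec "$IMAGE_FILE" cat /etc/os-release

# Start a container and open an interactive shell in it (type `exit` to leave).
run singularity shell "$IMAGE_FILE"
```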

Using Docker images with Singularity
  • Singularity can start a container from a Docker image which can be pulled directly from Docker Hub.
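
A sketch of this workflow, assuming the python:3.9.6-slim-buster image on Docker Hub as an example: the docker:// prefix tells Singularity to fetch from Docker Hub and convert the result to a SIF file. The output filename is an arbitrary choice, and run is a hypothetical helper that prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command; set DRY_RUN=0 to really execute it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Assumed example: a Python image hosted on Docker Hub.
IMAGE_URI="docker://python:3.9.6-slim-buster"
IMAGE_FILE="python-3.9.6.sif"

# Pull the Docker image and convert it to Singularity's SIF format.
run singularity pull "$IMAGE_FILE" "$IMAGE_URI"

# Run a command in a container started from the converted image.
run singularity exec "$IMAGE_FILE" python3 --version
```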

The Singularity cache
  • Singularity caches downloaded images so that an unchanged image isn’t downloaded again when it is requested using the singularity pull command.

  • You can free up space in the cache by removing all locally cached images or by specifying individual images to remove.
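
Cache inspection and cleanup could look like the sketch below. The available options differ between Singularity versions, so treat the --type value as an assumption and check singularity cache clean --help on your system. The run helper is hypothetical and prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command; set DRY_RUN=0 to really execute it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Show what is currently stored in the local cache.
run singularity cache list

# Remove only one kind of cached data (here: images from the Cloud Library).
# Assumption: the set of valid --type values depends on your version.
run singularity cache clean --type library

# Clean the cache with default settings (may ask for confirmation).
run singularity cache clean
```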

Files in Singularity containers
  • Your current directory and home directory are usually available by default in a container.

  • You have the same username and permissions in a container as on the host system.

  • You can specify additional host system directories to be available in the container.
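
The points above can be sketched with the --bind option, which maps a host directory to a path inside the container. The host directory /work/shared, the target /data, and the image file are hypothetical names chosen for illustration; run is a hypothetical helper that prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command; set DRY_RUN=0 to really execute it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

IMAGE_FILE="lolcow.sif"   # assumed existing image file
HOST_DIR="/work/shared"   # hypothetical directory on the host system

# The user inside the container matches the user on the host.
run singularity exec "$IMAGE_FILE" whoami

# Make HOST_DIR visible inside the container at /data and list its contents.
run singularity exec --bind "$HOST_DIR:/data" "$IMAGE_FILE" ls /data
```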

Creating Your Own Container Images
  • Dockerfiles specify what is within Docker container images.

  • The docker image build command is used to build a container image from a Dockerfile.

  • You can share your Docker container images through the Docker Hub so that others can create Docker containers from your container images.
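
As an illustration of these steps, the sketch below writes a small Dockerfile into a temporary directory, then shows the build and push commands. The image name alice/alpine-python is a hypothetical Docker Hub user and repository, and run is a hypothetical helper that prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command; set DRY_RUN=0 to really execute it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Create a scratch build directory holding the Dockerfile.
BUILD_DIR="$(mktemp -d)"
cat > "$BUILD_DIR/Dockerfile" <<'EOF'
# Start from a small base image.
FROM alpine
# Install Python 3 inside the image.
RUN apk add --update python3
# Default command executed when a container starts.
CMD ["python3", "--version"]
EOF

# Hypothetical image name: <Docker Hub user>/<repository>.
IMAGE_NAME="alice/alpine-python"

# Build the image from the Dockerfile, then publish it to Docker Hub.
run docker image build -t "$IMAGE_NAME" "$BUILD_DIR"
run docker push "$IMAGE_NAME"
```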

Creating More Complex Container Images
  • Docker allows containers to read and write files from the Docker host.

  • You can include files from your Docker host into your Docker container images by using the COPY instruction in your Dockerfile.
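
Both ideas can be sketched briefly: a bind mount gives a running container access to a host directory, while COPY places a host file into the image at build time. The script hello.py, the image name alice/alpine-hello, and the /temp mount target are hypothetical; run is a hypothetical helper that prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command; set DRY_RUN=0 to really execute it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Bind-mount the current host directory into a container at /temp.
run docker container run --mount "type=bind,source=$PWD,target=/temp" alpine ls /temp

# Build context containing a (hypothetical) script and a Dockerfile.
BUILD_DIR="$(mktemp -d)"
printf 'print("hello from the image")\n' > "$BUILD_DIR/hello.py"
cat > "$BUILD_DIR/Dockerfile" <<'EOF'
FROM alpine
RUN apk add --update python3
# Copy hello.py from the build context into /home/ inside the image.
COPY hello.py /home/
CMD ["python3", "/home/hello.py"]
EOF

# Build an image that now contains the copied file.
run docker image build -t "alice/alpine-hello" "$BUILD_DIR"
```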

Running MPI parallel jobs using Singularity containers
  • Singularity images containing MPI applications can be built on one platform and then run on another (e.g. an HPC cluster) if the two platforms have compatible MPI implementations.

  • When running an MPI application within a Singularity container, use the MPI executable on the host system to launch a Singularity container for each process.

  • Think about the performance requirements of your parallel application and how the platforms on which you build and run your image may affect them.
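
The launch pattern described above might look like the following sketch, in which the host's mpirun starts the processes and each process runs the application inside its own container. The image file, the application path inside the container, and the process count are hypothetical; on a real cluster the launcher and its options are set by the local MPI installation and scheduler. The run helper is hypothetical and prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command; set DRY_RUN=0 to really execute it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

IMAGE_FILE="mpi_app.sif"         # hypothetical image containing an MPI program
APP_PATH="/opt/app/bin/mpi_app"  # hypothetical path to the program in the image
NPROCS=4                         # number of MPI processes to start

# The host MPI launcher starts NPROCS processes; each one runs
# `singularity exec`, so there is one container per MPI process.
run mpirun -np "$NPROCS" singularity exec "$IMAGE_FILE" "$APP_PATH"
```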

Containers in Research Workflows: Reproducibility and Granularity
  • Container images allow us to encapsulate the computation (and data) we have used in our research.

  • Using online container image repositories allows us to easily share computational work we have done.

  • Using container images along with a DOI service such as Zenodo allows us to capture our work and enables reproducibility.

Additional topics and next steps
  • TBC

(Optional) Using Singularity to run BLAST+
  • We can use containers to run software without having to install it.

  • The commands we use are very similar to those we would use natively.

  • Singularity handles a lot of complexity around data and internet access for us.

Glossary

FIXME