Reproducible Computational Environments Using Containers: Introduction to Docker and Singularity: Glossary

Key Points

Introducing Containers
  • Almost all software depends on other software components to function, but these components have independent evolutionary paths.

  • Projects involving many software components can rapidly run into a combinatoric explosion in the number of software version configurations available, yet only a subset of possible configurations actually works as desired.

  • Containers collect software components together and can help avoid software dependency problems.

  • Virtualisation is an old technology that container technology makes more practical.

  • Docker is just one software platform that can create containers and the resources they use.
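
To give a feel for the combinatoric explosion mentioned above, here is a small arithmetic sketch. The three components and their version counts are hypothetical, chosen only for illustration.

```shell
# Hypothetical project with three dependencies; the version counts
# below are illustrative assumptions, not measurements.
interpreter_versions=4
library_versions=6
compiler_versions=5

# Choosing one version of each component gives one configuration,
# so the counts multiply.
total=$((interpreter_versions * library_versions * compiler_versions))
echo "Possible configurations: $total"
```

Even with only three components the count reaches 120, and each extra component multiplies it again. A container image pins one known-good combination.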

Introducing the Docker command line
  • On Windows and macOS, the Docker Desktop toolbar icon indicates that Docker is ready to use.

  • You will typically interact with Docker using the command line.

Exploring and Running Containers
  • The docker pull command downloads Docker images from the internet.

  • The docker image ls command lists Docker images that are (now) on your computer.

  • The docker run command creates running containers from images and can run commands inside them.

  • When using the docker run command, a container can run a default action (if it has one), a user-specified action, or a shell to be used interactively.
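
The commands in this section might be combined as in the sketch below. It assumes Docker is installed and its daemon is running. The alpine and hello-world images are public images used here only as examples.

```shell
# Assumption: "alpine" is used only as a small, public example image.
image="alpine"

if command -v docker >/dev/null 2>&1; then
    docker pull "$image"                      # download the image
    docker image ls                           # list images now on this computer
    docker run hello-world                    # run an image's default action
    docker run "$image" cat /etc/os-release   # run a user-specified action
    # An interactive shell needs a terminal, so it is left commented out:
    # docker run -it "$image" sh
else
    echo "docker not found; skipping the example commands"
fi
```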

Finding Containers on the Docker Hub
  • The Docker Hub is an online repository of container images.

  • Many Docker Hub images are public, and may be officially endorsed.

  • Each Docker Hub page about an image provides structured information and subheadings.

  • Most Docker Hub pages about images contain sections that provide examples of how to use those images.

  • Many Docker Hub images have multiple versions, indicated by tags.

  • The naming convention for Docker images is OWNER/IMAGE:TAG.
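
The three-part naming convention above (owner, name, tag) can be taken apart with plain shell parameter expansion. This is a simplified sketch: the reference below is hypothetical, and the code does not handle official images that have no owner part or references that include a registry hostname.

```shell
# Hypothetical image reference, used only for illustration.
ref="alice/alpine-python:v1"

owner="${ref%%/*}"    # text before the first "/"
rest="${ref#*/}"      # text after the first "/"
name="${rest%%:*}"    # text before the ":"

# Docker assumes the tag "latest" when none is given.
case "$rest" in
    *:*) tag="${rest##*:}" ;;
    *)   tag="latest" ;;
esac

echo "owner=$owner name=$name tag=$tag"
```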

Cleaning Up Containers
  • The docker container ls command lists running containers; adding the --all flag also lists containers that have been created but are no longer running.

Creating your own container images
  • Dockerfiles specify what is within Docker images.

  • The docker build command is used to build an image from a Dockerfile.

  • You can share your Docker images through the Docker Hub so that others can create Docker containers from your images.
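
As a minimal sketch of these steps, the following writes a small Dockerfile into a scratch directory and builds an image from it. The base image, the installed package, and the user name "alice" are assumptions made for illustration.

```shell
# Work in a scratch directory so nothing existing is overwritten.
build_dir="$(mktemp -d)"

# A minimal Dockerfile; the base image and package are assumptions.
cat > "$build_dir/Dockerfile" <<'EOF'
FROM alpine
RUN apk add --update python3
CMD ["python3", "--version"]
EOF

# "alice" is a hypothetical Docker Hub user name.
image_name="alice/alpine-python:v1"

if command -v docker >/dev/null 2>&1; then
    docker build -t "$image_name" "$build_dir"
    # Sharing needs a Docker Hub account and "docker login" first:
    # docker push "$image_name"
fi
```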

Creating More Complex Container Images
  • You can copy files from your Docker host into your Docker images by using the COPY instruction in your Dockerfile.

  • Docker allows containers to read and write files on the Docker host, for example through a bind mount.
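
The two approaches above can be sketched side by side. The file names, the image name "alice/hello-copy", and the mount point /temp are hypothetical choices for this example.

```shell
work_dir="$(mktemp -d)"
echo 'print("hello from the host file")' > "$work_dir/hello.py"

# Option 1: bake the host file into the image with COPY.
cat > "$work_dir/Dockerfile" <<'EOF'
FROM alpine
RUN apk add --update python3
COPY hello.py /home/hello.py
CMD ["python3", "/home/hello.py"]
EOF

if command -v docker >/dev/null 2>&1; then
    docker build -t alice/hello-copy "$work_dir"   # "alice" is hypothetical
    docker run alice/hello-copy

    # Option 2: bind-mount the host directory at /temp in the container.
    # The container writes a file that then appears on the host.
    docker run -v "$work_dir":/temp alpine \
        sh -c 'echo written-in-container > /temp/out.txt'
    ls "$work_dir"
fi
```

With COPY the file becomes part of the image, so later changes on the host are not seen until the image is rebuilt. With a bind mount the container works on the host's files directly.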

Containers used in generating this lesson
  • This lesson website can be generated using a container.

Containers in research workflows: reproducibility and granularity
  • Container images allow us to encapsulate the computation (and data) we have used in our research.

  • Using a service such as Docker Hub allows us to easily share computational work we have done.

  • Using container images along with a DOI service such as Zenodo allows us to capture our work and enables reproducibility.

  • Tools such as Docker Compose, Docker Swarm and Kubernetes allow us to describe how multiple containers work together.

Singularity: Getting started
  • Singularity is another container platform and it is often used in cluster/HPC/research environments.

  • Singularity has a different security model to other container platforms, which is one of the key reasons that it is well suited to HPC and cluster environments.

  • Singularity has its own container image format (SIF).

  • The singularity command can be used to pull images from Singularity Hub and run a container from an image file.

Working with Singularity containers
  • Singularity caches downloaded images so that an image isn’t downloaded again when it is requested using the singularity pull command.

  • The singularity exec and singularity shell commands provide different options for starting containers.

  • Singularity can start a container from a Docker image which can be pulled directly from Docker Hub.
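
A short sketch of these commands follows. It assumes Singularity 3.x is installed. The output file name is an arbitrary choice and the alpine image on Docker Hub is used only as an example source.

```shell
sif_file="alpine.sif"          # arbitrary output file name
source_uri="docker://alpine"   # Docker Hub image used as the source

if command -v singularity >/dev/null 2>&1; then
    singularity pull "$sif_file" "$source_uri"          # download and convert to SIF
    singularity cache list                              # show what has been cached
    singularity exec "$sif_file" cat /etc/os-release    # run one command in a container
    # An interactive shell needs a terminal, so it is left commented out:
    # singularity shell "$sif_file"
else
    echo "singularity not found; skipping the example commands"
fi
```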

Building Singularity images
  • Singularity definition files are used to define the build process and configuration for an image.

  • Singularity’s Docker container provides a way to build images on a platform where Singularity is not installed but Docker is available.

  • Existing images from remote registries such as Docker Hub and Singularity Hub can be used as a base for creating new Singularity images.
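
As an illustration, the sketch below writes a small definition file that uses a Docker Hub image as its base. The file name, base image, and installed package are assumptions for this example.

```shell
def_dir="$(mktemp -d)"

# A minimal definition file: bootstrap from a Docker Hub image,
# install a package at build time, and define what "run" does.
cat > "$def_dir/my_test_image.def" <<'EOF'
Bootstrap: docker
From: ubuntu:20.04

%post
    apt-get -y update && apt-get install -y python3

%runscript
    python3 -c 'print("Hello from a custom Singularity image")'
EOF

cat "$def_dir/my_test_image.def"

# Building typically requires root privileges, so the command is shown
# here rather than executed:
#   sudo singularity build my_test_image.sif my_test_image.def
```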

Running MPI parallel jobs using Singularity containers
  • Singularity images containing MPI applications can be built on one platform and then run on another (e.g. an HPC cluster) if the two platforms have compatible MPI implementations.

  • When running an MPI application within a Singularity container, use the MPI executable on the host system to launch a Singularity container for each process.

  • Consider your parallel application's performance requirements and how the platforms on which you build and run your image may affect them.
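
The launch pattern described above, where the host's MPI launcher starts one container per process, might look like the sketch below. The image name, application path, and process count are hypothetical, and many clusters use a scheduler-specific launcher in place of mpirun.

```shell
nprocs=4
image="my_mpi_app.sif"             # hypothetical image file
app="/usr/local/bin/my_mpi_app"    # hypothetical path inside the container

# The host's mpirun starts nprocs copies of "singularity exec",
# so each MPI process runs in its own container.
launch_cmd="mpirun -n $nprocs singularity exec $image $app"
echo "$launch_cmd"

# Run only when both tools and the image are actually present.
if command -v mpirun >/dev/null 2>&1 \
    && command -v singularity >/dev/null 2>&1 \
    && [ -f "$image" ]; then
    $launch_cmd
fi
```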

Glossary
