Introducing Containers
|
Almost all software depends on other software components to function, but these components have independent evolutionary paths.
Projects involving many software components can rapidly run into a combinatorial explosion in the number of possible software version configurations, yet only a subset of those configurations actually works as desired.
Containers collect software components together and can help avoid software dependency problems.
Virtualisation is an old technology that container technology makes more practical.
Docker is just one software platform that can create containers and the resources they use.
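To make the growth in version configurations concrete, here is a minimal Python sketch. The component names and version lists are hypothetical and chosen only for illustration; the point is that the number of configurations is the product of the number of versions available for each component.

```python
from itertools import product

# Hypothetical components and the versions available for each one.
versions = {
    "python": ["3.9", "3.10", "3.11"],
    "numpy": ["1.24", "1.26", "2.0"],
    "compiler": ["gcc-11", "gcc-12"],
    "os": ["ubuntu-20.04", "ubuntu-22.04"],
}

# Every configuration picks exactly one version of every component.
configurations = list(product(*versions.values()))

print(len(configurations))  # 3 * 3 * 2 * 2 = 36
```

Adding one more component with three versions would triple the count, which is why testing every combination quickly becomes impractical.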
|
Introducing the Docker command line
|
|
Exploring and Running Containers
|
The docker pull command downloads Docker images from the internet.
The docker image ls command lists Docker images that are (now) on your computer.
The docker run command creates running containers from images and can run commands inside them.
When using the docker run command, a container can run a default action (if it has one), a user-specified action, or a shell to be used interactively.
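The three ways of using docker run described above can be sketched as follows. This is an illustrative Python helper, not part of Docker: the function name docker_run_command is hypothetical, and the image names (hello-world, alpine) and the example command are assumptions chosen for the sketch. It only builds and prints the command lines; it does not start any containers.

```python
import shlex
from typing import List, Optional, Sequence


def docker_run_command(
    image: str,
    command: Optional[Sequence[str]] = None,
    interactive: bool = False,
) -> List[str]:
    """Build the argument list for a `docker run` invocation (hypothetical helper)."""
    args = ["docker", "run"]
    if interactive:
        # -i keeps stdin open and -t allocates a terminal, as needed for a shell.
        args.append("-it")
    args.append(image)
    if command:
        # Anything after the image name replaces the image's default action.
        args.extend(command)
    return args


# 1. Default action defined by the image.
print(shlex.join(docker_run_command("hello-world")))
# docker run hello-world

# 2. User-specified action.
print(shlex.join(docker_run_command("alpine", ["cat", "/etc/os-release"])))
# docker run alpine cat /etc/os-release

# 3. Interactive shell.
print(shlex.join(docker_run_command("alpine", ["sh"], interactive=True)))
# docker run -it alpine sh
```

On a machine with Docker installed, each returned list could be passed to subprocess.run to execute the command.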
|
Finding Containers on the Docker Hub
|
The Docker Hub is an online repository of container images.
Many Docker Hub images are public, and may be officially endorsed.
Each Docker Hub page about an image provides structured information and subheadings.
Most Docker Hub pages about images contain sections that provide examples of how to use those images.
Many Docker Hub images have multiple versions, indicated by tags.
The naming convention for Docker container images is: OWNER/CONTAINER_IMAGE_NAME:TAG
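As an illustration of the naming convention, the sketch below splits an image name into owner, name, and tag. It is a simplified, hypothetical parser (parse_image_name is not a Docker API) that deliberately ignores registry hostnames, ports, and digests. It assumes Docker's behaviour of falling back to the latest tag when none is given, and it treats names without an owner, such as alpine, as officially endorsed images. The name alice/alpine-python:v1 is made up for the example.

```python
from typing import NamedTuple, Optional


class ImageName(NamedTuple):
    owner: Optional[str]  # None for official images such as "alpine"
    name: str
    tag: str


def parse_image_name(reference: str) -> ImageName:
    """Split OWNER/NAME:TAG into its parts (simplified, hypothetical helper)."""
    # Everything after the first colon is the tag; default to "latest".
    rest, _, tag = reference.partition(":")
    tag = tag or "latest"
    # Everything before the last slash is the owner, if there is one.
    owner, _, name = rest.rpartition("/")
    return ImageName(owner or None, name, tag)


print(parse_image_name("alice/alpine-python:v1"))
# ImageName(owner='alice', name='alpine-python', tag='v1')

print(parse_image_name("alpine"))
# ImageName(owner=None, name='alpine', tag='latest')
```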
|
Cleaning Up Containers
|
|
Creating your own container images
|
Dockerfiles specify what is within Docker images.
The docker build command is used to build an image from a Dockerfile.
You can share your Docker images through the Docker Hub so that others can create Docker containers from your images.
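To show what a Dockerfile and the matching docker build command might look like, here is a minimal Python sketch that writes a small Dockerfile into a temporary directory and prints the build command for it. The Dockerfile contents, the image name alice/alpine-python, and the helper write_build_context are all assumptions made for illustration. Nothing is built here: the sketch only writes the file and prints the command.

```python
import shlex
import tempfile
from pathlib import Path

# Hypothetical Dockerfile: start from Alpine Linux, install Python 3,
# and report the Python version as the container's default action.
DOCKERFILE = """\
FROM alpine
RUN apk add --no-cache python3
CMD ["python3", "--version"]
"""


def write_build_context(directory: Path) -> Path:
    """Write the Dockerfile into the given directory and return its path."""
    path = directory / "Dockerfile"
    path.write_text(DOCKERFILE)
    return path


with tempfile.TemporaryDirectory() as tmp:
    dockerfile_path = write_build_context(Path(tmp))
    first_instruction = dockerfile_path.read_text().splitlines()[0]
    # -t names the resulting image; the final argument is the build directory.
    build_command = ["docker", "build", "-t", "alice/alpine-python", tmp]
    print(first_instruction)
    print(shlex.join(build_command))
```

On a machine with Docker installed, running the printed command in a directory that still contains the Dockerfile would build the image, which could then be uploaded with docker push.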
|
Creating More Complex Container Images
|
|
Containers used in generating this lesson
|
|
Containers in research workflows: reproducibility and granularity
|
Container images allow us to encapsulate the computation (and data) we have used in our research.
Using a service such as Docker Hub allows us to easily share computational work we have done.
Using container images along with a DOI service such as Zenodo allows us to capture our work and enables reproducibility.
Tools such as Docker Compose, Docker Swarm and Kubernetes allow us to describe how multiple containers work together.
|
Singularity: Getting started
|
Singularity is another container platform and it is often used in cluster/HPC/research environments.
Singularity has a different security model to other container platforms, which is one of the key reasons that it is well suited to HPC and cluster environments.
Singularity has its own container image format (SIF).
The singularity command can be used to pull images from Singularity Hub and run a container from an image file.
|
Working with Singularity containers
|
Singularity caches downloaded images so that an image isn’t downloaded again when it is requested using the singularity pull command.
The singularity exec and singularity shell commands provide different options for starting containers.
Singularity can start a container from a Docker image which can be pulled directly from Docker Hub.
|
Building Singularity images
|
Singularity definition files are used to define the build process and configuration for an image.
Singularity’s Docker container provides a way to build images on a platform where Singularity is not installed but Docker is available.
Existing images from remote registries such as Docker Hub and Singularity Hub can be used as a base for creating new Singularity images.
|
Running MPI parallel jobs using Singularity containers
|
Singularity images containing MPI applications can be built on one platform and then run on another (e.g. an HPC cluster) if the two platforms have compatible MPI implementations.
When running an MPI application within a Singularity container, use the MPI executable on the host system to launch a Singularity container for each process.
Think about your parallel application's performance requirements and how the platforms on which you build and run your image may affect them.
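The launch pattern described above, in which the host's MPI launcher starts one Singularity container per process, can be sketched as follows. This is a minimal illustration, not a definitive recipe: the helper mpi_singularity_command, the image file my_mpi_app.sif, and the application path /opt/app/bin/solver are hypothetical, and the sketch assumes the launcher is mpirun with the -n option. On many clusters the launcher and its options are dictated by the scheduler instead. The code only builds and prints the command line.

```python
import shlex
from typing import List, Sequence


def mpi_singularity_command(
    num_processes: int,
    image: str,
    app_command: Sequence[str],
    launcher: str = "mpirun",
) -> List[str]:
    """Build a host-side MPI launch of a containerised application (hypothetical helper)."""
    return [
        launcher, "-n", str(num_processes),  # host MPI starts the processes
        "singularity", "exec", image,        # each process runs in a container
        *app_command,                        # the MPI application inside the image
    ]


command = mpi_singularity_command(4, "my_mpi_app.sif", ["/opt/app/bin/solver"])
print(shlex.join(command))
# mpirun -n 4 singularity exec my_mpi_app.sif /opt/app/bin/solver
```

The launcher sits outside the container, which is why the MPI implementation on the host has to be compatible with the one used to build the application inside the image.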
|