Reproducible Computational Environments Using Containers: Introduction to Docker and Singularity (day 1): Glossary

Key Points

Welcome
  • We should all understand and follow the ARCHER2 Code of Conduct to ensure this course is conducted in the best teaching environment.

  • The course will be flexible to best meet the learning needs of the attendees.

  • Feedback is an essential part of our training to allow us to continue to improve and make sure the course is as useful as possible to attendees.

Introducing Containers
  • Almost all software depends on other software components to function, but these components have independent evolutionary paths.

  • Small environments that contain only the software that is needed for a given task are easier to replicate and maintain.

  • Critical systems that cannot be upgraded (due to cost, difficulty, etc.) need to be reproduced on newer systems in a maintainable and self-documented way.

  • Virtualization allows multiple environments to run on a single computer.

  • Containerization improves upon the virtualization of whole computers by allowing efficient management of the host computer’s memory and storage resources.

  • Containers are built from ‘recipes’ that define the required set of software components and the instructions necessary to build/install them within a container image.

  • Docker is just one software platform that can create containers and the resources they use.

Introducing the Docker Command Line
  • A toolbar icon indicates that Docker is ready to use (on Windows and macOS).

  • You will typically interact with Docker using the command line.

  • To learn how to run a certain Docker command, we can type the command followed by the --help flag.
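
As a minimal sketch of the last point, the function below asks Docker for help at two levels. The function name show_docker_help is made up for this illustration; the two docker commands inside it are the part that matters.

```shell
# Hypothetical helper: wraps two help requests so they can be run together.
show_docker_help() {
    docker --help              # top-level help: lists the available commands
    docker container --help    # help for one command and its subcommands
}
```

Calling show_docker_help in a terminal where Docker is installed prints both help pages one after the other.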

Exploring and Running Containers
  • The docker pull command downloads Docker images from the internet.

  • The docker image ls command lists Docker images that are (now) on your computer.

  • The docker run command creates running containers from images and can run commands inside them.

  • When using the docker run command, a container can run a default action (if it has one), a user-specified action, or a shell to be used interactively.
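
The commands summarized above might be combined as in the sketch below. The function names are hypothetical, and the hello-world and alpine images are used only as examples of public images on Docker Hub.

```shell
# Hypothetical helper: download an example image, then list local images.
fetch_and_list() {
    docker pull alpine     # download the alpine image from Docker Hub
    docker image ls        # list the images now stored on this computer
}

# Hypothetical helper: the three ways of running a container named above.
run_examples() {
    docker run hello-world                  # default action defined by the image
    docker run alpine cat /etc/os-release   # user-specified command
    docker run -it alpine sh                # interactive shell; type exit to leave
}
```

Each call to docker run creates a new container, so running run_examples leaves three stopped containers behind.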

Finding Containers on Docker Hub
  • The Docker Hub is an online repository of container images.

  • Many Docker Hub images are public, and may be officially endorsed.

  • Each Docker Hub page about an image provides structured information under subheadings.

  • Most Docker Hub pages about images contain sections that provide examples of how to use those images.

  • Many Docker Hub images have multiple versions, indicated by tags.

  • The naming convention for Docker container images is: OWNER/CONTAINER:TAG
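
To make the naming convention concrete, here is a small sketch that splits a name into its three parts using shell parameter expansion. The function name and the example name alice/python-app:1.0 are hypothetical. The sketch assumes the OWNER part is present and does not handle names that include a registry host; when the tag is omitted it falls back to latest, which is the tag Docker itself assumes.

```shell
# Hypothetical helper: split OWNER/CONTAINER:TAG into its parts.
split_image_name() {
    ref=$1
    owner=${ref%%/*}       # text before the first "/"
    rest=${ref#*/}         # text after the first "/"
    case $rest in
        *:*) name=${rest%%:*}; tag=${rest#*:} ;;   # explicit tag given
        *)   name=$rest; tag=latest ;;             # no tag: Docker uses "latest"
    esac
    echo "owner=$owner name=$name tag=$tag"
}

split_image_name "alice/python-app:1.0"
```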

Cleaning Up Containers
  • docker container has subcommands used to interact with and manage containers.

  • docker image has subcommands used to interact with and manage images.

  • docker ps can provide information on currently running containers.
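
A typical clean-up might look like the sketch below. The function names are hypothetical, and the container and image identifiers are passed in as arguments because they differ on every machine.

```shell
# Hypothetical helper: show what currently exists on this computer.
list_everything() {
    docker ps                    # running containers only
    docker container ls --all    # running and stopped containers
    docker image ls              # images stored locally
}

# Hypothetical helper: remove one stopped container ($1) and one image ($2),
# each given by ID or name.
remove_container_and_image() {
    docker container rm "$1"
    docker image rm "$2"
}
```

An image cannot normally be removed while a container created from it still exists, which is why the container is removed first.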

Creating Your Own Container Images
  • Dockerfiles specify what is within Docker images.

  • The docker build command is used to build an image from a Dockerfile.

  • You can share your Docker images through the Docker Hub so that others can create Docker containers from your images.
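
The three points above can be sketched end to end as follows. The function names, the Dockerfile contents and the image name used in the usage note are illustrative assumptions rather than part of the lesson. Pushing also assumes you have a Docker Hub account and have already run docker login.

```shell
# Hypothetical helper: write a small example Dockerfile into directory $1.
write_dockerfile() {
    cat > "$1/Dockerfile" <<'EOF'
FROM alpine
RUN apk add --update python3
CMD ["python3", "--version"]
EOF
}

# Hypothetical helper: build image $1 from the Dockerfile in directory $2,
# then upload it to Docker Hub.
build_and_share() {
    docker build -t "$1" "$2"
    docker push "$1"
}
```

For example, write_dockerfile . followed by build_and_share alice/alpine-python . would build and publish an image under the (made-up) account alice.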

Creating More Complex Container Images
  • Docker allows containers to read files from, and write files to, the Docker host.

  • You can include files from your Docker host in your Docker images by using the COPY instruction in your Dockerfile.
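
Both mechanisms are sketched below, under a few assumptions: the function names, the script name sum.py and the mount point /temp are all made up for illustration. The first function shares a host directory with a running container through a bind mount; the second writes a Dockerfile that copies a host file into the image at build time.

```shell
# Hypothetical helper: mount the current host directory at /temp inside
# the container and list its contents from within the container.
run_with_mount() {
    docker run --mount type=bind,source="$PWD",target=/temp alpine ls /temp
}

# Hypothetical helper: write a Dockerfile into directory $1 that copies
# sum.py (expected next to the Dockerfile) into the image.
write_copy_dockerfile() {
    cat > "$1/Dockerfile" <<'EOF'
FROM alpine
COPY sum.py /home/sum.py
EOF
}
```

A bind mount gives the container live access to host files, while COPY places a fixed copy of the file inside the image.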

Examples of Using Container Images in Practice
  • There are many ways you might use Docker and existing container images in your research project.

Containers in Research Workflows: Reproducibility and Granularity
  • Container images allow us to encapsulate the computation (and data) we have used in our research.

  • Using a service such as Docker Hub allows us to easily share computational work we have done.

  • Using container images along with a DOI service such as Zenodo allows us to capture our work and enables reproducibility.

  • Tools such as Docker Compose, Docker Swarm and Kubernetes allow us to describe how multiple containers work together.
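
One possible way to prepare a container image for deposit with a DOI service is to export it to a single archive file, as in the sketch below. The function name, image name and file name are hypothetical, and uploading the resulting file to a service such as Zenodo is a separate manual step that is not shown.

```shell
# Hypothetical helper: export image $1 to the tar archive $2.
archive_image() {
    docker image save -o "$2" "$1"
}
```

For example, archive_image alice/alpine-python:v1 alpine-python.tar would write the image to alpine-python.tar, which can later be restored with docker image load.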
