Reproducible Computational Environments Using Containers: Introduction to Docker and Singularity (day 1)

This course introduces containers with the goal of using them to create reproducible computational environments. Such environments are useful for ensuring reproducible research outputs and for simplifying the setup of complex software dependencies across different systems. The course is mostly based on Docker containers, but the material will be useful whichever container technology you plan to use or end up using. We also introduce the Singularity container environment, which is compatible with Docker and designed for use on multi-user systems (such as HPC resources).

After completing this session you should:

  • Have an understanding of what Docker and Singularity containers are, why they are useful and the common terminology used
  • Have a working Docker installation on your local system to allow you to use containers
  • Understand how to use existing Docker and Singularity containers for common tasks
  • Be able to build your own Docker containers by understanding both the role of a Dockerfile in building containers, and the syntax used in Dockerfiles
  • Be able to build your own Singularity containers from Singularity definition files and understand the syntax used in definition files
  • Understand how to manage Docker containers on your local system and Singularity containers on a remote HPC system
  • Appreciate issues around reproducibility in software, understand how containers can address some of these issues and what the limits to reproducibility using containers are
  • See how Singularity containers can work even when running parallel MPI programs on an HPC system

The practical work in this lesson is primarily aimed at using Docker on your own laptop, building Singularity containers on your own laptop (using a Docker container!) and using Singularity containers on a remote high performance computing (HPC) system. Beyond your laptop, software container technologies such as Docker can also be used in the cloud and on HPC systems. Some of the material in this lesson will be applicable to those environments too.

This site covers day 1 of the course (focussed on Docker). The material for day 2 (focussed on Singularity) can be found at: Introduction to Singularity

General Information

Where: This course will be taught online via Blackboard Collaborate. All attendees will be sent the joining link prior to the event.

When: 10:00-16:00 BST, 28 - 29 July 2021.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below). They are also required to abide by the ARCHER2 Code of Conduct.

Accessibility: We are committed to making this workshop accessible to everybody.

Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can do anything else to make learning easier for you, please get in touch (using the contact details below) and we will attempt to provide it.

Contact: Please email support@archer2.ac.uk for more information.


Prerequisites

  • You should have basic familiarity with using a command shell, and the lesson text will at times request that you “open a shell window”, with an assumption that you know what this means.
    • Under Linux or macOS it is assumed that you will access a bash shell (usually the default), using your Terminal application.
    • Under Windows, PowerShell and Git Bash should allow you to use the Unix instructions.
  • As an item of setup, it is assumed that you have a directory named container-playground that you are able to cd to using your command shell, and are also able to find using your computer’s graphical file browser (e.g., Finder on macOS or Windows Explorer). A simple way to achieve this is to create your container-playground directory within your computer’s Desktop folder. (See the Software Carpentry Shell lesson for more details.)
  • The lessons will sometimes request that you use a text editor to create or edit files in particular directories. It is assumed that you either have an editor that you know how to use that runs within the working directory of your shell window (e.g. nano), or that if you use a graphical editor, that you can use it to read and write files into the working directory of your shell.
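
The directory setup described above can be sketched as follows. This is a minimal example, assuming a bash shell (Terminal on Linux or macOS, Git Bash on Windows) and assuming your Desktop folder is at $HOME/Desktop, which may not hold on every system; adjust the path if yours differs.

```shell
# Assumption: the Desktop folder is located directly inside your home directory.
playground="$HOME/Desktop/container-playground"

# -p creates any missing parent directories and does not fail
# if the directory already exists.
mkdir -p "$playground"

# Move into the directory and print its full path to confirm where you are.
cd "$playground"
pwd
```

After running this, the container-playground folder should also be visible on your Desktop in your graphical file browser.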

Getting Started

To get started, follow the directions on the Setup page to ensure you have installed the Docker software, have registered for a Docker Hub account, have an SSH client available and have registered a user account on ARCHER2 (the HPC facility we will be using for the Singularity part of the course).
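
As a quick sanity check before the course, something like the sketch below can report whether the docker and ssh commands are available in your shell. The check_tool helper is a hypothetical name introduced here for illustration, and it assumes a bash-compatible shell. It only confirms that the commands are on your PATH; it does not tell you whether Docker is running or whether your Docker Hub and ARCHER2 accounts are set up.

```shell
# check_tool is a hypothetical helper, not part of Docker or the course material.
# It reports whether a command can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

check_tool docker
check_tool ssh
```

If docker is reported as found, running `docker --version` should print the installed version. The Setup page remains the authoritative guide if anything is reported as missing.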


Collaborative Document

During the course, we will make use of a collaborative document known as an Etherpad. You can find the document at:

Schedule

Setup: Download files required for the lesson

10:00  1. Welcome
         What can I expect from this course?
         How will the course work and how will I get help?
         How can I give feedback to improve the course?
10:15  2. Introducing Containers
         What are containers, and why might they be useful to me?
10:35  3. Introducing the Docker Command Line
         How do I know Docker is installed and running?
         How do I interact with Docker?
10:45  4. Exploring and Running Containers
         How do I interact with a Docker container on my computer?
11:15  5. Break
11:30  6. Finding Containers on Docker Hub
         What is the Docker Hub, and why is it useful?
11:50  7. Cleaning Up Containers
         How do I interact with a Docker container on my computer?
         How do I manage my containers and images?
12:00  8. Lunch break
13:30  9. Creating Your Own Container Images
         How can I make my own Docker images?
         How do I document the ‘recipe’ for a Docker image?
14:05  10. Creating More Complex Container Images
         How can I make more complex container images?
15:05  11. Break
15:20  12. Examples of Using Container Images in Practice
         How can I use Docker for my own work?
15:40  13. Containers in Research Workflows: Reproducibility and Granularity
         How can I use container images to make my research more reproducible?
         How do I incorporate containers into my research workflow?
         What are container orchestration tools and how can they potentially help me?
16:00  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.