Creating More Complex Container Images
Overview
Teaching: 30 min
Exercises: 30 min
Questions
How can I make more complex container images?
Objectives
Explain how you can include files within Docker images when you build them.
Explain how you can access files on the Docker host from your Docker containers.
In order to create and use your own containers, you may need more information than our previous example. You may want to use files from outside the container, copy those files into the container, and just generally learn a little bit about software installation. This episode will cover these. Note that the examples will get gradually more and more complex – most day-to-day use of containers can be accomplished using the first 1–2 sections on this page.
Using scripts and files from outside the container
In your shell, change to the sum folder in the docker-intro folder and look at
the files inside.
$ cd ~/Desktop/docker-intro/sum
$ ls
This folder has both a Dockerfile and a Python script called sum.py. Let’s say
we wanted to try running the script using our recently created alpine-python
container.
Running containers
What command would we use to run Python from the
alpine-python container?
If we try running the container and Python script, what happens?
$ docker run alice/alpine-python python3 sum.py
python3: can't open file 'sum.py': [Errno 2] No such file or directory
No such file or directory
What does the error message mean? Why might the Python inside the container not be able to find or open our script?
Solution
The problem here is that the container and its filesystem are separate from our host computer’s filesystem. When the container runs, it can’t see anything outside itself, including any of the files on our computer.
In order to use Python (inside the container) and our script (outside the container, on our computer), we need to create a link between the directory on our computer and the container.
This link is called a “mount” and is what happens automatically when a USB drive or other external hard drive gets connected to a computer – you can see the contents appear as if they were on your computer.
We can create a mount between our computer and the running container by using an additional
option to docker run. We’ll also use the variable ${PWD} which will substitute
in our current working directory. The option will look like this
-v ${PWD}:/temp
What this means is – link my current directory with the container, and inside the
container, name the directory /temp
Let’s try running the command now:
$ docker run -v ${PWD}:/temp alice/alpine-python python3 sum.py
But we get the same error!
python3: can't open file 'sum.py': [Errno 2] No such file or directory
This final piece is a bit tricky – we really have to remember to put ourselves
inside the container. Where is the sum.py file? It’s in the directory that’s been
mapped to /temp – so we need to include that in the path to the script. This
command should give us what we need:
$ docker run -v ${PWD}:/temp alice/alpine-python python3 /temp/sum.py
Note that if we create any files in the /temp directory while the container is
running, these files will appear on our host filesystem in the original directory
and will stay there even when the container stops.
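The path translation behind the -v option can be sketched in a few lines of Python. This is only an illustration of the idea, not something Docker provides: the helper name container_path is made up for this example, and the host directory stands in for whatever ${PWD} expands to on your machine.

```python
from pathlib import PurePosixPath


def container_path(host_file, host_dir, mount_point):
    """Translate a host path under host_dir into its path inside the container."""
    # relative_to raises ValueError if host_file is not inside host_dir,
    # which mirrors the fact that unmounted files are invisible to the container.
    relative = PurePosixPath(host_file).relative_to(PurePosixPath(host_dir))
    return str(PurePosixPath(mount_point) / relative)


# Stand-in for the value of ${PWD} on the host (an assumption for this sketch).
host_dir = "/Users/yourname/Desktop/docker-intro/sum"

# With -v ${PWD}:/temp, the script on the host is seen at this path in the container.
print(container_path(host_dir + "/sum.py", host_dir, "/temp"))
```

The printed path is the one that has to be given to python3 inside the container, which is why the bare name sum.py failed.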
Other Commonly Used Docker Run Flags
The docker run command has many other useful flags that alter its behavior. Two that are commonly used are -w and -u.
The --workdir/-w flag sets the working directory, i.e. the command is run inside the directory specified. For example, the following command would run the pwd command in a container started from the latest ubuntu image, in the /home/alice directory, and print /home/alice. If the directory doesn’t exist in the image, Docker will create it.
$ docker run -w /home/alice/ -i -t ubuntu pwd
The --user/-u flag lets you specify the user you would like to run the container as. This is helpful if you’d like to write files to a mounted folder and not write them as root but rather as your own user identity and group. A common example of the -u flag is --user $(id -u):$(id -g), which fetches the current user’s ID and group ID and runs the container as that user.
Exercise: Explore the script
What happens if you use the docker run command above and put numbers after the script name?
Solution
This script comes from the Python Wiki and is set to add all numbers that are passed to it as arguments.
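The contents of sum.py are not reproduced on this page. As a rough idea of what such a script might look like, here is a minimal sketch that matches the behavior described in this episode (printing sum = 0 when no arguments are given, and asking for integer arguments otherwise). The function name sum_args is an assumption for this example and the real script may be written differently.

```python
import sys


def sum_args(args):
    """Return the message the script would print for the given argument strings."""
    try:
        total = sum(int(arg) for arg in args)
    except ValueError:
        # At least one argument could not be read as an integer.
        return "Please supply integer arguments"
    return f"sum = {total}"


if __name__ == "__main__":
    # sys.argv[0] is the script name; everything after it is an argument.
    print(sum_args(sys.argv[1:]))
```

Saved as sum.py, this would be run as python3 sum.py 1 2 3, either directly or through the docker run command above.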
Exercise: Checking the options
Our Docker command has gotten much longer! Can you go through each piece of the Docker command above and explain what it does? How would you characterize the key components of a Docker command?
Solution
Here’s a breakdown of each piece of the command above
docker run: use Docker to run a container
-v ${PWD}:/temp: connect my current working directory (${PWD}) as a folder inside the container called /temp
alice/alpine-python: name of the container image to run
python3 /temp/sum.py: what commands to run in the container
More generally, every Docker command will have the form:
docker [action] [docker options] [docker image] [command to run inside]
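To make the ordering of these pieces concrete, the sketch below assembles the same command as a list of arguments in Python. The helper docker_run_command is a made-up name for this illustration and is not part of Docker; the script only prints the command and does not run it.

```python
import os
import shlex


def docker_run_command(image, command, options=()):
    """Assemble: docker [action] [docker options] [docker image] [command to run inside]."""
    return ["docker", "run", *options, image, *command]


argv = docker_run_command(
    image="alice/alpine-python",
    command=["python3", "/temp/sum.py"],
    # os.getcwd() plays the role of ${PWD} in the shell version.
    options=["-v", f"{os.getcwd()}:/temp"],
)

# Print the command as it would be typed in a shell.
print(shlex.join(argv))
```

Options always come before the image name; anything after the image name is treated as the command to run inside the container. Assuming Docker is installed, the list could be handed to subprocess.run(argv) to execute it.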
Exercise: Interactive jobs
Try using the directory mount option but run the container interactively. Can you find the folder that’s connected to your computer? What’s inside?
Solution
The docker command to run the container interactively is:
$ docker run -v ${PWD}:/temp -it alice/alpine-python sh
Once inside, you should be able to navigate to the /temp folder and see that its contents are the same as the files on your computer:
/# cd /temp
/# ls
Mounting a folder can be very useful when you want to run the software inside your container on many different input files. In other situations, you may want to save or archive an authoritative version of your data by adding it to the container permanently. That’s what we will cover next.
Including your scripts and data within a container image
Our next project will be to add our own files to a container – something you
might want to do if you’re sharing a finished analysis or just want to have
an archived copy of your entire analysis including the data. Let’s assume that we’ve finished with our sum.py
script and want to add it to the container itself.
In your shell, you should still be in the sum folder in the docker-intro folder.
$ pwd
/Users/yourname/Desktop/docker-intro/sum
Let’s add a new line to the Dockerfile we’ve been using so far to create a copy of sum.py.
We can do so by using the COPY keyword.
COPY sum.py /home
This line will cause Docker to copy the file from your computer into the container’s filesystem. Let’s build the container like before, but give it a different name:
$ docker build -t alice/alpine-sum .
Exercise: Did it work?
Can you remember how to run a container interactively? Try that with this one. Once inside, try running the Python script.
Solution
You can start the container interactively like so:
$ docker run -it alice/alpine-sum sh
You should be able to run the python command inside the container like this:
/# python3 /home/sum.py
This COPY keyword can be used to place your own scripts or own data into a container
that you want to publish or use as a record. Note that it’s not necessarily a good idea
to put your scripts inside the container if you’re constantly changing or editing them.
In that case, referencing the scripts from outside the container is a better idea, as we
did in the previous section. You also want to think carefully about size – if you
run docker image ls you’ll see the size of each image all the way on the right of
the screen. The bigger your image becomes, the harder it will be to download.
Copying alternatives
Another trick for getting your own files into a container is by using the RUN keyword and downloading the files from the internet. For example, if your code is in a GitHub repository, you could include this statement in your Dockerfile to download the latest version every time you build the container:
RUN git clone https://github.com/alice/mycode
Similarly, the wget command can be used to download any file publicly available on the internet:
RUN wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.10.0/ncbi-blast-2.10.0+-x64-linux.tar.gz
Note that the above RUN examples depend on commands (git and wget respectively) that must be available within your container: Linux distributions such as Alpine may require you to install such commands before using them within RUN statements.
More fancy Dockerfile options (optional, for presentation or as exercises)
We can expand on the example above to make our container even more “automatic”. Here are some ideas:
Make the sum.py script run automatically
FROM alpine
COPY sum.py /home
RUN apk add --update python3 py3-pip python3-dev
# Run the sum.py script as the default command
CMD ["python3", "/home/sum.py"]
Build and test it:
$ docker build -t alice/alpine-sum:v1 .
$ docker run alice/alpine-sum:v1
You’ll notice that you can run the container without arguments just fine,
resulting in sum = 0, but this is boring. Supplying arguments however
doesn’t work:
$ docker run alice/alpine-sum:v1 10 11 12
results in
docker: Error response from daemon: OCI runtime create failed:
container_linux.go:349: starting container process caused "exec:
\"10\": executable file not found in $PATH": unknown.
This is because the arguments 10 11 12 are interpreted as a
command that replaces the default command given by CMD
["python3", "/home/sum.py"] in the image.
To achieve the goal of having a command that always runs when the
container is run and can be passed the arguments given on the
command line, use the keyword ENTRYPOINT in the Dockerfile.
FROM alpine
COPY sum.py /home
RUN apk add --update python3 py3-pip python3-dev
# Run the sum.py script as the default command and
# allow people to enter arguments for it
ENTRYPOINT ["python3", "/home/sum.py"]
# Give default arguments, in case none are supplied on
# the command-line
CMD ["10", "11"]
Build and test it:
$ docker build -t alice/alpine-sum:v2 .
# Most of the time you are interested in the sum of 10 and 11:
$ docker run alice/alpine-sum:v2
# Sometimes you have more challenging calculations to do:
$ docker run alice/alpine-sum:v2 12 13 14
Overriding the ENTRYPOINT
Sometimes you don’t want to run the image’s ENTRYPOINT. For example, if you have a specialized image that does only sums, but you need an interactive shell to examine the container:
$ docker run -it alice/alpine-sum:v2 /bin/sh
will yield
Please supply integer arguments
You need to override the ENTRYPOINT statement in the image like so:
$ docker run -it --entrypoint /bin/sh alice/alpine-sum:v2
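The way CMD, ENTRYPOINT and command-line arguments combine can be summarized with a small Python model. This is a simplified sketch that covers only the list ("exec") form used in the Dockerfiles above; the function container_argv is invented for this illustration and is not Docker's own code.

```python
def container_argv(entrypoint, cmd, run_args=(), entrypoint_override=None):
    """Model which command a container starts, given the image settings and docker run input."""
    if entrypoint_override is not None:
        # --entrypoint replaces the image's ENTRYPOINT and discards its default CMD.
        entrypoint = [entrypoint_override]
        cmd = []
    # Arguments given after the image name replace CMD; otherwise CMD is used.
    arguments = list(run_args) if run_args else list(cmd)
    return list(entrypoint) + arguments


# The v1 image: only CMD is set, so extra arguments replace the whole command.
v1 = {"entrypoint": [], "cmd": ["python3", "/home/sum.py"]}
print(container_argv(**v1))                              # ['python3', '/home/sum.py']
print(container_argv(**v1, run_args=["10", "11", "12"]))  # ['10', '11', '12']

# The v2 image: ENTRYPOINT always runs, CMD only supplies default arguments.
v2 = {"entrypoint": ["python3", "/home/sum.py"], "cmd": ["10", "11"]}
print(container_argv(**v2))                               # ['python3', '/home/sum.py', '10', '11']
print(container_argv(**v2, run_args=["/bin/sh"]))         # ['python3', '/home/sum.py', '/bin/sh']
print(container_argv(**v2, entrypoint_override="/bin/sh"))  # ['/bin/sh']
```

The second line of output shows why the v1 image failed with "10": executable file not found, and the fourth shows why /bin/sh ended up as an argument to sum.py instead of starting a shell.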
Add the sum.py script to the PATH so you can run it directly:
FROM alpine
COPY sum.py /home
# set script permissions
RUN chmod +x /home/sum.py
# add /home folder to the PATH
ENV PATH="/home:$PATH"
RUN apk add --update python3 py3-pip python3-dev
Build and test it:
$ docker build -t alice/alpine-sum:v3 .
$ docker run alice/alpine-sum:v3 sum.py 1 2 3 4
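Both the chmod +x step and the PATH entry are needed for sum.py to be found by name. The Python sketch below imitates that lookup on a POSIX system using a temporary folder in place of /home; the function name lookup_demo and the placeholder script contents are assumptions for this example. Running the script directly also assumes that sum.py begins with a shebang line such as #!/usr/bin/env python3, which this page does not show.

```python
import os
import shutil
import tempfile


def lookup_demo():
    """Check whether sum.py is found on a search path before and after chmod +x."""
    with tempfile.TemporaryDirectory() as home:
        script = os.path.join(home, "sum.py")
        with open(script, "w") as handle:
            # Placeholder contents; the shebang tells the system which interpreter to use.
            handle.write("#!/usr/bin/env python3\nprint('sum = 0')\n")

        os.chmod(script, 0o644)  # readable, but not executable
        before = shutil.which("sum.py", path=home)

        os.chmod(script, 0o755)  # the equivalent of chmod +x
        after = shutil.which("sum.py", path=home)

        return before is not None, after is not None


found_before, found_after = lookup_demo()
print("found before chmod +x:", found_before)
print("found after chmod +x:", found_after)
```

Here shutil.which plays the role of the PATH search: the file is only picked up once it sits in a searched folder and is marked executable.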
Key Points
Docker allows containers to read and write files from the Docker host.
You can include files from your Docker host into your Docker images by using the
COPY instruction in your Dockerfile.