Cluster, Part 10: Dockerisification | Writing Dockerfiles
Hey there - welcome to 2021! I'm back with another cluster post. In double digits too! I think this is the longest series yet on my blog. Before we start, here's a list of all the posts in the series so far:
- Cluster, Part 1: Answers only lead to more questions
- Cluster, Part 2: Grand Designs
- Cluster, Part 3: Laying groundwork with Unbound as a DNS server
- Cluster, Part 4: Weaving Wormholes | Peer-to-Peer VPN with WireGuard
- Cluster, Part 5: Staying current | Automating apt updates and using apt-cacher-ng
- Cluster, Part 6: Superglue Service Discovery | Setting up Consul
- Cluster, Part 7: Wrangling... boxes? | Expanding the Hashicorp stack with Docker and Nomad
- Cluster, Part 8: The Shoulders of Giants | NFS, Nomad, Docker Registry
- Cluster, Part 9: The Border Between | Load Balancing with Fabio
We've got a pretty cool setup going so far! With Nomad for task scheduling (part 7), Consul to keep track of what's running where (part 6), and wesher keeping communications secured (part 4, although defence in depth says that we'll be returning later to shore up some stuff here), we have a solid starting point to work from. And it's only taken 9 blog posts to get to this point :P
In this post, we'll be putting all our hard work to use by looking at the basics of writing Dockerfiles. It's taken me quite a while to get my head around them, so I want to take a moment here to document some of the things I've learnt. A few other things I want to talk about soon are Hashicorp Vault (it's still giving me major headaches trying to understand the Nomad integration though, so this may be a while), obtaining TLS certificates, and tying in with the own your code series by showing off the Docker image management script setup I've worked into my Laminar CI instance, which makes it easy to rebuild images and all their dependents.
Anyway, Dockerfiles. First question: what? A Dockerfile is essentially a file written in a domain-specific language that defines how a Docker image should be built. They are usually named Dockerfile. Note that here I use the term image and not container:
- Image: A packaged-up bundle of files and directories that can be run
- Container: A copy of an image that is currently running on a host system.
In short: A container is a running image, and a Docker image is the bit that a container spins up from.
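To make the distinction concrete, here are a few everyday Docker commands (alpine here is just an arbitrary public image, picked purely for illustration):
docker image ls                  # lists the images present on this host
docker container ls              # lists the containers currently running
docker run -it --rm alpine sh    # spins up a brand-new container from the alpine image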
Second question: why? There are a few different reasons. Although Docker adds another layer of indirection and complication, it also allows us to square applications away such that we don't have to care (too much) about which host they run on.
A great example here would be a static file web server. In our case, this is particularly useful because Fabio - as far as I know - isn't actually capable of serving files from disk. Personally, I have a fork of a rather nice dashboard I'd like to have running for my cluster too, so it's a perfect fit for testing the waters.
Next question: how? Well, let's break the process down:
- Install Node.js
- Install the serve npm package
Thankfully, I've recently packaged Node.js in my apt repository (finally! It's only taken me multiple years.....). Since we might want to build lots of different Node.js based container images, it makes sense to give Node.js its own separate image. I'm also using my apt repository in other container images which don't necessarily need Node.js, so I've opted to put my apt repository setup into my base image (If I haven't mentioned it already, I'm using minideb as my base image - which I build with a patch to make it support Raspbian, which is now called Raspberry Pi OS. It's confusing).
To better explain the plan, let's use a diagram:

(Above: A diagram I created. Link to editing file - don't forget this blog is licenced under CC-BY-SA.)
Docker images are always based on another Docker image. The node-serve Docker image we intend to create will be based on a minideb-node Docker image (which we'll also be creating), which itself will be based on the minideb base image. Base images are special, as they don't have a parent image. They are usually imported via a .tar.gz image, for example, but that's a story for another time (as are images based on scratch, a special image that's completely empty).
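As a quick aside, a Dockerfile based on scratch might look something like the following sketch. Note that my-static-binary here is a hypothetical statically-linked executable, purely for illustration:
FROM scratch
COPY ./my-static-binary /my-static-binary
ENTRYPOINT [ "/my-static-binary" ]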
We'll then push the final node-serve Docker image to a Docker registry. I'm running my own private Docker registry, but you could use the Docker Hub or set up your own private Docker registry instead.
With this in mind, let's start with a Docker image for Node.js:
ARG REPO_LOCATION
FROM ${REPO_LOCATION}minideb
RUN install_packages libatomic1 nodejs-sbrl
Let's talk about each of the above commands in turn:
- ARG REPO_LOCATION: This brings in an argument which is specified at build time. Here, we want to allow the user to specify the location of a private Docker registry from which to pull the base (or parent) image to begin the build process with.
- FROM ${REPO_LOCATION}minideb: This specifies the base (or parent) image to start the build with.
- RUN install_packages libatomic1 nodejs-sbrl: The RUN command runs the specified command inside the Docker container, saving a new layer in the process (more on those later). In this case, we call the install_packages command, which is a helper script provided by minideb to make package installation easier.
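Under the hood, install_packages is essentially a convenience wrapper around apt-get. A simplified sketch of what it does with the above arguments (not the exact script minideb ships) would be something like this:
apt-get update
apt-get install -y --no-install-recommends libatomic1 nodejs-sbrl
apt-get clean
rm -rf /var/lib/apt/lists/*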
Pretty simple! This assumes that the minideb base image you're using has my apt repository set up, which may not be the case. To this end, we'd like to set that up automatically. To do this, we'll need to use an intermediate image. This took me some time to get my head around, so if you're unsure about anything, please comment below.
Let's expand on our earlier attempt at a Dockerfile:
ARG REPO_LOCATION
FROM ${REPO_LOCATION}minideb AS builder
RUN install_packages curl ca-certificates
RUN curl -o /srv/sbrl.asc https://apt.starbeamrainbowlabs.com/aptosaurus.asc
FROM ${REPO_LOCATION}minideb
COPY --from=builder /srv/sbrl.asc /etc/apt/trusted.gpg.d/sbrl-aptosaurus.asc
RUN echo "deb https://apt.starbeamrainbowlabs.com/ /" > /etc/apt/sources.list.d/sbrl.list && \
install_packages libatomic1 nodejs-sbrl;
This one is more complicated, so let's break it down. Here, we have an intermediate Docker image (which we name builder via the AS builder bit at the end of the first FROM), in which we install curl (the first RUN command there) and use it to download my apt repository's signing key (the second RUN command). This is followed by a second image, into which we copy the file we downloaded in the first Docker image and place it in a specific location in the second (the COPY directive).
Docker always reads Dockerfiles from top to bottom and executes them in sequence, so it will assume that the last image created is the final one - i.e. the one from the last FROM directive. Every FROM directive starts afresh from a brand-new copy of the specified parent image.
We've also expanded the RUN directive at the end of the file to echo out the apt sources list file for my apt repository. We've done it like this in a single RUN command rather than 2, because every time you add another directive to a Dockerfile (except ARG and FROM), it creates a new layer in the resulting Docker image. Minimising the number of layers in a Docker image is important for performance, hence the slight obscurity of chaining commands together here.
To build our new Dockerfile, save it to a new empty directory. Then, execute this:
cd path/to/directory/containing_the_dockerfile;
docker build --pull --tag "minideb-node" .
If you're using a private registry, add --build-arg "REPO_LOCATION=registry.example.com:5000/" just before the . at the end of the command, and prefix the tag with registry.example.com:5000/. If you're developing a new Docker image and having trouble with the cache (Docker caches the results of directives when building images), add --no-cache too.
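Put together, a build that pulls from and tags for a private registry might look something like this (with registry.example.com:5000 standing in for your own registry's address):
docker build --pull \
    --build-arg "REPO_LOCATION=registry.example.com:5000/" \
    --tag "registry.example.com:5000/minideb-node" .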
Then, push it to the Docker registry like so:
docker push "minideb-node"
Again, prefix minideb-node there with registry.example.com:5000/ should you be using a private Docker registry.
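For example, pushing to a private registry might look like this - the docker tag step is only needed if you didn't prefix the tag at build time:
docker tag minideb-node registry.example.com:5000/minideb-node
docker push registry.example.com:5000/minideb-node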
Now, you should be able to start an interactive session inside your new Docker container:
docker run -it --rm minideb-node
As before, prefix minideb-node there with registry.example.com:5000/ if you're using a private Docker registry.
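Once inside, a quick sanity check that Node.js actually made it into the image might look like this (type exit or press Ctrl+D to leave the session):
node --version
exit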
Now that we've got our Docker image for Node.js, we can write another Dockerfile for serve, our static file HTTP server. Let's take a look:
ARG REPO_LOCATION
FROM ${REPO_LOCATION}minideb-node
RUN npm install --global serve && rm -rf "$(npm get cache)";
VOLUME [ "/srv" ]
USER 80:80
ENV NODE_ENV production
WORKDIR /srv
ENTRYPOINT [ "serve", "-l", "5000" ]
This looks similar to the previous Dockerfile, but with a few extra bits added on. Firstly, we use a RUN directive to install the serve npm package and delete the npm cache in a single command (since we don't want the npm cache sticking around in the final Docker image).
We then use a VOLUME declaration to tell Docker that we expect a volume to be mounted to the /srv directory. A volume here is a directory from the host system that will be mounted into the Docker container before it starts running. In this case, it's the web root that we'll be serving files from.
A USER directive tells Docker what user and group IDs we want to run all subsequent commands as. This is important, as it's a bad idea to run Docker containers as root.
The ENV directive there is just to tell Node.js that it should run in production mode. Some Node.js applications enable extra optimisations when this environment variable is set.
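As an aside, anything set with ENV can be overridden when the container is started via docker run's -e flag - for example, to run in development mode instead (you'd normally also mount the /srv volume, as shown later on):
docker run -it --rm -e NODE_ENV=development node-serve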
The WORKDIR directive defines the current working directory for future commands. It functions like the cd command in your terminal or command line. In this case, the serve npm package always serves from the current working directory - hence we set the working directory here.
Finally, the ENTRYPOINT directive tells Docker what command to execute by default. The ENTRYPOINT can get quite involved and complex, but we're keeping it simple here and telling it to execute the serve command (provided by the serve npm package, which we installed globally earlier in the Dockerfile). We also specify the port number we want serve to listen on with -l 5000 there.
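A neat side effect of the exec-form ENTRYPOINT used here is that any extra arguments given to docker run after the image name get appended to it. For instance, the following (assuming serve understands a --help flag) would end up executing serve -l 5000 --help inside the container:
docker run -it --rm node-serve --help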
That completes the Dockerfile for the serve npm package. Build it as before, and then you should be able to run it like so:
docker run -it --rm -v /absolute/path/to/local_dir:/srv node-serve
As before, prefix node-serve with the address of your private Docker registry if you're using one. The -v bit above defines the Docker volume that mounts the web root directory inside the Docker container.
Then, you should be able to find the IP address of the Docker container and enter it into your web browser to connect to the running server! The URL should look something like this: http://IP_ADDRESS_HERE:5000/.
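To find that IP address, docker inspect can pull it out of the container's metadata. Something along these lines should do the trick, where CONTAINER_NAME_OR_ID is the name or ID reported by docker container ls:
docker container ls
docker inspect -f '{{ range .NetworkSettings.Networks }}{{ .IPAddress }}{{ end }}' CONTAINER_NAME_OR_ID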
If you're not running Docker on the same machine as your web browser, then you'll need to do some fancy footwork to get it to display. It's at this point that I write a Nomad job file and wire it up to Fabio, my load balancer.
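Alternatively, for a quick test, you could publish the port onto the Docker host itself with -p, which makes the server reachable at the host's own IP address - something like this:
docker run -it --rm -p 5000:5000 -v /absolute/path/to/local_dir:/srv node-serve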
In the next post, we'll talk more about Fabio. We'll also look at the networking and architecture that glues the whole system together. Finally, we'll look at setting up HTTPS with Let's Encrypt and the DNS-01 challenge (which I found relatively simple - but only once I'd managed to install a new enough version of certbot, which was a huge pain!).