Docker

Docker is a containerization platform that lets us define our apps' setup in a reproducible configuration, which we can then replicate across environments.

Make sure you've read the docs on getting docker up and running on your computer, as well as the docker section for WSL if you're using the Windows Subsystem for Linux.

Basic concepts

A Docker image is essentially a template or recipe, where you define (in a Dockerfile) exactly what needs to be there for an app or service to work. Containers are instances of said app or service, created from an image. You can create multiple containers from the same image. For example, if you have multiple coko apps installed, you will have multiple postgres containers, all created from the same postgres image.

Images are published on dockerhub. For the most common software (eg. node or postgres), you will find a lot of different images available. You generally need to decide (a) which major version you want, and (b) whether you want a more compact variant of the standard image. It is common for images to be based on a full-blown linux server distribution like Debian or Ubuntu, but to also come in variants based on eg. Alpine that can be dramatically smaller in size. Aim for small images in production (alpine- or debian-slim-based are usually good choices) unless you have a good reason not to. In development, you can use either.
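To get a sense of the size difference, you could pull two variants of the same image and compare them (the exact tags below are illustrative; check dockerhub for the tags that actually exist for your base image):

# Pull a Debian-based image and its Alpine-based variant
docker pull node:12
docker pull node:12-alpine

# Compare their sizes
docker image ls node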

Volumes are a mechanism that you can use to persist data to or fetch data from your filesystem. Think of them like "external" hard drives that you connect to your containers.
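As a minimal sketch, the commands below create a named volume and attach it to a postgres container, so that the database files survive the container being removed (the volume and container names are made up for the example):

# Create a named volume
docker volume create pgdata

# Attach it to a postgres container, at the path where postgres stores its data
docker run -d --name db -e POSTGRES_PASSWORD=secret -v pgdata:/var/lib/postgresql/data postgres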

When your containers are created, docker will create a bridge virtual network. Containers that exist in the same network will be able to communicate with each other as if they were processes on the same machine. There are a few different drivers that docker provides besides bridge, which allow you to customize the networking behaviour of your containers.
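For illustration, this is roughly how you would create a user-defined bridge network and attach two containers to it, so that they can reach each other by name (the container and network names are hypothetical):

# Create a bridge network
docker network create mynetwork

# Run two containers attached to it
docker run -d --name db --network mynetwork -e POSTGRES_PASSWORD=secret postgres
docker run -d --name cache --network mynetwork redis

# From inside either container, the other one is now reachable by its name (eg. "db" or "cache")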

Anatomy of a Dockerfile

The simplest scenario for building an image is to have a Dockerfile in your current folder and run the docker cli.

docker build .

The most important thing to understand about Dockerfiles is that they work in layers. Each line is an instruction which gets cached by docker after the first time it runs. If docker sees that nothing has changed for that instruction, it will use the cached layer instead of rebuilding it. If, on the other hand, it detects that something has changed for a layer, it will rebuild that layer and all subsequent layers from scratch. Structuring your Dockerfiles with this in mind can have a dramatic effect on build times (for example, by avoiding reinstalling all dependencies when none of them have changed).

Let's take a step-by-step look at a common Dockerfile:

FROM node:12

RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y ranger vim

WORKDIR /home/node/app

COPY package.json .
COPY yarn.lock .

RUN chown -R node:node .
USER node

RUN yarn install

COPY --chown=node:node . .

CMD [ "node", "startServer.js" ]

Start with a base image

FROM node:12

This is the base image you are starting with. You should generally be aware of what you are extending. Node images are a linux distribution with specific versions of node, npm and yarn pre-installed. If you take a look at the node page on dockerhub you will see this means you will get the latest version of node 12 installed on the "stretch" version of debian.

info

Tags that appear in the same row in dockerhub are aliases of each other. In the node example, this means that 12, 12.22 and 12.22.1-stretch will all fetch the exact same image.

info

Omitting minor or patch versions will default to the latest. So node will fetch the latest node (eg. 16), node:12 will fetch the latest release of major version 12, and node:12.1 will fetch the latest patch release of 12.1.

Install OS dependencies

RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y ranger vim

RUN lets you execute a command on the linux OS that the container is based on. The base OS might need some extra tools installed for the app to work, or simply because they are convenient. In this case, we install ranger and vim, as this is a development Dockerfile and these tools are useful for debugging. Keep dependencies to a minimum in production to reduce the image size.

Keeping docker layers in mind, it makes sense that this is at the top of our Dockerfile, since it is very uncommon for the OS dependencies of a project to change.

Set the working directory

WORKDIR /home/node/app

WORKDIR simply sets our working directory. All subsequent commands will be happening inside the /home/node/app folder of our container.

Copy in the dependency files

COPY package.json .
COPY yarn.lock .

This is probably the most important piece in this file. COPY brings a file from the host OS (your computer) to the container. Remember the layers. If we copied in all files at this stage, every time there was a change in the layer (so any time you changed any file), all subsequent layers (eg. yarn install) would need to be rebuilt. In other words, you would end up reinstalling all your node modules every time any of the files has changed.

Set the node user

RUN chown -R node:node .
USER node

Containers by default run as root. Here we set a user called node as the owner of all files in the current directory and switch to that user with the USER command.

tip

Always switch to a non-root user within your Dockerfile. This adds an extra layer of security: even if an attacker manages to get inside a container, they cannot do much, because they don't have elevated privileges. Node images come with a ready-made node user that saves you the hassle of manually creating a user yourself. Switch to the node user after apt-get or apk (otherwise you won't have the privileges to install OS dependencies), but before you install your node modules.
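If your base image does not ship with a ready-made unprivileged user, a minimal sketch of creating one yourself (shown here for an Alpine-based image; the user and group names are arbitrary) could look like this:

# Create an unprivileged user and group on an Alpine-based image
RUN addgroup -S app && adduser -S app -G app

# Give that user ownership of the working directory and switch to it
RUN chown -R app:app .
USER app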

Install your dependencies

RUN yarn install

Copy in the rest of the files

COPY --chown=node:node . .

Copy in the rest of your files at the end. Again, because of how layers work, this means that changes in the code will not trigger unnecessary rebuilding of other layers. So if you have already run docker build . once and then change a few lines of code in your source files, rebuilding will take a split second instead of minutes, as it will use all layers from cache and only rebuild the layer that does COPY . .. Note that the chown part of the command above makes sure that the copied files belong to the node user.

Run the command

CMD [ "node", "server.js" ]

CMD sets the command that should run when the container is started. Note that this command will not run while the image is being built (unlike RUN). Do not use multiple CMD lines - if you do, only the last one will be applied.

Using the docker cli

For the most part, it is more convenient to use docker-compose to interact with containers in development, as you can target containers by name (eg. client) and automate much of their behaviour. It helps though to understand the basics.
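As a rough sketch (the service names, image tags and ports are made up for the example), a minimal docker-compose.yml could look like this:

version: '3'

services:
  client:
    build: .
    ports:
      - '4000:4000'

  db:
    image: postgres:12
    environment:
      - POSTGRES_PASSWORD=secret

With a file like this in place, docker-compose up starts everything, and docker-compose exec client sh opens a shell inside the client container.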

Build an image

docker build .

build will look for a file named Dockerfile by default.
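If your file is named differently, or you want to name the resulting image, you can use the -f and --tag flags (the file and image names below are just examples):

# Build from a differently named Dockerfile and tag the resulting image
docker build -f Dockerfile.dev --tag myapp:dev .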

Run a container

docker run -it --rm --name <my-container-name> <my-image-name> <command>

# For example, the following line will
# - create a container named "tryme" based on the node:12 image
# - execute the bash command so that it opens a command shell for you
docker run -it --rm --name tryme node:12 bash
  • -it will give you an interactive terminal
  • --rm will remove the container when it exits
  • --name will give your container that name

You can also look into the following related commands (a short example follows the list):

  • exec: execute a command on an already running container
  • stop: stop a container by name without destroying it
  • start: start a stopped container by name
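For instance, assuming a running container named mycontainer:

# Execute a command in the already running container
docker exec -it mycontainer bash

# Stop the container without destroying it, then start it again
docker stop mycontainer
docker start mycontainer
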
tip

You can exit containers by typing exit or with the CTRL+D shortcut.

List things

You can list docker related things with the ls command. The most common (but not the only) use cases for that will be to list containers and images.

# List all running containers
docker container ls

# List all containers, whether running or stopped
docker container ls -a

# List all images on my computer
docker image ls

Remove things

You can remove images and containers with the rm command.

# Remove the node:16 image from my computer
docker image rm node:16

# Remove the container named "containername"
docker container rm containername

ENTRYPOINT & CMD

ENTRYPOINT & CMD are two commands that you will often encounter as steps that are executed at container startup.

CMD is used to set the default command that you want to run when the container starts. It will run only if a command argument is not provided.

Assuming we have built an image called "myapp" using the Dockerfile from the examples above:

# This will run our default CMD: node startServer.js
docker run myapp

# This will use the same image, will NOT run the default CMD at all, and will instead run: ls /home/node
docker run myapp ls /home/node

ENTRYPOINT on the other hand is a good fit for adding instructions that you always want to run at container startup. In the context of Coko apps, that is often migrations and seed scripts.

For example, let's modify our Dockerfile as follows:

# This line is new
ENTRYPOINT ["node", "scripts/setup.js"]

CMD [ "node", "startServer.js" ]

Now using the CMD example commands:

# Both of these will run `node scripts/setup.js` before their respective command is run
docker run myapp
docker run myapp ls /home/node
warning

ENTRYPOINT is overridable, but overriding it will reset CMD to an empty value.

Shell form & exec form

You might notice from the docker documentation or examples you find online that there are two syntaxes for CMD and ENTRYPOINT:

# Shell form
CMD node start.js

# Exec form
CMD ["node", "start.js"]

The difference here is that shell form will invoke a shell to run your command, while exec form will not.

warning

While they may seem identical, prefer exec form unless you know what you're doing, especially for your ENTRYPOINT command, as the way ENTRYPOINT and CMD interact might not be what you expect.
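One practical difference you can observe: shell form runs your command through /bin/sh, so things like environment variable expansion happen, while exec form passes the arguments to the program as-is. A small hedged illustration:

FROM alpine

ENV GREETING=hello

# Shell form: runs via /bin/sh, so the variable is expanded and the container prints "hello"
CMD echo $GREETING

# Exec form: no shell is involved, so the equivalent line below would print the literal
# string "$GREETING" instead (remember that only the last CMD in a file takes effect)
# CMD ["echo", "$GREETING"]

Another consequence of shell form is that the shell becomes the container's main process, so your program does not receive signals like SIGTERM directly - one more reason to prefer exec form.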

The .dockerignore file

You can instruct docker to completely ignore certain files or folders with the .dockerignore file. The syntax is the same as .gitignore. The most important part to ignore is, as expected, the node_modules folder, as you don't want all these files to be copied in during build time, but to be installed with the install command inside the container. It is generally a good idea to use this file to keep your container trimmed down by excluding anything unnecessary.

You can find a sample .dockerignore file in the recipes section.
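For a rough idea of what goes in it (treat the recipe mentioned above as the actual reference), a .dockerignore for a node project might contain entries like:

node_modules
_build
.git
*.log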

Multi-stage builds

You can create multiple images in a single Dockerfile. Those images can extend each other or copy things from each other. A common scenario, for example, would be to build the client for production and then copy the static build into an nginx image.

# Name the image "build_client"
FROM node:12.22.1-alpine as build_client

# ...
# install dev dependencies and build the client bundle to the `_build` folder
# ...

COPY . .
RUN yarn coko-client-build

###################################################

# Start a second image
FROM nginx:1.19.9 as client

# copy the bundle from the "build_client" image into the new one
COPY --from=build_client /home/node/app/_build /usr/share/nginx/html

Setups like the above help keep production images as trim as possible: in this example, our final client image won't even have a node_modules folder.

Target

You can use only parts of a multi-stage Dockerfile by using the --target flag on docker build, or with the target key (under build) in a docker-compose file.
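In a docker-compose file, that could look roughly like this (the service and stage names are illustrative):

services:
  client:
    build:
      context: .
      target: client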

To illustrate how this works, let's use the following multi-stage Dockerfile as an example.

FROM node:12 as server

RUN echo "this is the server"

###########

FROM node:12 as another-server

RUN echo "this is the other server"

###########

FROM node:12 as client-build

WORKDIR /home/node
RUN touch test
RUN echo "try me" >> test
RUN cat test

###########

FROM nginx as client

COPY --from=client-build /home/node/test .
RUN ls
RUN cat test

If you run docker build --target server ., the only image that will be built will be the server image. If you run docker build --target another-server ., it will build server and another-server, then stop (the stages after the target will not be built).

warning

Without buildkit enabled, targeting an image will also build all images in the Dockerfile that are above the targeted image.

info

Buildkit (see tips section further down) behaves quite differently with targets. If a target is defined, it will only build the targeted image and any images it depends on. If a target is not defined, it will be the equivalent of targeting the last image in the Dockerfile.

Publishing an image to dockerhub

First off, you will need to make sure you are logged in to dockerhub from your terminal (via the docker login command).

Let's assume you have an account called "myaccount" on dockerhub.
First, you need to build and name your image:

docker build --tag myaccount/mytestimage .

You can also add a tag to your image name (eg. a specific version):

docker build --tag myaccount/mytestimage:1.0.0 .

If you now run docker image ls | grep myaccount you should see your images listed.

You can now push those images to your logged in account:

docker push myaccount/mytestimage:1.0.0

Tips

Running out of space

Downloading and building all these images and containers can take a toll on your hard drive's space, especially over a long period of time. The most common remedy for this is to prune things that are no longer needed by current containers.

docker system prune

It goes without saying that you should proceed with caution with this command, especially if you're not on your local computer.

You can also target specific things to prune. For example:

# This will prune only images
docker image prune

Files inside the container

Containers are meant to be ephemeral. This means that you should never count on changes that have only happened inside the container for your app to function.

A very common scenario for this problem is writing logs in a folder, or uploading files to the filesystem inside the container. The moment this particular container is deleted (eg. with the --rm flag or with docker-compose down) all files that were created in it will be removed as well.

The best solution here is to not write files inside the container unless they are only temporary. For example, it is preferable to use an external service like Minio or S3 to handle file uploads. If that is not possible, you can always write files to an attached volume, so that they are at least persisted in the host filesystem.
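For example (the volume name and upload path below are made up), attaching a volume to the folder the app writes to could look like this:

# Persist the app's uploads folder in a named volume on the host
docker run -d --name server -v uploads:/home/node/app/uploads myapp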

tip

Do not write log files inside containers, as they will be deleted along with the container. Simply let docker log everything in its output. In production, it is up to the sysadmin to direct the log output to whatever files or system they deem appropriate. This also keeps our logging setup generic and non-opinionated, as it is out of scope for the apps.
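Letting docker capture the output also means you can inspect it with the standard commands (the container and service names are examples):

# Follow the logs of a running container
docker logs -f mycontainer

# Or, via docker-compose, by service name
docker-compose logs -f client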

Buildkit

buildkit is a more modern alternative tool for building images. It comes with docker, but it isn't enabled by default. It is recommended (but not essential) that you enable it.

In your environment variables file / shell:

# Enable buildkit for docker
export DOCKER_BUILDKIT=1

# Enable buildkit for docker-compose (default value since docker-compose v1.28)
export COMPOSE_DOCKER_CLI_BUILD=1

Resources

Dockerfile docs
https://docs.docker.com/engine/reference/builder/

Docker cli docs
https://docs.docker.com/engine/reference/commandline/cli/

Docker volumes
https://docs.docker.com/storage/volumes/

Docker networks
https://docs.docker.com/network/

Images vs containers
https://phoenixnap.com/kb/docker-image-vs-container

How CMD and ENTRYPOINT interact
https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact

Buildkit
https://brianchristner.io/what-is-docker-buildkit/
https://docs.docker.com/develop/develop-images/build_enhancements/
https://github.com/moby/buildkit