Docker
docker is a containerization platform that allows us to create pre-defined setup configurations of our apps that we can then replicate across environments.
Make sure you've read the docs on having docker up and running on your computer, as well as the docker section for WSL if you're using the Windows Subsystem for Linux.
Basic concepts
Docker images are essentially a template or a recipe where you can define (in a Dockerfile) what exactly needs to be there for an app or service to work.
Containers are instances of said app or service, created from an image. You can create multiple containers from the same image. For example, if you have multiple coko apps installed, you will have multiple postgres containers that are created from the same postgres image.
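For example, a rough sketch (the container names and password are just placeholders) of two containers created from the same image:
# Two independent containers, both created from the same postgres image
docker run -d --name app-one-db -e POSTGRES_PASSWORD=example postgres
docker run -d --name app-two-db -e POSTGRES_PASSWORD=example postgres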
Images are published on dockerhub. For most of the common images (eg. node or postgres), you will find a lot of different variants available. You generally need to decide (a) what major version of the image you want, and (b) whether you want a more compact version of the standard image. It is common for images to be based on a full-blown linux server distribution like Debian or Ubuntu, but to also have variants based on eg. Alpine that can be dramatically smaller in size. Aim for using small images in production (alpine- or debian-slim-based images are usually good choices) unless you have a good reason not to. In development, you can use either.
Volumes are a mechanism that you can use to persist data to or fetch data from your filesystem. Think of them like "external" hard drives that you connect to your containers.
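As an illustration (the host path here is just an example), a bind-mounted volume that keeps postgres data on your filesystem could look like this:
# Database files end up in ./pgdata on the host and survive the container being removed
docker run -d --name db \
  -e POSTGRES_PASSWORD=example \
  -v "$(pwd)/pgdata:/var/lib/postgresql/data" \
  postgres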
When your containers are created, docker will create a bridge virtual network. Containers that exist in the same network will be able to communicate with each other as if they were processes on the same machine. There are a few different drivers that docker provides besides bridge, which allow you to customize the networking behaviour of your containers.
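A minimal sketch (network and container names are just examples) of two containers talking over a user-defined bridge network:
docker network create my-network
docker run -d --network my-network --name db -e POSTGRES_PASSWORD=example postgres
# The second container can resolve the first one by its container name ("db")
docker run -it --rm --network my-network node:12 getent hosts db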
Anatomy of a Dockerfile
The simplest scenario of creating a container is to have a Dockerfile in your current folder and execute the docker cli:
docker build .
The most important thing to understand about Dockerfiles is that they work in layers. Each line is an instruction which gets cached by docker after the first time it runs. If docker sees that nothing has changed for that instruction, it will run it from the cache, instead of rebuilding that layer. If, on the other hand, it detects that something has changed for a layer, it will rebuild this and all subsequent layers from scratch. Structuring your Dockerfiles with this in mind can have a dramatic effect on build times (by avoiding, for example, reinstalling all dependencies if none of them have changed).
Let's take a step-by-step look at a common Dockerfile:
FROM node:12
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y ranger vim
WORKDIR /home/node/app
COPY package.json .
COPY yarn.lock .
RUN chown -R node:node .
USER node
RUN yarn install
COPY --chown=node:node . .
CMD [ "node", "startServer.js" ]
Start with a base image
FROM node:12
This is the base image you are starting with. You should generally be aware of what you are extending. Node images are a linux distribution with specific versions of node, npm and yarn pre-installed. If you take a look at the node page on dockerhub, you will see that this means you will get the latest version of node 12 installed on the "stretch" version of debian.
Tags that appear in the same row on dockerhub are aliases of each other. In the node example, this means that 12, 12.22 and 12.22.1-stretch will all fetch the exact same image.
Omitting minor or patch versions will default to the latest. So node will fetch the latest node (eg. 16), node:12 will fetch the latest release of major version 12, and node:12.1 will fetch the latest release with minor version 12.1.
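For example, assuming node 16 happens to be the newest major release at the time you run these:
docker pull node        # latest node overall (eg. 16.x)
docker pull node:12     # latest release of major version 12
docker pull node:12.22  # latest release of minor version 12.22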
Install OS dependencies
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y ranger vim
RUN lets you execute a command on the linux OS that the container is made of. The OS the image is based on might need some extra tools installed, either for the app to work or simply for convenience. In this case, we install ranger and vim, as this is a development Dockerfile and these tools are useful for debugging. Keep dependencies to a minimum in production to reduce the image size.
Keeping docker layers in mind, it makes sense that this is at the top of our Dockerfile, since it is very uncommon for the OS dependencies of a project to change.
Set the working directory
WORKDIR /home/node/app
WORKDIR simply sets our working directory. All subsequent commands will happen inside the /home/node/app folder of our container.
Copy in the dependency files
COPY package.json .
COPY yarn.lock .
This is probably the most important piece in this file. COPY brings a file from the host OS (your computer) into the container. Remember the layers: if we copied in all files at this stage, every time there was a change in the layer (so any time you changed any file), all subsequent layers (eg. yarn install) would need to be rebuilt. In other words, you would end up reinstalling all your node modules every time any of your files changed.
Set the node user
RUN chown -R node:node .
USER node
Containers by default run as root. Here we set a user called node as the owner of all files in the current directory and switch to that user with the USER command.
Always switch to a non-root user within your Dockerfile. This adds an extra layer of security, where even if an attacker manages to get inside a container, they cannot really do much because they don't have elevated privileges. Node images come with a ready-made node user that saves you the hassle of manually creating a user yourself. Switch to the node user after apt-get or apk (otherwise you won't have the privileges to install OS dependencies), but before you install your node modules.
Install your dependencies
RUN yarn install
Copy in the rest of the files
COPY --chown=node:node . .
Copy in the rest of your files at the end. Again, according to our layers, this means that changes in the code will not trigger unnecessary rebuilding of other layers. So if you have already run docker build . once, and then change a few lines of code in your source files, rebuilding will take a split second instead of minutes, as it will use all layers from cache and only rebuild the layer that does the COPY . . instruction. Note that the chown part of the code above makes sure that the copied files belong to the node user.
Run the command
CMD [ "node", "server.js" ]
CMD will set the command that should run when the container is started. Note that this command will not run while the image is being built (unlike RUN). Do not use multiple CMD lines - if you do, only the last one will be applied.
Using the docker cli
For the most part, it is more convenient to use docker-compose to interact with containers in development, as you can target containers by name (eg. client) and automate much of their behaviour. It helps though to understand the basics.
Build a container
docker build .
build will look for a file named Dockerfile by default.
Run a container
docker run -it --rm --name <my-container-name> <my-image-name> <command>
# For example, the following line will
# - create a container named "tryme" based on the node:12 image
# - execute the bash command so that it opens a command shell for you
docker run -it --rm --name tryme node:12 bash
-it will give you an interactive terminal
--rm will remove the container when it exits
--name will give your container that name
You can also look into the following related commands:
exec: execute a command on an already running container
stop: stop a container by name without destroying it
start: start a stopped container by name
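For example, assuming you have a running container named "tryme" that was created without the --rm flag:
# Execute a command (here: open a shell) inside the running container
docker exec -it tryme bash
# Stop the container without destroying it, then start it again later
docker stop tryme
docker start tryme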
You can exit containers by typing exit or with the CTRL+D shortcut.
List things
You can list docker-related things with the ls command. The most common (but not the only) use cases for that will be to list containers and images.
# List all running containers
docker container ls
# List all containers, running or stopped
docker container ls -a
# List all images on my computer
docker image ls
Remove things
You can remove images and containers with the rm command.
# Remove the node:16 image from my computer
docker image rm node:16
# Remove the container named "containername"
docker container rm containername
ENTRYPOINT & CMD
ENTRYPOINT and CMD are two instructions that you will often encounter as steps that are executed at container startup. CMD is used to set the default command that you want to run when the container starts. It will run only if a command argument is not provided.
Assuming we have built an image tagged "myapp" using the Dockerfile from the examples above:
# This will run our default CMD: node startServer.js
docker run myapp
# This will use the same image, will NOT run the default CMD at all, and instead run: ls /home/node
docker run myapp ls /home/node
ENTRYPOINT, on the other hand, is a good fit for adding instructions that you always want to run at container startup. In the context of Coko apps, that is often migrations and seed scripts.
For example, let's modify our Dockerfile as follows:
# This line is new
ENTRYPOINT ["node", "scripts/setup.js"]
CMD [ "node", "startServer.js" ]
Now, using the CMD example commands:
# Both of these will run `node scripts/setup.js` before their respective command is run
docker run myapp
docker run myapp ls /home/node
ENTRYPOINT is overridable, but overriding it will reset CMD to an empty value.
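For example, a quick sketch using the "myapp" image from above:
# Replace the entrypoint for this run only; note that the default CMD no longer applies
docker run -it --entrypoint bash myapp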
Shell form & exec form
You might notice from the docker documentation or examples you find online that there are two syntaxes for CMD and ENTRYPOINT:
# Shell form
CMD node start.js
# Exec form
CMD ["node", "start.js"]
The difference here is that shell form will invoke a shell to run your command, while exec form will not.
While they may seem identical, prefer exec form unless you know what you're doing, especially for your ENTRYPOINT command, as the way ENTRYPOINT and CMD interact might not be what you expect.
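As an illustration of the practical difference (assuming a node app started with start.js):
# Shell form: docker actually runs /bin/sh -c "node start.js", so the shell is PID 1
# and signals (eg. the SIGTERM sent by docker stop) go to the shell rather than to node
CMD node start.js
# Exec form: node runs directly as PID 1 and receives those signals itself
CMD ["node", "start.js"]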
The .dockerignore file
You can instruct docker to completely ignore certain files or folders with the .dockerignore file. The syntax is the same as .gitignore. The most important part to ignore is, as expected, the node_modules folder, as you don't want all these files to be copied in during build time, but to be installed with the install command inside the container. It is generally a good idea to use this file to keep your container trimmed down by excluding anything unnecessary.
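As a rough illustration, a minimal .dockerignore could contain entries like these (just common examples, not a complete list):
node_modules
.git
*.log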
You can find a sample .dockerignore file in the recipes section.
Multi-stage builds
You can create multiple images in a single Dockerfile. Those images can extend each other or copy things from each other. A common scenario, for example, would be to build the client for production and then copy the static build to an nginx image.
# Name the image "build_client"
FROM node:12.22.1-alpine as build_client
# ...
# install dev dependencies and build the client bundle to the `_build` folder
# ...
COPY . .
RUN yarn coko-client-build
###################################################
# Start a second image
FROM nginx:1.19.9 as client
# copy the bundle from the "build_client" image into the new one
COPY --from=build_client /home/node/app/_build /usr/share/nginx/html
Setups like the above help in keeping the production images as trim as possible. In this case, for example, our final client image won't even have a node_modules folder.
Target
You can use only parts of a multi-stage Dockerfile by using the --target flag on docker build, or with the target key (under build) in a docker-compose file.
To illustrate how this works, let's use the following multi-stage Dockerfile as an example.
FROM node:12 as server
RUN echo "this is the server"
###########
FROM node:12 as another-server
RUN echo "this is the other server"
###########
FROM node:12 as client-build
WORKDIR /home/node
RUN touch test
RUN echo "try me" >> test
RUN cat test
###########
FROM nginx as client
COPY --from=client-build /home/node/test .
RUN ls
RUN cat test
If you run docker build --target server ., the only image that will be built will be the server image. If you run docker build --target another-server ., it will build server and another-server, then exit.
Without buildkit enabled, targeting an image will also build all images in the Dockerfile that are above the targeted image.
Buildkit (see the tips section further down) behaves quite differently with targets. If a target is defined, it will only build the targeted image and any images it depends on. If a target is not defined, it will be the equivalent of having targeted the last image in the Dockerfile.
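For reference, a sketch of how the target key mentioned above might look in a docker-compose file (the service name and context are just examples):
services:
  client:
    build:
      context: .
      target: client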
Publishing an image to dockerhub
First off, you will need to make sure you are logged in to dockerhub from your terminal (via the docker login command).
Let's assume you have an account called "myaccount" on dockerhub.
First, you need to build and name your image:
docker build --tag myaccount/mytestimage .
You can also add a tag to your image name (eg. a specific version):
docker build --tag myaccount/mytestimage:1.0.0 .
If you now run docker image ls | grep myaccount, you should see your images listed.
You can now push those images to your logged in account:
docker push myaccount/mytestimage:1.0.0
Tips
Running out of space
Downloading and building all these images and containers can take a toll on your hard drive's space, especially over a long period of time. The most common remedy for this is to prune things that are not needed by current containers anymore.
docker system prune
It goes without saying that you should proceed with caution with this command, especially if you're not on your local computer.
You can also target specific things to prune. For example:
# This will prune only images
docker image prune
Files inside the container
Containers are meant to be ephemeral. This means that you should never count on changes that have only happened inside the container for your app to function.
A very common scenario for this problem is writing logs to a folder, or uploading files to the filesystem inside the container. The moment this particular container is deleted (eg. with the --rm flag or with docker-compose down), all files that were created in it will be removed as well.
The best solution here is to not write files inside the container unless they are only temporary. For example, it is preferable to use an external service like Minio or S3 to handle file uploads. If that is not possible, you can always write files to an attached volume, which will at least be persisted on the host filesystem.
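As a sketch (the service name and paths are just examples), attaching such a volume in a docker-compose file might look like this:
services:
  server:
    build: .
    volumes:
      # Files written to /home/node/app/uploads inside the container
      # end up in ./uploads on the host and survive container removal
      - ./uploads:/home/node/app/uploads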
Do not write log files inside containers, as they will be deleted along with the container. Simply let docker log everything in its output. In production, it is up to the sysadmin to direct the log output to whatever files or system they deem appropriate. This also keeps our logging setup generic and non-opinionated, as it is out of scope for the apps.
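You can then inspect that output with the docker logs command, for example:
# Follow the output of a running container named "containername"
docker logs -f containername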
Buildkit
buildkit is a more modern alternative tool for building images. It comes with docker, but it isn't enabled by default. It is recommended (but not essential) that you enable it.
In your environment variables file / shell:
# Enable buildkit for docker
export DOCKER_BUILDKIT=1
# Enable buildkit for docker-compose (default value since docker-compose v1.28)
export COMPOSE_DOCKER_CLI_BUILD=1
Resources
Dockerfile docs
https://docs.docker.com/engine/reference/builder/
Docker cli docs
https://docs.docker.com/engine/reference/commandline/cli/
Docker volumes
https://docs.docker.com/storage/volumes/
Docker networks
https://docs.docker.com/network/
Images vs containers
https://phoenixnap.com/kb/docker-image-vs-container
How CMD and ENTRYPOINT interact
https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact
Buildkit
https://brianchristner.io/what-is-docker-buildkit/
https://docs.docker.com/develop/develop-images/build_enhancements/
https://github.com/moby/buildkit