Francesco Tonini
Francesco's Blog

5 most common Dockerfile mistakes

Francesco Tonini
Feb 10, 2021

5 min read

Docker is great. You cannot deny it. Its popularity is still growing, and the internet is full of examples for every possible programming language, framework, and environment. When it is time to deploy something, the first thing I do is search Google for an example Dockerfile.

This is fine, right? Unfortunately, most of the examples available online are insecure by design. In my first post here on Hashnode, I am going to explore some common pitfalls and possible solutions.

Running as root

This is probably the most underrated issue. By default, containers run as root. Hypothetically, if an attacker gains control of the container, they can cause harm to the host.

One easy and reliable fix is to create a user inside the container, use its home directory as the working directory, and run the container as that user.

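FROM nginx:latest
# Create an unprivileged user with its own home directory
RUN useradd --create-home dockeruser
WORKDIR /home/dockeruser
# Every instruction after this line, and the container itself, runs as dockeruser
USER dockeruser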

Using latest

Many examples use the latest tag for their base image. That is fine for tutorials, but Dockerfiles in production should always pin an image tag that is not expected to change, so that a new release cannot break your build.

The latest tag typically moves every time a new version of the image is pushed, so your build can suddenly break.

Suppose you are deploying a container with python:latest as a base image. At that time, latest refers to Python 3.6. Weeks later you have to rebuild the image, but the build fails because the dependencies cannot be satisfied. Why? I haven't touched them! By now python:latest refers to Python 3.9 which, incidentally, does not support some of your dependencies.
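
As a rough sketch of what pinning looks like, the base image below names a specific release instead of latest. The tag 3.9.1-slim, the /app directory, and the requirements.txt file are assumptions made for illustration; pick whichever release your dependencies are known to work with.

```dockerfile
# Pin a specific release (assumed tag) so a rebuild weeks later starts from the same Python version
FROM python:3.9.1-slim
WORKDIR /app
# requirements.txt is a hypothetical dependency list for this example
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```

Even a version tag can be republished by the image maintainers. Pinning by digest is stricter, at the cost of updating it by hand.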

Minimize the number of layers

Docker creates a layer for each RUN, COPY, and ADD instruction. More layers generally mean a larger image and slower builds and pulls. Whenever possible, combine multiple commands into a single layer. Remember that you can use \ to continue a command on the next line:

FROM nginx:latest
# One RUN instruction, one layer: update, install, and clean up together
RUN apt-get update && apt-get install -y \
    git \
    rsync \
    && rm -rf /var/lib/apt/lists/*

Do not create huge containers

One container, one service. Docker containers are not virtual machines. If you have many services to deploy, just create many containers.

Use layer caching

When building images, Docker looks for existing layers in its cache that it can reuse. This way, no duplicate layers are created and consecutive builds are faster.

But there is a catch. If you copy the source code before installing the dependencies, every code change invalidates the cache for every subsequent instruction. In other words, you are going to install dependencies every time even though they are identical.

Fortunately, there is a quick fix. Simply make sure that layers that do not change frequently come before layers that do. For instance, instead of copying the whole source code and then running npm install, copy just package.json, run npm install, and then copy the rest of the code. This way a change to the source will not trigger npm install; Docker will reuse the cached layer instead.
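
The reordering described above could look roughly like this for a Node.js project. The node:14 tag, the /app directory, and the index.js entry point are assumptions made for the example.

```dockerfile
FROM node:14
WORKDIR /app
# Copy only the dependency manifest first; this layer changes only when package.json does
COPY package.json .
# Reused from the cache as long as the layers above are unchanged
RUN npm install
# Source changes invalidate the cache from here on, after dependencies are already installed
COPY . .
# index.js is a hypothetical entry point
CMD ["node", "index.js"]
```

If the project also has a package-lock.json, copying it alongside package.json in the first COPY keeps installs reproducible without giving up the cache.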

That's about it! There are many more tips and tricks to make Dockerfiles faster, more maintainable, and more secure. These are the 5 most common mistakes, and everyone should fix them ASAP.

If you like it, share and follow me for more! 😀
