5 most common Dockerfile mistakes
Docker is great. You cannot deny it. Its popularity is still growing and the internet is full of examples for every possible programming language, framework, and environment. When it is time to deploy something, the first thing I do is search Google for an example Dockerfile.
This is fine, right? Unfortunately, most of the examples available online are insecure by design. In my first post here on HashNode I am going to explore some common pitfalls and possible solutions.
Running as root
This is probably the most underrated issue. By default, containers run as root. Hypothetically, if attackers gain control of the container, they can cause harm to the host.
One easy and reliable fix is to create a user inside the container, set its home directory as the working directory, and switch to that user.
```dockerfile
FROM nginx:latest
RUN useradd --create-home dockeruser
WORKDIR /home/dockeruser
USER dockeruser
```
Do not use the latest tag
Many examples use the latest tag for the base image. While that is fine for tutorials, Dockerfiles in production must always pin a specific image tag that is not supposed to change and break your build.
The latest tag is updated every time a new version of the container is pushed. Your build can suddenly break.
Suppose you are deploying a container with python:latest as the base image. At that time latest refers to Python 3.6. Weeks later you have to rebuild the image, but it fails: dependencies are not fulfilled. Why? You haven't touched them! By now python:latest refers to Python 3.9 which, incidentally, does not support some of your dependencies.
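A minimal sketch of the fix, assuming a simple Python app with a requirements.txt (the exact version tag is just an example):

```dockerfile
# Pin an exact tag instead of latest so rebuilds stay reproducible
FROM python:3.6.15-slim
COPY requirements.txt .
RUN pip install -r requirements.txt
```

Rebuilds will now use the same interpreter until you bump the tag yourself.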
Minimize the number of layers
Docker creates a layer for each RUN, COPY, and ADD instruction. The more layers, the larger the image and the slower the build. Whenever possible, chain multiple commands into a single RUN instruction. Remember that you can use \ to split arguments across multiple lines:
```dockerfile
FROM nginx:latest
RUN apt update && apt install -y \
    git \
    rsync \
    && rm -rf /var/lib/apt/lists/*
```
Do not create huge containers
One container, one service. Docker containers are not virtual machines. If you have many services to deploy, just create many containers.
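If, say, you need a web app and a database, a Docker Compose sketch keeps them in separate containers (service names, ports, and images here are illustrative):

```yaml
# docker-compose.yml: one container per service, not one giant container
version: "3.8"
services:
  web:
    build: .           # your application image
    ports:
      - "8080:80"
    depends_on:
      - db
  db:
    image: postgres:13.4   # a pinned tag, not latest
    environment:
      POSTGRES_PASSWORD: example
```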
Use layer caching
When building images, Docker looks for layers in the cache that it can reuse. This way, no duplicate layers are created and consecutive builds are faster.
But there is a catch. If you copy the source code before installing the dependencies, every time you update the code Docker will invalidate every subsequent instruction. In other words, you are going to reinstall dependencies every time even though they are identical.
Fortunately, there is a quick fix. Simply make sure that layers that do not change frequently are before layers that do.
For instance, instead of copying the whole source code and then running npm install, copy package.json first, run npm install, and then copy the rest of the code. This way a change to the source code will no longer trigger npm install; Docker will reuse the cached layer.
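For a Node.js app, the pattern sketched above looks like this (the base image tag and entry point are illustrative):

```dockerfile
FROM node:16.13-alpine
WORKDIR /app
# Copy only the dependency manifests first so this layer stays cached
COPY package.json package-lock.json ./
RUN npm install
# Copy the rest of the source; changes here no longer invalidate npm install
COPY . .
CMD ["node", "index.js"]
```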
That's about it! There are many more tips and tricks to make Dockerfiles faster, more maintainable, and more secure, but these are the five most common mistakes that everyone should fix ASAP.
If you like it, share and follow me for more! 😀