Introduction

In the previous part, we explored the basics of Docker and the benefits of containerization. In this follow-up, we'll shift our focus to Dockerfiles - the essential tool for building and configuring images according to our needs.

What is a Dockerfile

A Dockerfile (by convention named Dockerfile, or <name>.Dockerfile when a project has several) is a text file that defines how a Docker image is built. Containers are then started from that image. It contains instructions that:

  • specify the base image
  • install software dependencies
  • set environment variables
  • copy files into the image

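To build an image from the Dockerfile, execute the following command from the directory containing the file. The -t flag names the resulting image, and the trailing dot sets the build context to the current directory: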

docker build -t image_name .

Structure

A Dockerfile does not follow any of the common formats you might be familiar with, such as YAML, XML, or JSON. Instead, it has its own distinct structure, with one instruction per line:

FROM node:latest
ENV NODE_ENV=production
WORKDIR /app
COPY *.json ./
RUN npm install --production
COPY . .
CMD ["node", "server.js"]

As you can see, it's rather straightforward. Regardless, let's break it down instruction by instruction.

Note: These are by no means all of the available instructions. They are simply the most basic ones required to build an image and run a container from it. For a full list of instructions, refer to docs.docker.com.

FROM

In the most basic scenario, this instruction allows us to define the base image from which our image will be built. Think of it as telling Docker where to get the template from.

In this scenario, it follows the <image>:<tag> format, where image has to match an image hosted in the registry, and the optional tag selects a specific version of it. If the tag is omitted, Docker uses latest.

Besides that, it can also be used in multi-stage builds to define follow-up stages, but you'll learn about that further into this article.

Note: The registry is Docker Hub (hub.docker.com) by default. To pull from a different registry, such as one you host yourself, prefix the image name with the registry's hostname.

ENV

The ENV instruction allows us to set environment variables according to the following format:

ENV key=value

The value will then be set for all subsequent instructions and will persist when the container is run from the resulting image.

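Note: For values needed only while building, you can use the ARG instruction instead. ARG values are available to build instructions but are not set as environment variables in containers started from the image. They can still show up in the image history, so ARG is not a safe place for secrets.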
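
To make the difference between the two concrete, here is a minimal sketch. The node:20 base image and the APP_VERSION name are assumptions chosen for illustration, not something the example above requires.

```dockerfile
FROM node:20

# ARG: build-time only. APP_VERSION is a hypothetical name; override it with
#   docker build --build-arg APP_VERSION=2.0.0 -t image_name .
ARG APP_VERSION=1.0.0

# ENV: set for the following instructions and for containers run from the image
ENV NODE_ENV=production

# ARG values can be used by RUN while the image is being built
RUN echo "Building version ${APP_VERSION}"

# Copy the value into an ENV only if it also has to be visible at runtime
ENV APP_VERSION=${APP_VERSION}
```

Without the last line, a container started from this image would see NODE_ENV but not APP_VERSION.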

WORKDIR

This instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY, or ADD instructions that follow it. A later WORKDIR with a relative path is resolved against the previous one.

This is primarily used to simplify subsequent instructions. Consider the following examples:

Before:

FROM python:3.12-slim
RUN mkdir -p /host/apps/service
COPY . /host/apps/service
CMD ["python3", "/host/apps/service/app.py"]

After:

FROM python:3.12-slim
WORKDIR /host/apps/service
COPY . .
CMD ["python3", "app.py"]

Note: If the path defined by WORKDIR doesn't exist, it'll be created regardless of whether it's used in subsequent instructions or not.
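
WORKDIR also accepts relative paths, which are resolved against the working directory that is already in effect. A small sketch, assuming ubuntu:22.04 as the base image and reusing the path from the example above:

```dockerfile
FROM ubuntu:22.04

# Absolute path: created if it does not exist
WORKDIR /host

# Relative path: resolved against /host
WORKDIR apps/service

# Prints /host/apps/service in the build output
RUN pwd
```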

COPY

The COPY instruction copies files and directories from the build context (the directory passed to docker build, typically on the host machine) into the image during the build process. It follows the format:

COPY <source> <destination>

It's commonly used to include application code, configuration files, and other necessary resources into the image so that the container can access and use them during runtime.
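
A few common shapes of COPY are sketched below. All file and directory names (config/app.json, src/) are hypothetical and would have to exist in your build context.

```dockerfile
FROM node:20
WORKDIR /app

# One file to an explicit destination path
COPY config/app.json /etc/app/config.json

# A wildcard that may match several files: the destination must be a
# directory, so it ends with a slash
COPY package*.json ./

# A directory: its contents are copied into the destination directory
COPY src/ ./src/
```

Relative destinations such as ./ are resolved against the current WORKDIR, here /app.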

RUN

The RUN instruction allows us to execute commands during the image build process. It enables running any valid shell command, installing dependencies, configuring the environment, and performing other actions required to set up the application inside the image.

The format of RUN is as follows:

RUN <command>

You can use RUN multiple times in a Dockerfile to execute different commands and build up the image layer by layer. Each RUN instruction creates a new layer in the image, which, as you'll learn later, is not always what you want.

The changes made by RUN commands are captured in the image and persist when containers are created from that image.
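
RUN can be written in two forms. The sketch below shows both, assuming ubuntu:22.04 as the base image; the installed package and the directory are arbitrary examples.

```dockerfile
FROM ubuntu:22.04

# Shell form: executed through /bin/sh -c, so shell features such as &&
# and variable expansion are available
RUN apt-get update && apt-get install -y curl

# Exec form: a JSON array, executed directly without a shell
RUN ["mkdir", "-p", "/opt/data"]
```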

CMD

The CMD instruction is used to set the default command or executable that will run when a container is started from the image. It allows you to define the primary process that runs inside the container.

The format of the instruction is as follows:

CMD ["executable", "param1", "param2", ...]

For example:

CMD ["python", "app.py"]

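Note: Only one CMD takes effect per Dockerfile. If the instruction appears more than once, the last one wins.

Note: Alternatively, you can use ENTRYPOINT. Arguments passed after the image name in docker run replace CMD entirely, whereas with ENTRYPOINT they are appended to the configured command. This makes ENTRYPOINT less flexible, but a better fit when the container should always run the same executable.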
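
The two instructions can also be combined. In this sketch, app.py and its --port option are hypothetical, and python:3.12-slim is an assumed base image.

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY app.py .

# ENTRYPOINT fixes the executable that always runs
ENTRYPOINT ["python", "app.py"]

# CMD supplies default arguments that are appended to the ENTRYPOINT
CMD ["--port", "8000"]
```

With this image, docker run image_name would start python app.py --port 8000, while docker run image_name --port 9000 would replace only the default arguments.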

Best Practices

Enforcing best practices and conventions often leads to compromising flexibility. This holds true in this context, but the advantages are worth the time investment.

In this section, you'll learn about the common approaches to designing Dockerfiles and the advantages that come with them.

Use Official Images

Using official Docker images and specifying version tags improves security, reliability, and consistency. Official images are curated and regularly updated, which reduces exposure to known vulnerabilities. Specific version tags, as opposed to latest, prevent unintended updates and make builds more reproducible, leading to stable and predictable deployments.
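
As a small illustration, compare an unpinned and a pinned base image. The version number is an assumption; pick whichever release your application actually targets.

```dockerfile
# Unpinned (left commented out): resolves to whatever latest points to
# on the day of the build
# FROM node:latest

# Pinned to a major version of the official Node.js image
FROM node:20
```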

Use Layer Caching

Docker's built-in layer caching takes advantage of the layered file system. Each instruction produces a layer, and on a rebuild Docker reuses cached layers up to the first instruction whose inputs have changed; that instruction and every one after it are executed again. This results in faster and more efficient builds, especially for large projects, which speeds up development and saves resources.
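
In practice, this means ordering instructions so that rarely changing inputs come first. A minimal sketch for a Node.js project, assuming node:20 and a server.js entry point:

```dockerfile
FROM node:20
WORKDIR /app

# Dependency manifests change rarely, so they are copied first
COPY package*.json ./

# Reused from the cache as long as the manifests above are unchanged
RUN npm install

# Source code changes often; copying it last means an edit re-runs
# only the instructions from here on
COPY . .

CMD ["node", "server.js"]
```

Had COPY . . come before RUN npm install, every source change would trigger a full dependency installation.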

Use Multi-Stage Builds

Multi-stage builds in Docker allow you to create more efficient and smaller images by separating the build environment from the runtime environment. They offer the following advantages:

  1. Reduced Image Size: Multi-stage builds discard unnecessary build-time dependencies, resulting in smaller final images.

  2. Improved Security: The final image contains only runtime components, reducing potential vulnerabilities from development tools.

  3. Build Isolation: Each build stage is independent, making it easier to manage dependencies and isolate changes.

To achieve multi-stage builds:

  1. Use Multiple FROM Statements: Define different base images for each stage using FROM.

  2. Build the Application: In the build stage, copy the required files, install dependencies, and build your application.

  3. Final Stage: In the final stage, use a minimalist base image and copy the built artifacts from the build stage using COPY --from=<stage_name>.

Here's a concise example:

# Build Stage
FROM node:14 AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Final Stage
FROM nginx:latest
COPY --from=build /app/dist /usr/share/nginx/html
EXPOSE 80

Use HEALTHCHECK

The HEALTHCHECK instruction defines a command that Docker periodically runs inside the container to check that the application is working correctly. If the command fails a configured number of consecutive times, the container is marked as unhealthy. Docker itself only reports this status; orchestrators such as Docker Swarm can act on it, for example by replacing the container.

For more details, refer to docs.docker.com.
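
A minimal sketch of a health check for a web server follows. The nginx:1.27 tag, the probed URL, and the timing values are assumptions to adapt to your own service.

```dockerfile
FROM nginx:1.27

# curl is installed explicitly so the check does not depend on what the
# base image happens to ship
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*

# Probe every 30 seconds, allow 3 seconds per probe, and mark the
# container unhealthy after 3 consecutive failures
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost/ || exit 1
```

The current status is shown in the STATUS column of docker ps.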

Use .dockerignore

Using .dockerignore is crucial for controlling the contents of your Docker image and optimizing the build process. It lets you exclude specific files and directories from the build context sent to Docker, so they cannot be picked up by COPY or ADD. This keeps images smaller, speeds up builds, and prevents local files such as credentials from leaking into the image. Think of .dockerignore as Docker's counterpart to .gitignore.
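
The effect is easiest to see next to a broad COPY. In the sketch below, the .dockerignore patterns are shown as comments and are only a plausible starting point for a Node.js project, not a required set.

```dockerfile
# Hypothetical .dockerignore placed next to this Dockerfile,
# one pattern per line:
#   node_modules
#   .git
#   .env
#   *.log

FROM node:20
WORKDIR /app

# With the patterns above, this copies the project without node_modules,
# the Git history, the local .env file, or log files
COPY . .
```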

Avoid Hardcoding Secrets

Never hardcode secrets in a Dockerfile. Images may be shared and Dockerfiles are usually version-controlled, so anything written into either can be read by anyone with access to them. Values set with ENV or ARG can also be recovered from the image's metadata and history.

Instead, use external mechanisms to inject secrets at runtime:

  1. Using Environment Variables: Pass secrets as environment variables when starting the container, using the -e flag of docker run.

  2. Using Management Tools: Utilize tools like Azure Key Vault, HashiCorp Vault, or Git-crypt to securely store and manage secrets for retrieval during runtime.
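
The first option can be sketched as follows. API_KEY is a hypothetical variable name, and the application is assumed to read it from its environment at startup.

```dockerfile
FROM node:20
WORKDIR /app
COPY . .

# Anti-pattern (left commented out): the value would be stored in the image
# ENV API_KEY=replace-me

# Nothing secret is baked in. Supply the value when starting the container:
#   docker run -e API_KEY=<value> image_name
# or keep several values in a file and pass it with --env-file
CMD ["node", "server.js"]
```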

Follow Best Practices for RUN

When using RUN instructions, consider the following best practices:

  • Combine Commands: Minimize RUN instructions by chaining commands with && to reduce image layers and optimize the build process.

  • Clean Up: Remove unnecessary files and caches in the same RUN instruction after installing packages or dependencies to reduce image size.

  • Sort Commands: Order commands based on frequency of change to benefit from Docker's caching mechanism. Place less frequently changing commands before more frequently changing ones.

  • Break Lines: Use \ to break lines and enhance the clarity of your Dockerfile.
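
Put together, these practices might look like the sketch below. The base image and the package names are assumptions for illustration.

```dockerfile
FROM ubuntu:22.04

# One RUN instruction, one layer: update the package index, install, and
# clean up together, so the package cache never ends up in the image.
# Each package sits on its own line to keep later changes easy to review.
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
      ca-certificates \
      curl \
 && rm -rf /var/lib/apt/lists/*
```

If the cleanup ran in a separate RUN instruction, the files would still occupy space in the earlier layer.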

Avoid Root Access

Avoid running containers as root. Unless the image specifies otherwise, the process inside a container runs as root, so a compromised application has far more privileges than it needs, and a container escape can give an attacker more control over the host system. Switching to a non-root user with the USER instruction reduces the attack surface and mitigates these risks.
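
A minimal sketch using the official Node.js image, which ships with an unprivileged user named node. Other base images may require creating a user first, and server.js is an assumed entry point.

```dockerfile
FROM node:20
WORKDIR /app

# Copy the application and make the unprivileged user its owner
COPY --chown=node:node . .

# Every instruction after this one, and the container itself, runs as node
USER node

CMD ["node", "server.js"]
```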

Final Thoughts

In this brief article, we've covered the basics of Dockerfiles, learning how to build custom images effectively and efficiently.

Next, we'll delve deeper into Docker Compose, Docker's networking features, and its storage solutions. To be notified when the next part is out, consider subscribing to our newsletter!