How to Write Your First Dockerfile (Step-by-Step for Beginners)


Docker is an open-source platform designed to simplify the process of building, shipping, and running applications inside containers. Containers package an application along with all its dependencies into a standardized unit. This packaging ensures that the application runs consistently across different environments, whether on a developer’s local machine, testing servers, or production systems.

Containerization is a lightweight form of virtualization that isolates applications from one another and the host system. Unlike traditional virtual machines, containers share the host operating system’s kernel but keep applications isolated through namespaces and control groups (cgroups). This makes containers faster, more efficient, and easier to manage.

The rise of microservices architecture and DevOps practices has increased the importance of containerization. Docker has become a popular tool due to its simplicity, portability, and wide ecosystem support. By using Docker, developers and operations teams can collaborate more effectively, ensuring smooth deployment pipelines and reproducible environments.

What is a Dockerfile?

A Dockerfile is a text file that contains a series of instructions and commands. These commands define how to build a Docker image, which is a snapshot of a container’s file system and configuration. The Dockerfile serves as a blueprint for creating Docker images, specifying everything from the base operating system image to the application code and required dependencies.

The Dockerfile uses a domain-specific language designed specifically for this purpose. Each instruction in the file creates a layer in the resulting image. Layers enable Docker to reuse unchanged parts of an image during subsequent builds, making the build process efficient and incremental.

When the Docker daemon processes the Dockerfile, it executes each instruction in sequence. This leads to the creation of a Docker image that can be run as a container. Creating a Dockerfile is essential for automating application deployment and for maintaining consistency across different environments.

The Role of Docker Images

Docker images are the building blocks of containers. An image is a lightweight, immutable file that contains everything needed to run a program: the code, runtime, system tools, libraries, and settings. Images are constructed by stacking layers defined by the instructions in a Dockerfile.

Because images are read-only templates, they can be shared, versioned, and distributed easily. Developers often upload images to container registries where they can be accessed and pulled by others. This sharing enables collaboration and rapid deployment.

When an image is executed, it becomes a container — a running instance that isolates the application from the host system. Containers are created from images, and multiple containers can be created from the same image, each running independently.

Understanding the relationship between Dockerfiles, images, and containers is crucial for mastering containerization. Dockerfiles define the image, images are the templates, and containers are the live-running applications based on those templates.

Why Dockerfiles Matter for Developers and DevOps

In modern software development and operations, the ability to package applications and their dependencies into containers is invaluable. Dockerfiles provide the instructions necessary to automate this packaging process, making deployments reliable and repeatable.

Developers benefit from Dockerfiles because they allow environments to be version-controlled alongside application code. This integration ensures that “it works on my machine” issues are greatly reduced. The environment defined in the Dockerfile is consistent across all stages of development and deployment.

For DevOps engineers, Dockerfiles enable the automation of build and deployment pipelines. By defining the environment and building steps declaratively, teams can integrate Docker image creation into CI/CD workflows. This automation accelerates release cycles and improves reliability.

The demand for professionals skilled in Docker and containerization technologies is growing rapidly. Mastery of Dockerfiles equips developers and DevOps engineers with the tools to build scalable, portable, and maintainable applications in today’s fast-paced software landscape.

Preparing the Environment for Dockerfile Creation

Before creating a Dockerfile for your application, it is essential to prepare your development environment. The first step is to install Docker Desktop on your system. Docker Desktop provides the Docker Engine, command-line interface, and GUI tools necessary to build, run, and manage containers.

Once Docker is installed and running, you should create or identify a project directory that contains the application you want to containerize. For this example, a simple Node.js application will be used. The project should include all necessary files such as package.json for dependency management and your application source code files, typically JavaScript files.

Having a clean and organized project directory makes it easier to write a Dockerfile that copies all required files into the container image.
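
For reference, a minimal layout for such a project might look like this (the file and directory names are illustrative):

my-app/
├── Dockerfile
├── package.json
└── src/
    └── index.js

Keeping the Dockerfile at the project root means the build context contains everything the image needs to copy.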

Defining the Base Image

Every Dockerfile starts with a base image instruction. This instruction specifies the starting point for your custom image. For Node.js applications, using an official Node.js image as the base is recommended because it includes the Node runtime and essential libraries.

For example, the command:

FROM node:20-alpine

selects the Node.js version 20 image based on the Alpine Linux distribution. Alpine is a lightweight Linux distribution optimized for small size and security, which helps keep the resulting Docker image minimal.

Selecting an appropriate base image is crucial because it affects image size, build time, and the software available within the container.

Setting the Working Directory

The next step in the Dockerfile is to define the working directory inside the container using the WORKDIR instruction. This directory serves as the default location for all subsequent commands to be executed.

For instance:

WORKDIR /app

creates a directory named /app inside the container (if it does not already exist) and switches to it. By doing so, commands like copying files or running scripts will operate relative to this directory.

Using a working directory helps maintain a clean container file system and organizes application files logically.

Copying Application Files into the Container

Once the base image and working directory are set, the Dockerfile should include instructions to transfer the application’s source code and related files from the host machine into the container image.

This is done with the COPY instruction:

COPY . .

The first dot refers to the current directory on the host system, and the second dot refers to the current working directory inside the container (as defined by the WORKDIR instruction).

This instruction copies all files and folders in the project directory into the container’s /app directory, ensuring the application code is available to run inside the container.
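
Because COPY . . transfers the entire build context, it is common practice to add a .dockerignore file next to the Dockerfile so that files that should not end up in the image are excluded. A minimal example for a Node.js project (the entries are illustrative) might be:

node_modules
npm-debug.log
.git
.env

Excluding node_modules is particularly important here, since dependencies will be installed fresh inside the container in the next step.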

Installing Dependencies

After copying the files, the application’s dependencies must be installed so that the application can function correctly within the container.

Using the Node.js example, the RUN instruction executes commands during the image build process. To install dependencies using Yarn, the Dockerfile includes:

RUN yarn install --production

This command tells Docker to run yarn install with the --production flag, which installs only the dependencies required for running the application, excluding development dependencies.

Installing dependencies during the build step ensures that the container image contains everything necessary to run the app.

Specifying the Container Startup Command

The final instruction in the Dockerfile usually defines the command to run when the container starts. This is done using the CMD instruction.

For the Node.js application, the startup command might look like this:

CMD ["node", "./src/index.js"]

This command launches the Node.js runtime and executes the index.js file located in the src directory.

The CMD instruction tells the container what process to run by default when it is started, enabling the application to begin execution immediately.

Complete Example of the Dockerfile for Node.js Application

Combining all the instructions discussed, the Dockerfile for the Node.js application will appear as follows:

FROM node:20-alpine
WORKDIR /app
COPY . .
RUN yarn install --production
CMD ["node", "./src/index.js"]

This Dockerfile defines a complete process to build a lightweight and efficient container image that packages the Node.js application along with its runtime environment and dependencies.

Building and Running the Docker Image

After writing the Dockerfile, the next step is to build the Docker image. This is done using the Docker CLI command:

docker build -t your-image-name .

Here, the -t flag tags the image with a name, making it easier to reference later. The trailing dot tells Docker to use the current directory as the build context and to look for the Dockerfile there.

Once the image is built successfully, you can run a container based on the image using:

docker run your-image-name

This launches the container and executes the application inside it. Using this workflow, developers can quickly package their Node.js applications into portable containers for development, testing, or production deployment.

Creating a Dockerfile for an Ubuntu Container

Beyond application-specific containers like Node.js, Dockerfiles can also be used to build containers based on different operating systems. Ubuntu, a widely used Linux distribution, is a common choice for such containers. Creating a Dockerfile to run an Ubuntu container allows users to work within an isolated Ubuntu environment on any host system.

To start, the Dockerfile must specify the Ubuntu base image. This is done with the instruction:

FROM ubuntu:latest

This command tells Docker to use the latest official Ubuntu image as the base for your container. The Ubuntu image contains the core components of the Ubuntu operating system but nothing more.

Using this base image, you can add instructions to customize the environment according to your needs.
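
As a brief illustration, a Dockerfile that customizes the Ubuntu base might install a utility and set a default command (the package chosen here is just an example):

FROM ubuntu:latest
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
CMD ["bash"]

Removing the apt package lists after installation keeps that layer, and therefore the image, smaller.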

Pulling and Managing Ubuntu Images

Before building a Docker image from your Dockerfile, it is useful to pull the Ubuntu image manually to ensure it is available on your system. This is done with the command:

docker pull ubuntu

This command downloads the latest Ubuntu image from Docker Hub to your local machine. Once pulled, you can verify the image’s presence by running:

docker images

This will list all Docker images available locally, including the Ubuntu image you just pulled. Managing images effectively is important to keep your environment clean and avoid unnecessary downloads.

Building and Running the Ubuntu Container

With the Ubuntu Dockerfile prepared, you can build the image using:

docker build -t my-ubuntu .

The -t flag assigns the name “my-ubuntu” to your image, and the dot signifies the current directory containing the Dockerfile.

If there are issues with Docker finding the correct directory or building the image, these might be related to Docker’s default storage locations. The command:

docker info

provides detailed information about Docker’s configuration, including the Docker Root Dir, where images and containers are stored.

Ensuring the correct paths and permissions helps avoid build errors.

Once the image is built, run the container with:

docker run -it my-ubuntu

This launches an interactive Ubuntu container (the -it flags attach an interactive terminal), allowing you to execute commands within the isolated environment.

Building a Docker Image from Scratch

Creating a Docker image from scratch might seem daunting at first, especially to beginners. However, understanding the principles and steps behind this process reveals the power and flexibility that Docker offers for building and deploying applications. This guide will explore what it means to build an image from scratch, why and when you might do it, and the step-by-step process to create a custom Docker image tailored to your needs.

What Does “From Scratch” Mean in Docker?

In Docker terminology, “from scratch” refers to building a Docker image that starts with no base image at all. Most Dockerfiles start with a base image such as an operating system (like Ubuntu or Alpine Linux) or a language runtime (like Node.js or Python). These base images provide a foundation of libraries, utilities, and file systems on which you build your application layers.

Starting from scratch means you don’t rely on any existing image layers. Instead, you create the minimal environment you need for your application, often just the binaries and libraries required to run your program. This results in an ultra-lightweight image because it contains only what you explicitly add.

Why Build an Image from Scratch?

There are several reasons why developers choose to build Docker images from scratch:

  • Minimal Size: Since you control every byte, images built from scratch can be extremely small, improving download and startup times. This is especially useful for microservices or serverless applications where quick deployment matters.
  • Security: By including only the necessary files and binaries, you reduce the attack surface. Fewer packages or components could potentially contain vulnerabilities.
  • Custom Environments: Some applications require very specific environments that aren’t available or easily customizable in base images. Building from scratch offers total control.
  • Learning and Experimentation: It’s an excellent exercise for understanding how Linux systems and Docker images work under the hood. It helps deepen knowledge about containerization and system dependencies.

Despite these advantages, building from scratch is more complex and time-consuming. It requires a good grasp of system dependencies and often manual inclusion of required files.

The Core Concept: Layering and Filesystem in Docker Images

Docker images are made up of layers stacked on top of each other, forming a complete filesystem when combined. When you start with an existing base image, you inherit many layers containing a full operating system, utilities, and libraries.

When building from scratch, you start with an empty image—no layers or filesystem. The Dockerfile instructions you write create layers by adding files or executing commands. Each layer is a snapshot of the filesystem changes made by that instruction.

Building a usable image from scratch means you must include all necessary files and directories your application depends on, including the binaries, libraries, and even configuration files. Otherwise, the image will lack the environment needed to run.

Step-by-Step Process of Building an Image from Scratch

Step 1: Define Your Application Requirements

Begin by understanding what your application requires to run. What binaries, libraries, and configuration files are essential? For example, if you are packaging a simple Go binary that is statically compiled, you might only need the binary itself because it contains everything needed to execute.

For interpreted languages like Python or Node.js, you may need to include the interpreter and required libraries, which can get complex if built from scratch.

Step 2: Create a Dockerfile with the Scratch Base

When building from scratch, the Dockerfile starts with the special keyword FROM scratch. This signals to Docker that the image will not start from any existing layers.
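
A minimal sketch of such a Dockerfile, assuming a statically compiled binary named myapp sitting in the build context (the name and paths are hypothetical), could be:

FROM scratch
COPY myapp /myapp
ENTRYPOINT ["/myapp"]

The resulting image contains nothing except that single binary.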

Step 3: Add Necessary Files Manually

Because there is no underlying filesystem, every file and folder your application needs must be explicitly added. This often includes:

  • Your application binary or code
  • Configuration files
  • Required libraries or shared objects
  • Any necessary certificates or keys

You must copy these files into appropriate directories inside the image. For example, your application binary might go into /app/ or /usr/local/bin.

Step 4: Set Execution Parameters

Since there are no default shells or utilities, you usually specify the entry point explicitly to run your application. You use the ENTRYPOINT or CMD instruction to set the command that starts your app when the container runs.

Because there is no shell, these commands must be written in exec form, which means providing the command and its arguments as a JSON array.
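
For example, the following contrasts the exec form with the shell form it replaces (the binary and flag are hypothetical):

# Exec form: runs the binary directly, no shell required
ENTRYPOINT ["/myapp", "--port", "8080"]
# Shell form: would fail in a scratch image because it runs via /bin/sh
# ENTRYPOINT /myapp --port 8080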

Step 5: Build the Image

After writing your Dockerfile, you build the image using the Docker build command. Docker executes each instruction in your Dockerfile, adding layers for the files you copy or commands you run.

The resulting image will be extremely minimal and contain only the files you added, making it as lean as possible.

Example Use Cases for Images from Scratch

  • Statically Compiled Binaries: Languages like Go allow compiling statically linked binaries that don’t require any dynamic libraries. Packaging such binaries in a scratch image produces tiny, fast containers.
  • Security-Sensitive Applications: Applications that require a locked-down environment with zero unnecessary dependencies.
  • Microservices: Small, focused services that benefit from small image sizes and fast startup.
  • Custom Runtime Environments: Where base images do not offer the necessary customization.

Challenges and How to Overcome Them

Building from scratch has challenges, including:

  • Missing Libraries: If your application depends on system libraries, you must include them manually. Determining the dependencies and copying the correct versions can be tricky. Tools like ldd, which lists a binary’s dynamic library dependencies, help in identifying the required files.
  • No Shell or Utilities: Scratch images contain no shell, package manager, or basic utilities. This means no bash scripting or installing packages inside the container. Everything must be prepared before building the image.
  • Debugging Difficulties: Because the environment is so minimal, debugging can be more difficult. You can’t simply shell into the container unless you add a shell binary manually. Debugging often involves careful local testing and stepwise addition of files.

Best Practices When Building Images from Scratch

  • Statically Compile Your Binaries: Where possible, build statically linked executables so they don’t depend on external libraries. This simplifies the image and avoids missing dependency issues.
  • Include Only Essential Files: Avoid copying unnecessary files or directories. Minimalism reduces size and attack surface.
  • Test Locally: Thoroughly test your application and dependencies in a similar minimal environment before building the image.
  • Use Multi-Stage Builds: Multi-stage Docker builds allow you to compile or prepare your application in one stage with all necessary tools, then copy only the final artifacts into the scratch image, combining ease of building with minimal final image size.
  • Document Dependencies: Maintain clear documentation of what files and libraries your image requires to avoid confusion and facilitate future updates.

Multi-Stage Builds to Facilitate Scratch Images

One common strategy to simplify building from scratch is to use multi-stage builds. This means the Dockerfile contains multiple FROM instructions. The first stage uses a full-featured base image with build tools and dependencies. Your application is built here.

The final stage starts FROM scratch and copies only the built artifacts from the first stage. This approach lets you keep your final image minimal while still benefiting from a full development environment during the build.
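
As a sketch of this pattern for a Go application (the Go version, module layout, and paths are illustrative):

# Build stage: full toolchain available
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server .

# Final stage: empty image containing only the static binary
FROM scratch
COPY --from=builder /out/server /server
ENTRYPOINT ["/server"]

Disabling CGO produces a statically linked binary, which is what allows it to run in an image that contains no system libraries at all.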

Building a Docker image from scratch provides unmatched control over the image contents and size, resulting in ultra-lightweight and secure containers. While it requires detailed knowledge of your application’s dependencies and Linux system internals, the benefits include minimal attack surface, fast deployment, and efficiency.

By understanding the steps—from defining requirements, manually adding files, and configuring the entry point, to building and testing—you can create highly optimized Docker images perfectly tailored to your application. Combining this approach with best practices and tools like multi-stage builds unlocks the full potential of Docker in creating clean, production-ready containers.

Writing Dockerfile Instructions for Custom Applications

When creating Dockerfiles for custom applications, it is important to include instructions that set up the environment correctly.

Common instructions include:

  • Specifying the base image (FROM)
  • Setting the working directory (WORKDIR)
  • Copying application files (COPY)
  • Installing dependencies (RUN)
  • Defining environment variables (ENV)
  • Exposing ports (EXPOSE)
  • Setting the startup command (CMD or ENTRYPOINT)

Each of these instructions contributes to a Docker image tailored to the needs of the application.
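
Putting these together, a sketch of a Dockerfile for a Node.js web service might look like this (the port and paths are illustrative, not prescriptive):

FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY package.json yarn.lock ./
RUN yarn install --production
COPY . .
EXPOSE 3000
CMD ["node", "./src/index.js"]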

Example: Dockerfile for Python Application on Ubuntu Base

Here is an example Dockerfile for a simple Python application running on an Ubuntu base:

FROM ubuntu:latest
WORKDIR /app
COPY . .
RUN apt-get update && apt-get install -y python3 python3-pip
CMD ["python3", "app.py"]

This Dockerfile updates the Ubuntu package index, installs Python 3 and pip, copies the application files, and runs the Python script app.py when the container starts.

This example highlights how Dockerfiles provide flexibility for different programming languages and environments.

Testing and Verifying Docker Images

After building a Docker image, it is crucial to test and verify its functionality. Use the command:

docker images

to list the created images and confirm the presence of your newly built image.

Run the container using:

docker run -d -p 8080:80 your-image-name

This runs the container in detached mode and maps port 8080 on the host to port 80 inside the container. Verifying network and application behavior ensures the container works as expected.

Testing containers regularly helps maintain reliability throughout development and deployment.
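
For a quick check, assuming the containerized application serves HTTP on the mapped port, you might run:

docker ps
curl http://localhost:8080

docker ps confirms the container is running, and the curl request verifies that the application responds through the published port.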

Automating Docker Image Creation with CI/CD Pipelines

In modern software development, automating the creation of Docker images is essential for efficiency and consistency. Continuous Integration and Continuous Deployment (CI/CD) pipelines enable automatic building, testing, and deployment of container images whenever code changes are pushed to a repository.

Tools like GitHub Actions, GitLab CI, Jenkins, and others provide workflows to automate Docker image creation. These pipelines typically pull the latest code, build the Docker image using the Dockerfile, run tests within containers, and push the image to a container registry if everything succeeds.

Automation reduces human error, speeds up delivery, and ensures that the images deployed to production are always built from the latest tested code.
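
The commands a pipeline runs are essentially the same ones used locally. A simplified sketch of the core steps, with a placeholder registry address and commit variable, might be:

docker build -t registry.example.com/my-app:$COMMIT_SHA .
docker push registry.example.com/my-app:$COMMIT_SHA

Tagging each image with the commit identifier makes every build traceable back to the exact source revision that produced it.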

Understanding Dockerfile Language and Syntax

Dockerfiles are at the heart of Docker’s containerization process. They provide a set of instructions that Docker uses to build images—portable, lightweight environments that encapsulate an application and all its dependencies. Mastering the Dockerfile language and syntax is essential for creating efficient, reliable, and maintainable container images.

What Is a Dockerfile?

A Dockerfile is a simple text file that contains a series of commands and instructions. These commands define how the Docker image should be built, specifying everything from the base operating system to the application dependencies and the startup behavior of the container.

Dockerfiles are written in a domain-specific language (DSL) designed specifically for Docker. This DSL is not a general-purpose programming language, but a structured set of keywords and arguments that guide the image-building process.

Basic Structure of a Dockerfile

Each Dockerfile consists of a sequence of instructions. Each instruction begins with a keyword followed by arguments that define what Docker should do, and each one performs a particular step in building the image.

Some of the most common instructions include:

  • FROM, which specifies the base image to start with.
  • RUN, which executes a command during the build process.
  • COPY, which copies files from the host machine into the image.
  • WORKDIR, which sets the working directory for following commands.
  • CMD, which defines the default command to run when the container starts.
  • ENTRYPOINT, which defines the main executable for the container.
  • ENV, which sets environment variables.
  • EXPOSE, which declares the ports the container listens on.
  • ARG, which defines variables available during the build.

Docker processes these instructions line-by-line, creating intermediate images for each step. These layers are cached, which makes subsequent builds faster if no changes occur in those layers.

Common Dockerfile Instructions Explained

FROM

The FROM instruction is always the first line of a Dockerfile. It specifies the base image on which the rest of the image will be built. Base images can be official Linux distributions like Ubuntu or Alpine, language-specific environments such as Node.js or Python, or even custom images created previously.

For example, you might start with a lightweight Alpine Linux image that includes a specific version of Node.js.

RUN

The RUN instruction executes commands during the image build. It allows the installation of software, downloading dependencies, or configuring the system.

Each RUN instruction creates a new image layer, so it is efficient to combine multiple commands into one RUN step where possible. For example, running system package updates and installing software in a single RUN reduces the number of layers and keeps the image size smaller.
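
For example, on a Debian- or Ubuntu-based image, the following combines updating, installing, and cleaning up in a single layer (the package chosen is illustrative):

RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

If these were three separate RUN instructions, the downloaded package lists would remain baked into an earlier layer even after the cleanup step.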

COPY and ADD

COPY and ADD both copy files from the host into the image, but ADD has extra features, such as fetching files from remote URLs and automatically extracting local compressed archives. Generally, COPY is preferred when simply copying local files.

These instructions enable bringing application code, configuration files, or assets into the container environment.
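
A short contrast, with illustrative file names:

# COPY: the preferred choice for plain local files
COPY ./config/app.conf /etc/app/app.conf

# ADD: automatically extracts a local tar archive into the target directory
ADD vendor.tar.gz /opt/vendor/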

WORKDIR

WORKDIR sets the directory where subsequent commands will be executed. If the directory does not exist, Docker will create it automatically.

Setting the working directory ensures that relative paths in commands or scripts work correctly and keeps the Dockerfile cleaner.

CMD and ENTRYPOINT

CMD and ENTRYPOINT define what happens when a container starts.

CMD specifies the default command and arguments but can be overridden when running the container. ENTRYPOINT defines the executable that always runs, with CMD providing default arguments that can be replaced.

Understanding the difference between these two is important for container behavior, especially when making images reusable or configurable.
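
A common pattern, sketched here with illustrative values, is to fix the executable with ENTRYPOINT and supply overridable defaults with CMD:

ENTRYPOINT ["python3", "app.py"]
CMD ["--port", "8000"]

Running docker run my-image --port 9000 would then replace only the CMD arguments while keeping the same entry point.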

Layered File System and Build Caching

Docker images are built as layers, each representing one instruction in the Dockerfile. This layered structure allows Docker to reuse intermediate image layers that have not changed between builds, speeding up the build process.

However, the order of instructions matters. For example, if you place a COPY instruction that changes frequently before installing dependencies, Docker will invalidate the cache for all subsequent layers, forcing a full rebuild. Therefore, placing commands that rarely change near the top is a good practice.
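
The Node.js example from earlier can be reordered to take advantage of this. Copying the dependency manifests before the rest of the source (assuming a yarn.lock file is present) means the expensive install step re-runs only when the dependencies actually change:

COPY package.json yarn.lock ./
RUN yarn install --production
COPY . .

With this ordering, editing application source invalidates only the final COPY layer, and the cached dependency layer is reused.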

Best Practices in Dockerfile Syntax and Language

Writing Dockerfiles with care improves performance, maintainability, and security. Some best practices include:

  • Minimize the number of layers by combining related commands.
  • Place instructions that change less frequently towards the beginning of the Dockerfile.
  • Use official base images that are maintained and secure.
  • Exclude unnecessary files from the build context using a .dockerignore file to reduce image size and build time.
  • Use multi-stage builds to separate the build environment from the runtime environment, producing smaller final images.

Environment Variables and ARG in Dockerfiles

Dockerfiles support the use of variables to make builds more flexible.

ENV sets environment variables inside the container that persist at runtime. These can configure the application or system behavior.

ARG defines build-time variables available only during the build process. These are useful for customizing builds without hardcoding values.

Variables can be referenced throughout the Dockerfile to parameterize commands and paths.
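
As a small sketch of both (the version value is illustrative), note that an ARG declared before FROM can parameterize the base image itself:

ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine
ENV NODE_ENV=production

Building with docker build --build-arg NODE_VERSION=22 -t my-app . would then select a different base image without editing the Dockerfile.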

Comments and Documentation in Dockerfiles

Comments begin with the # character and are used to explain instructions or sections. Including comments improves readability and helps other developers understand the purpose of each step.

Well-commented Dockerfiles aid collaboration and long-term maintenance, especially in complex projects.

Common Pitfalls and Debugging Dockerfiles

Some frequent issues when writing Dockerfiles include:

  • Unnecessarily large images caused by many layers or by copying unneeded files
  • Build cache invalidation due to improper ordering of instructions
  • Missing dependencies that cause runtime failures
  • Incorrect use of CMD and ENTRYPOINT, leading to unexpected container behavior

Debugging tips include disabling the build cache to force a fresh build, inspecting intermediate layers, and running containers interactively to diagnose problems inside the container environment.
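
Two commands that often help, assuming the image actually contains a shell to drop into:

docker build --no-cache -t my-image .
docker run -it --entrypoint sh my-image

The first forces every layer to be rebuilt fresh, and the second overrides the configured entry point so you can inspect the container’s filesystem interactively.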

The Dockerfile language and syntax provide a powerful yet simple way to define container images. Each instruction plays a specific role in shaping the image, from the base system and dependencies to runtime configuration and startup commands.

Understanding how Docker processes these instructions, the impact of layering and caching, and best practices in writing Dockerfiles will help you create efficient, maintainable, and reliable containerized applications.

Mastery of Dockerfiles empowers developers and engineers to create consistent environments that enhance development speed, testing reliability, and deployment flexibility.

The Role of Go Language in Docker

Docker itself is built primarily using the Go programming language. Go was chosen for its efficiency, concurrency support, and suitability for system-level programming.

By using Go, Docker takes advantage of features in the Linux kernel such as namespaces and control groups (cgroups). Namespaces isolate processes and resources, while cgroups manage resource allocation, enabling containers to run independently and securely on the same host.

This technical foundation allows Docker to deliver lightweight containers with minimal overhead compared to traditional virtual machines.

Namespaces and Container Isolation

Namespaces are a core Linux kernel technology that Docker uses to isolate containers from each other and the host system. They provide separate views of system resources such as process IDs, network interfaces, mount points, and user IDs.

There are several types of namespaces including:

  • PID namespace: isolates process IDs
  • NET namespace: isolates network interfaces
  • MNT namespace: isolates filesystem mounts
  • UTS namespace: isolates hostname and domain name
  • IPC namespace: isolates interprocess communication

Using namespaces, each container behaves as if it has its own isolated system environment, improving security and stability.

Control Groups (cgroups) for Resource Management

Control groups, or cgroups, are another Linux kernel feature utilized by Docker to limit and monitor resource usage by containers. Cgroups allow Docker to allocate CPU, memory, disk I/O, and network bandwidth to individual containers.

This ensures that no single container can exhaust system resources and negatively impact other containers or the host system. Proper resource management is crucial for running multiple containers on shared infrastructure.
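
These limits are applied when starting a container. For example, the following (with illustrative values) caps a container at 256 MB of memory and half a CPU core:

docker run -d --memory=256m --cpus=0.5 your-image-name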

By combining namespaces and cgroups, Docker achieves effective container isolation, resource control, and performance.

Final Thoughts

Mastering Docker involves understanding the interplay between Dockerfiles, images, and containers. Dockerfiles define the instructions to create images, images are portable snapshots containing applications and their environments, and containers are live instances of those images running isolated processes.

Docker leverages Linux kernel features like namespaces and cgroups, implemented using Go, to deliver lightweight and efficient containerization technology. Automated workflows and CI/CD pipelines further enhance the development and deployment process.

This knowledge equips software developers and DevOps engineers to build modern applications with consistency, scalability, and speed.