
My 5GB Docker Build: How I Got It Down to 50MB
r5yn1r4143
11h ago
Okay, so picture this: it was my first week diving headfirst into Docker. I was super hyped, ready to package up my little web app and impress everyone with my newfound container wizardry. I meticulously wrote my Dockerfile, feeling like a coding ninja. I ran docker build ., expecting a sleek, lean image. Then the build finished, and I nervously typed docker images. My jaw hit the floor. 5 GIGABYTES. My app was, like, a few HTML files and a tiny Python script. How on earth did it balloon to the size of a small operating system? This was my "Oops IT" moment, and trust me, it was a big one.
TL;DR: The Big Bloat and the Tiny Fixes
My first Docker build was a whopping 5GB because I basically copied my entire development environment into the image. I learned to:
Use a lean base image: Alpine Linux is your friend.
Copy only what you need: Use .dockerignore and specific COPY commands.
Clean up after yourself: Remove unnecessary files and cache after installation.
Multi-stage builds: Separate build dependencies from the final runtime image.
Understand layers: Each command creates a layer; be mindful of how many you're adding.
The "Why Is It So Big?!" Investigation
So, how did we get here? My initial Dockerfile was... enthusiastic. It looked something like this:
```dockerfile
# My first, bloated Dockerfile
FROM ubuntu:latest
WORKDIR /app

# Copy everything from my local dev environment
COPY . /app

RUN apt-get update && apt-get install -y \
    python3 \
    python3-pip \
    # And a bunch of other dev tools I thought I might need
    git \
    vim \
    curl \
    # ... and so on
    && pip install -r requirements.txt

EXPOSE 8000
CMD ["python3", "app.py"]
```
When I ran docker build ., the output was long, filled with Downloading... for every single package. Ubuntu itself is already a few hundred MB. Then I copied my entire project directory, which included node_modules (a classic bloater!), virtual environments, IDE config files, and probably a few stray cat memes I'd saved. The apt-get install added even more. The requirements.txt might have pulled in heavy libraries. Each command in the Dockerfile creates a new layer. So, my base Ubuntu layer, the copy layer, the apt-get layer, the pip install layer – they all stacked up, and the final image size reflected the sum of all these parts.
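Since every instruction adds a layer, one habit that helped me was chaining related shell steps into a single RUN. A minimal sketch (the package name is just a placeholder):

```dockerfile
# Three RUN lines = three layers; files deleted in the last
# layer still take up space in the earlier ones
RUN apt-get update
RUN apt-get install -y python3
RUN rm -rf /var/lib/apt/lists/*

# One chained RUN = one layer; the cleanup actually shrinks the image
RUN apt-get update \
    && apt-get install -y python3 \
    && rm -rf /var/lib/apt/lists/*
```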
The common error message I'd see if I messed up dependencies during pip install was something like:
```
ERROR: Could not find a version that satisfies the requirement some-super-heavy-package (from versions: none)
No matching distribution found for some-super-heavy-package
```
This wasn't about size directly, but it highlighted how easily dependencies could creep in. The real size issue came from just blindly copying and installing everything.
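One habit that helped me keep dependencies from creeping in was pinning exact versions in requirements.txt (the package names below are just illustrative):

```text
# requirements.txt — pin exact versions so every build resolves
# the same packages instead of silently pulling in newer, heavier ones
flask==3.0.0
gunicorn==21.2.0
```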
Shrinking Down: The Diet Plan for My Docker Image
This is where the real learning happened. I realized I needed to treat my Docker image like a minimalist traveler, packing only essentials.
#### 1. Lean Base Image: Ditching Ubuntu for Alpine
My first step was swapping ubuntu:latest for something much lighter. Alpine Linux is the go-to for this. It's a security-focused, lightweight Linux distribution.
Change:

```dockerfile
# was: FROM ubuntu:latest
FROM alpine:latest
```
Alpine uses apk as its package manager, which is super efficient.
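Here's a hedged sketch of what the swap looks like in practice (pinning a specific tag instead of latest is my own addition, but it keeps builds reproducible):

```dockerfile
# The alpine base is only a few MB, vs a few hundred MB for ubuntu
FROM alpine:3.19
# apk is Alpine's package manager; --no-cache skips storing the index
RUN apk add --no-cache python3 py3-pip
```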
#### 2. Selective Copying: The .dockerignore Magic
I was copying way too much. The .dockerignore file is your best friend here. It works just like .gitignore, telling Docker which files and directories to exclude from the build context (the files sent to the Docker daemon).
Create a .dockerignore file:

```
.git
.vscode
__pycache__
*.pyc
*.pyo
*.pyd
*.db
*.sqlite3
venv/
node_modules/
docker-compose.yml
README.md
```
Then, I refined my COPY command. Instead of COPY . /app, I'd be more specific:
```dockerfile
# ... (after setting WORKDIR)
COPY requirements.txt /app/
RUN pip install --no-cache-dir -r requirements.txt
COPY . /app/
```
This copies requirements.txt first, installs the dependencies, and then copies the rest of the application code. This is crucial for layer caching – if only your app code changes, Docker won't re-run pip install.
#### 3. Cleaning Up: No Leftovers!
After installing packages with apt-get or apk, there's often a lot of cache data left behind that you don't need in the final image. You can clean this up within the same RUN command to avoid creating extra layers.
For Ubuntu (less relevant now, but good to know):
```dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends \
    some-package \
    && rm -rf /var/lib/apt/lists/*
```
The --no-install-recommends flag is also a great way to avoid installing optional dependencies.

For Alpine:
```dockerfile
RUN apk add --no-cache \
    python3 \
    py3-pip \
    && pip3 install --no-cache-dir -r requirements.txt
```

(Note: a separate apk update isn't needed here — it would write the package index to /var/cache/apk, which is exactly the cache we're trying to avoid. apk add --no-cache fetches a fresh index on its own.)
The --no-cache flag for apk add is essential for cleaning up the package cache immediately. And --no-cache-dir for pip prevents pip from storing downloaded packages.

#### 4. Multi-Stage Builds: The Ultimate Space Saver
This was the game-changer for more complex applications, especially those involving compilation or build steps. Multi-stage builds let you use one container image to build your application and another, completely clean image to run it.
My final Dockerfile started looking like this:
```dockerfile
# --- Builder Stage ---
FROM alpine:latest AS builder
WORKDIR /app

# Install build dependencies
RUN apk add --no-cache \
    python3 \
    py3-pip \
    build-base  # Example build tool

# Install Python packages into a virtual environment so the
# final stage can copy them without dragging build tools along
RUN python3 -m venv /venv
COPY requirements.txt .
RUN /venv/bin/pip install --no-cache-dir -r requirements.txt

COPY . .

# --- Final Stage ---
FROM alpine:latest
WORKDIR /app

# Install only runtime dependencies
RUN apk add --no-cache python3

# Copy the virtual environment and the app code from the builder
COPY --from=builder /venv /venv
COPY --from=builder /app /app

EXPOSE 8000
CMD ["/venv/bin/python3", "app.py"]
```

The heavy build-base toolchain lives only in the builder stage; the final image ships just the Python runtime, the installed packages, and my app code.