Pull a random Node.js Dockerfile from GitHub and there is a reasonable chance it produces a 1.2 GB image. The API it serves handles simple CRUD requests. The actual compiled application logic is maybe 50 MB. The rest is node_modules, build tools, OS packages, and development dependencies that have no business being in production.
This is not a minor inefficiency. Large images slow down CI/CD pipelines, increase registry storage costs, slow deployments, and widen the attack surface of your containers. A 100 MB image deploys in about 15 seconds; a 1.2 GB image takes 2-3 minutes. Over hundreds of deployments, that adds up to hours of wasted time.
Why Images Bloat
The most common causes, in order of frequency:
1. Starting from a full OS image
# This is a 900MB+ starting point
FROM node:18
The node:18 image is based on Debian and includes a complete OS with compilers, headers, package managers, and utilities you will never use in production. The Alpine equivalent is 50 MB.
2. Installing dev dependencies in production
COPY package*.json ./
RUN npm install # installs ALL dependencies including devDependencies
npm install with no flags installs everything in package.json - test frameworks, type definitions, build tools, linters - unless NODE_ENV=production is set. A well-organized project might have 200 MB of production dependencies and 600 MB of dev dependencies.
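In a Dockerfile, the fix is a single flag (`--omit=dev` on npm 8+; older npm versions use `--production`):

```dockerfile
COPY package*.json ./
# npm ci installs exactly what the lockfile specifies (reproducible builds);
# --omit=dev skips everything listed under devDependencies
RUN npm ci --omit=dev
```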
3. Not using .dockerignore
Without a .dockerignore, COPY . . copies your node_modules/, .git/, test fixtures, local env files, and everything else in your project directory into the image. The node_modules/ folder alone is often 300-500 MB.
4. Layer bloat from not chaining commands
# Each RUN creates a new layer - the files from apt-get exist forever even after rm
RUN apt-get update
RUN apt-get install -y git
RUN rm -rf /var/lib/apt/lists/*
The final rm does not shrink the image. Each layer is immutable, so the rm only adds a new layer that marks the files as deleted - the bytes written by the earlier apt-get layers still ship with the image. You must chain these commands into a single RUN.
5. No multi-stage builds
Building a compiled artifact (TypeScript to JavaScript, Go binary, Java JAR) requires build tools that aren’t needed to run the result. Without multi-stage builds, build tools end up in your production image.
The Fixes
Use Alpine or Distroless Base Images
| Base Image | Size |
|---|---|
| node:18 | ~950 MB |
| node:18-slim | ~240 MB |
| node:18-alpine | ~55 MB |
| gcr.io/distroless/nodejs18 | ~115 MB |
Start here. node:18-alpine alone reduces your base from ~950 MB to ~55 MB. One caveat: Alpine uses musl libc, so native npm modules built against glibc can occasionally misbehave - test before committing.
Distroless images (from Google) go further - they contain only the runtime, no shell, no package manager. Harder to debug but maximally minimal and secure.
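A hedged sketch of what a distroless variant might look like for a plain Node.js app (the exact tag varies by release; gcr.io/distroless/nodejs18-debian12 is one published variant, and the distroless Node images set node as the entrypoint, so CMD is just the script path):

```dockerfile
# Stage 1: install production dependencies on an image that has npm
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

# Stage 2: distroless runtime - no shell, no npm, no package manager
FROM gcr.io/distroless/nodejs18-debian12
WORKDIR /app
COPY --from=builder /app /app
# entrypoint is already `node`; CMD supplies the script to run
CMD ["index.js"]
```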
Multi-Stage Builds
The right pattern for a TypeScript application:
# Stage 1: Build
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 2: Production
FROM node:18-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/index.js"]
What this does:
- Stage 1 installs everything and builds the TypeScript
- Stage 2 starts fresh with only production dependencies
- The build tools, TypeScript source, and dev dependencies never make it into the final image
For Go, the improvement is even more dramatic:
# Stage 1: Build
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o main ./cmd/server
# Stage 2: Production
FROM scratch
COPY --from=builder /app/main /main
EXPOSE 8080
CMD ["/main"]
FROM scratch is literally an empty image. A Go binary compiled with CGO_ENABLED=0 is fully self-contained. The resulting image is the size of the binary - typically 5-15 MB.
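One caveat worth knowing: scratch contains no CA certificate bundle, so any outbound HTTPS call from the binary will fail TLS verification. If your server makes external requests, copy the certificates in from the builder stage - a minimal sketch:

```dockerfile
FROM golang:1.22-alpine AS builder
# ca-certificates provides the trusted root bundle that scratch lacks
RUN apk add --no-cache ca-certificates
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o main ./cmd/server

FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/main /main
CMD ["/main"]
```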
.dockerignore Is Non-Negotiable
node_modules/
.git/
.gitignore
*.md
*.log
.env
.env.*
dist/
build/
coverage/
.nyc_output/
test/
tests/
__tests__/
*.test.ts
*.spec.ts
.eslintrc*
.prettierrc*
jest.config.*
# note: do not ignore tsconfig*.json if the builder stage runs tsc - the build needs it
This prevents COPY . . from pulling in hundreds of MB of files that don’t belong in the image.
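An alternative some teams prefer is an allowlist: ignore everything, then re-include only what the build needs with `!` negation entries. A sketch for a TypeScript project (the re-included paths are assumptions about your layout):

```
# ignore everything by default
*
# re-include only what the builder stage actually needs
!package.json
!package-lock.json
!tsconfig.json
!src/
```

New files added to the project are excluded by default, so the build context can never silently bloat again.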
Chain Your RUN Commands
# Wrong
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
# Right
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
rm -rf /var/lib/apt/lists/*
The --no-install-recommends flag prevents apt from pulling in suggested packages you didn’t ask for.
Size Targets by Application Type
| Application Type | Reasonable Production Size |
|---|---|
| Simple Node.js API | 80-150 MB |
| Python Flask/FastAPI app | 100-200 MB |
| Go HTTP server | 10-25 MB |
| Java Spring Boot | 150-300 MB |
| Static site (nginx) | 25-50 MB |
If your image is more than 3x these numbers, something is wrong.
Audit Your Existing Images
# See layer sizes
docker history my-image:latest
# Dive tool - interactive layer explorer
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock wagoodman/dive my-image:latest
dive is the single most useful tool for understanding why an image is large. It shows you what each layer adds, which files are duplicated, and what percentage of the image is “wasted” by layer artifacts.
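dive also has a CI mode (run as `dive --ci my-image:latest`) that fails the build when an image gets too wasteful. A sketch of a `.dive-ci` config - the rule names come from dive's documentation, the thresholds here are illustrative:

```yaml
rules:
  # fail if less than 90% of image bytes are "efficient" (not duplicated or overwritten)
  lowestEfficiency: 0.9
  # fail if more than 20 MB is wasted by layer artifacts
  highestWastedBytes: 20MB
  # fail if wasted bytes exceed 10% of the total image size
  highestUserWastedPercent: 0.10
```

Wiring this into CI turns image size from something you audit occasionally into something that cannot regress unnoticed.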
Bottom Line
A 100 MB production image versus a 1.2 GB production image is a 12x improvement in deploy time, a 12x reduction in registry bandwidth, and a meaningfully smaller attack surface. The techniques required - multi-stage builds, Alpine base images, proper .dockerignore, chained RUN commands - take about 30 minutes to implement once and benefit every deployment forever. There is no good reason not to do this.