Docker Container Forensics: Understanding Layer Persistence
By Théophile Reverdell
Introduction
In the world of cybersecurity, digital forensics plays a crucial role in investigating security incidents. With the massive adoption of Docker containers in production environments, understanding how to analyze and investigate these containers has become an essential skill.
This article explores the fundamental concepts of Docker container forensics, particularly the functioning of layered architecture and its security implications.
What is Container Forensics?
Container forensics is the art of analyzing Docker images and containers to:
- Identify security vulnerabilities
- Recover sensitive or deleted data
- Understand the build history of an image
- Detect backdoors or malware
- Investigate security incidents
Docker’s Layered Architecture
Operating Principle
Docker uses a layered filesystem in which each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) creates a new layer; instructions such as ENV or CMD only add metadata. These layers are:
- Immutable: Once created, a layer cannot be modified
- Stacked: Layers are superimposed on top of each other
- Reusable: Common layers can be shared between images
- Persistent: Data remains in its original layer
Build Example
Take a simple Dockerfile:
FROM debian:latest
RUN apt update && apt install -y curl
COPY secret.txt /app/
RUN rm /app/secret.txt
Each line creates a distinct layer:
- Layer 1: Debian base image
- Layer 2: Curl installation
- Layer 3: secret.txt copy
- Layer 4: secret.txt deletion
The “Deletion” Trap
A Key Forensics Concept
Here’s the crucial point that every security professional must understand: when a file is “deleted” in Docker, it is not actually erased.
The rm command in Layer 4 does not physically delete secret.txt from Layer 3. It simply creates a “whiteout” marker in the overlay filesystem that hides the file in the running container.
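The effect can be reproduced without Docker. The following is a minimal Python sketch, assuming the whiteout convention used in exported image layers, where a deletion is recorded as an empty file named `.wh.<name>`. The helper names (`make_layer`, `read_file`) and the secret value are hypothetical and exist only for illustration.

```python
import io
import tarfile


def make_layer(files):
    """Build an in-memory layer tarball from a {path: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, data in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_file(layer_bytes, path):
    """Return the contents of `path` in a layer tarball, or None if absent."""
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r") as tar:
        try:
            member = tar.getmember(path)
        except KeyError:
            return None
        return tar.extractfile(member).read()


# Layer 3: COPY secret.txt /app/
layer3 = make_layer({"app/secret.txt": b"API_KEY=hypothetical-value\n"})
# Layer 4: RUN rm /app/secret.txt records only an empty whiteout entry
layer4 = make_layer({"app/.wh.secret.txt": b""})

# The file is gone from layer 4, but layer 3 still holds it in full.
print(read_file(layer4, "app/secret.txt"))
print(read_file(layer3, "app/secret.txt"))
```

Layer 4 carries only the marker; nothing in it touches the bytes stored in Layer 3.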
Security Implications
This characteristic has major implications:
- Data recovery: Any file added in one layer of an image remains accessible, even if a later layer “deletes” it
- Exposed secrets: Passwords, API keys, or certificates added temporarily remain in the image
- Audit trail: The complete build history is preserved in the image’s metadata
Anatomy of a Docker Image
File Structure
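An exported Docker image (.tar) contains the following (simplified; real file names are content hashes, and the layout varies with the Docker version):
my-image.tar/
├── manifest.json # Image metadata
├── config.json # Configuration and history
├── layer1.tar # Layer 1
├── layer2.tar # Layer 2
└── layer3.tar # Layer 3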
The Manifest File
The manifest.json contains:
- The ordered list of layers
- Image tags
- The configuration file
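As a rough illustration, the manifest can be read with a few lines of Python. This sketch assumes the `Config`, `RepoTags`, and `Layers` keys written by `docker save`; the sample content and the `summarize_manifest` helper are hypothetical, and real entries use long content hashes rather than these short names.

```python
import json

# Hypothetical manifest.json content; real paths are content hashes.
SAMPLE_MANIFEST = """
[
  {
    "Config": "config.json",
    "RepoTags": ["my-image:latest"],
    "Layers": ["layer1.tar", "layer2.tar", "layer3.tar"]
  }
]
"""


def summarize_manifest(manifest_text):
    """Return (tags, config_path, ordered_layers) for each image listed."""
    summaries = []
    for entry in json.loads(manifest_text):
        # RepoTags can be null for untagged images, so fall back to [].
        tags = entry.get("RepoTags") or []
        summaries.append((tags, entry["Config"], entry["Layers"]))
    return summaries


for tags, config_path, layers in summarize_manifest(SAMPLE_MANIFEST):
    print("tags:", tags)
    print("config:", config_path)
    for index, layer in enumerate(layers, start=1):
        print(f"  layer {index} (bottom-up): {layer}")
```

The order of `Layers` matters: the first entry is the base layer, and each later entry is stacked on top of it.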
The Configuration File
The JSON configuration file reveals:
- Complete command history (docker history)
- Environment variables
- Build metadata
- Entry points and commands
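To connect the history to actual layers, the sketch below pairs each filesystem layer with the command that produced it. It assumes the `history`, `created_by`, and `empty_layer` fields of the image configuration; the sample configuration is hypothetical and heavily trimmed, and `layer_commands` is an illustrative helper.

```python
import json

# Hypothetical, trimmed image configuration resembling the earlier Dockerfile.
SAMPLE_CONFIG = json.dumps({
    "config": {"Env": ["PATH=/usr/local/bin:/usr/bin"], "Cmd": ["bash"]},
    "history": [
        {"created_by": "ADD rootfs.tar.xz / # debian base"},
        {"created_by": "CMD [\"bash\"]", "empty_layer": True},
        {"created_by": "RUN apt update && apt install -y curl"},
        {"created_by": "COPY secret.txt /app/"},
        {"created_by": "RUN rm /app/secret.txt"},
    ],
})


def layer_commands(config_text):
    """Pair each filesystem layer (1-based, bottom-up) with its build command."""
    history = json.loads(config_text).get("history", [])
    # Entries flagged empty_layer changed only metadata and have no tarball.
    commands = [h.get("created_by", "") for h in history if not h.get("empty_layer")]
    return list(enumerate(commands, start=1))


for number, command in layer_commands(SAMPLE_CONFIG):
    print(f"layer {number}: {command}")
```

A COPY of a secret followed by an rm, as in layers 3 and 4 here, is exactly the pattern worth flagging during a review.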
Forensic Analysis Techniques
1. History Inspection
The docker history command reveals all build steps:
docker history my-image:latest
This command displays each layer with the command that created it, allowing identification of suspicious practices.
2. Layer Extraction
Each layer can be extracted and analyzed individually:
# Export the image
docker save my-image:latest -o image.tar
# Extract the archive
tar -xf image.tar
# List layer contents (substitute a layer path listed in manifest.json)
tar -tf layer.tar
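The same walk can be scripted. Below is a minimal Python sketch that reads the manifest inside a `docker save` archive and lists the members of each layer in order. It assumes the archive has a top-level manifest.json whose `Layers` entries are paths inside the archive; `list_layers` and `_tar_bytes` are hypothetical helpers, and the demo archive is synthetic so the example runs without Docker.

```python
import io
import json
import tarfile


def _tar_bytes(files):
    """Pack a {name: bytes} mapping into an uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_layers(image_tar):
    """Yield (layer_path, member_names) for each layer, bottom layer first.

    `image_tar` is a seekable binary file object holding `docker save` output.
    """
    with tarfile.open(fileobj=image_tar, mode="r") as image:
        manifest = json.load(image.extractfile("manifest.json"))
        for layer_path in manifest[0]["Layers"]:
            layer_file = image.extractfile(layer_path)
            # "r:*" lets tarfile detect compressed layers transparently.
            with tarfile.open(fileobj=layer_file, mode="r:*") as layer:
                yield layer_path, layer.getnames()


# Synthetic stand-in for image.tar (assumption: two tiny layers).
demo_manifest = [{"Config": "config.json", "RepoTags": ["my-image:latest"],
                  "Layers": ["layer1.tar", "layer2.tar"]}]
demo_image = _tar_bytes({
    "manifest.json": json.dumps(demo_manifest).encode(),
    "layer1.tar": _tar_bytes({"app/secret.txt": b"hypothetical secret\n"}),
    "layer2.tar": _tar_bytes({"app/.wh.secret.txt": b""}),
})

for path, names in list_layers(io.BytesIO(demo_image)):
    print(path, names)
```

For a real export, open the file in binary mode, for example `open("image.tar", "rb")`, and pass the handle to `list_layers`.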
3. Content Analysis
Once layers are extracted, you can:
- Search for sensitive files
- Analyze scripts and configurations
- Identify malicious binaries
- Recover “deleted” data
4. Using Specialized Tools
Tools like dive allow interactive exploration of image layers:
dive my-image:latest
Security Best Practices
1. Multi-Stage Builds
Use multi-stage builds to avoid including sensitive data in the final image:
# Build stage
FROM node:16 AS builder
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
RUN npm run build
# Production stage (NEW clean image)
FROM node:16-slim
COPY --from=builder /app/dist /app
CMD ["node", "app/server.js"]
Only the last stage ends up in the final image; files from earlier stages are included only if they are explicitly copied with COPY --from.
2. Distroless Images: Security Through Minimalism
What is a Distroless Image?
Google introduced the concept of distroless images: container images that contain only your application and its strictly necessary runtime dependencies, without a complete Linux distribution.
What’s absent in a distroless image:
- No shell (bash, sh)
- No package manager (apt, yum)
- No system utilities (ls, cat, wget, curl)
- No unnecessary system libraries
Security and Forensic Advantages
1. Reduced attack surface
# Traditional image: ~100-500 MB with hundreds of packages
FROM node:16
# Distroless image: ~50-100 MB with only the runtime
FROM gcr.io/distroless/nodejs:16
Fewer packages means:
- Fewer potential vulnerabilities (CVEs)
- Fewer attack vectors for intruders
- Fewer binaries usable by an attacker
2. Protection against command execution
Without a shell, an attacker who compromises your application cannot easily:
- Execute arbitrary commands (/bin/sh -c "malicious_command")
- Download additional tools
- Explore the filesystem interactively
- Launch reverse shells
3. Complexity for attackers
The absence of system tools makes post-exploitation analysis much more difficult:
- Unable to use ps to list processes
- No netstat to identify network connections
- No find to explore the filesystem
Practical Example with Multi-Stage Build
# Stage 1: Build with all necessary tools
FROM node:16 AS builder
WORKDIR /app
COPY package*.json ./
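# Install all dependencies: the build step usually needs dev dependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop dev dependencies before node_modules is copied to the final image
RUN npm prune --omit=dev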
# Stage 2: Final distroless image
FROM gcr.io/distroless/nodejs:16
WORKDIR /app
# Copy only necessary files from builder
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
# Run as the non-root user shipped with distroless (UID 65532); the default user is root
USER nonroot
# Entry point
CMD ["dist/server.js"]
Available Distroless Images
Google provides distroless images for several runtimes:
# Base (libc only)
FROM gcr.io/distroless/base-debian11
# Node.js
FROM gcr.io/distroless/nodejs:16
# Python
FROM gcr.io/distroless/python3
# Java
FROM gcr.io/distroless/java17
# Go (static)
FROM gcr.io/distroless/static-debian11
3. Using Docker Secrets
For sensitive data in production, Docker secrets (available to Swarm services) keep values out of the image:
# Create a secret
printf '%s' "my_password" | docker secret create db_password -
# Use it in a service
docker service create \
--secret db_password \
--name my-app \
my-image:latest
4. Runtime Environment Variables
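Inject secrets at runtime, never at build time. Values passed with -e remain visible in docker inspect, so prefer secrets for highly sensitive data: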
docker run -e DATABASE_PASSWORD="${DB_PASS}" my-image:latest
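On the application side, both approaches can be handled by one small loader. This is a sketch under stated assumptions: Swarm secrets are mounted as files under /run/secrets, and the environment variable is a fallback. The `load_secret` function and its parameters are hypothetical and not part of any Docker API.

```python
import os
import tempfile
from pathlib import Path


def load_secret(name, env_var, secrets_dir="/run/secrets", environ=os.environ):
    """Return a secret from a mounted secret file, else from the environment."""
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        return secret_file.read_text().strip()
    value = environ.get(env_var)
    if value is None:
        raise RuntimeError(f"secret {name!r} was not provided at runtime")
    return value


# Simulate the file that Swarm would mount for `--secret db_password`.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "db_password").write_text("from-secret-file\n")
    print(load_secret("db_password", "DATABASE_PASSWORD", secrets_dir=tmp))

# Simulate `docker run -e DATABASE_PASSWORD=...` with no secret file present.
print(load_secret("db_password", "DATABASE_PASSWORD",
                  secrets_dir="/nonexistent",
                  environ={"DATABASE_PASSWORD": "from-env"}))
```

Because the value is resolved only when the container starts, nothing sensitive is written into any image layer.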
5. .dockerignore File
Use .dockerignore to avoid copying sensitive files:
.git/
.env
secrets/
*.key
*.pem
6. Security Scanning
Regularly scan your images with tools like:
- Trivy: Vulnerability scanner
- Clair: Static security analysis
- Grype: Vulnerability detection
trivy image my-image:latest
Forensic Use Cases
Incident Response
During a security incident involving a container:
- Preservation: Save the compromised image
- Extraction: Export and extract all layers
- Analysis: Examine the history and content of each layer
- Timeline: Reconstruct the event timeline
- IOC: Identify indicators of compromise
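For the preservation step, it is common forensic practice to record a cryptographic hash of the saved archive so that later analysis can show the evidence was not altered. The sketch below streams a file through SHA-256; the `sha256_of_file` helper is hypothetical, and the demo uses a temporary file as a stand-in for an exported image.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large image archives do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in for an exported image archive (assumption: any file works here).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"stand-in for docker save output")
    evidence_path = tmp.name

evidence_digest = sha256_of_file(evidence_path)
print(evidence_path, evidence_digest)
os.remove(evidence_path)
```

Recomputing the digest before each analysis session and comparing it with the recorded value gives a simple integrity check.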
Security Audit
To audit an image before deployment:
- Verify the image origin (trusted registry)
- Analyze the build history
- Search for hardcoded secrets
- Identify known vulnerabilities
- Validate security best practices
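Part of the search for hardcoded secrets can target the environment variables baked into the image configuration. The following sketch assumes the `config.Env` list found in image configuration JSON (the same data shown by docker inspect); the key-name pattern, the sample configuration, and the `suspicious_env` helper are illustrative assumptions.

```python
import json
import re

# Illustrative key-name pattern; extend it for a real audit.
SUSPICIOUS_KEY = re.compile(r"(PASSWORD|PASSWD|SECRET|TOKEN|API_?KEY)", re.IGNORECASE)

# Hypothetical, trimmed image configuration.
SAMPLE_CONFIG = json.dumps({
    "config": {
        "Env": [
            "PATH=/usr/local/bin:/usr/bin",
            "NODE_ENV=production",
            "DATABASE_PASSWORD=hypothetical",
            "API_KEY=",
        ]
    }
})


def suspicious_env(config_text):
    """Return names of baked-in environment variables that look like secrets."""
    env = json.loads(config_text).get("config", {}).get("Env") or []
    flagged = []
    for item in env:
        key, _, value = item.partition("=")
        # An empty value is only a placeholder, so it is not reported.
        if value and SUSPICIOUS_KEY.search(key):
            flagged.append(key)
    return flagged


print(suspicious_env(SAMPLE_CONFIG))
```

Any variable reported here was set at build time and is readable by anyone who can pull the image.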
Tools and Resources
Analysis Tools
- dive: Interactive layer exploration
- docker-explorer: Forensic framework for containers
- grype/trivy: Vulnerability scanners
- docker-bench-security: Docker configuration audit
Useful Commands
# Inspect an image
docker inspect my-image:latest
# View history
docker history --no-trunc my-image:latest
# Export an image
docker save my-image:latest -o backup.tar
# Analyze with dive
dive my-image:latest
# Scan with trivy
trivy image my-image:latest
Conclusion
Docker container forensics is an essential discipline in the modern cybersecurity landscape. Understanding layered architecture and its implications allows you to:
- Better secure your Docker images
- Effectively investigate incidents
- Avoid sensitive data leaks
- Audit images before deployment
Key takeaways:
- Docker layers are immutable and persistent
- “Deletion” doesn’t actually remove data from previous layers
- Any data added to an image can be recovered
- Use multi-stage builds and distroless images to reduce attack surface
- Distroless images eliminate system tools, making exploitation more difficult
- Regularly audit your images with specialized tools
Container security begins with understanding how they work internally. By applying these principles and best practices, you’ll build more secure Docker images and maintain a robust container environment.
This article is based on forensic analysis concepts practiced on the Root-Me platform, an excellent resource for learning cybersecurity in a practical way.