In today's containerized world, effectively monitoring Docker containers has become a critical skill for DevOps engineers and system administrators. While containers offer excellent isolation and resource efficiency, their ephemeral nature presents unique monitoring challenges. This comprehensive guide explores practical approaches to Docker container monitoring with real-world examples and commands you can implement today.
Containers are lightweight, portable, and designed to be ephemeral: they can spin up and down in seconds. This dynamic nature makes traditional monitoring approaches insufficient.
Let's start with the monitoring capabilities that come with Docker itself:
The docker stats command provides a real-time view of container resource usage:
docker stats
For specific containers:
docker stats mongodb nginx-proxy
You can even format the output using Go templates:
docker stats --format "{{.Name}}: {{.CPUPerc}} CPU, {{.MemUsage}} MEM"
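The same Go-template output is easy to post-process. As a minimal sketch (the helper name over_cpu_threshold and the threshold value are illustrative assumptions, not part of Docker), the following filters a "name CPU%" listing down to the containers above a CPU limit:

```shell
# Hypothetical helper: reads lines of "<name> <cpu>%" on stdin and prints
# those whose CPU percentage exceeds the limit passed as $1.
over_cpu_threshold() {
  awk -v limit="$1" '{
    cpu = $2
    sub(/%$/, "", cpu)                 # strip the trailing percent sign
    if (cpu + 0 > limit + 0) print $1, $2
  }'
}
```

It could be fed from a one-shot snapshot, for example: docker stats --no-stream --format "{{.Name}} {{.CPUPerc}}" | over_cpu_threshold 80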
The docker events command lets you monitor container lifecycle events in real time:
docker events
You can filter events for specific containers or event types:
docker events --filter 'container=mongodb' \
--filter 'event=health_status'
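Because the event stream is line-oriented, it can drive simple reactions. The sketch below assumes each event has been formatted as "<container-name> <action>"; the helper name alert_on_events and the choice of which actions count as trouble are illustrative assumptions:

```shell
# Hypothetical helper: reads "<container-name> <action>" lines on stdin and
# prints an ALERT line for actions that usually indicate a problem.
alert_on_events() {
  while read -r name action; do
    case "$action" in
      die|oom|"health_status: unhealthy")
        echo "ALERT $name $action" ;;
    esac
  done
}
```

One way to produce the expected input is: docker events --filter type=container --format '{{.Actor.Attributes.name}} {{.Action}}' | alert_on_events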
Container logs are essential for troubleshooting. The docker logs command provides access to these logs:
docker logs mongodb
For continuous monitoring:
docker logs -f mongodb
View only the last 50 lines:
docker logs --tail 50 test
Show logs since a specific timestamp:
docker logs --since 2025-02-27T12:00:00 test
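Time-bounded log queries combine well with a quick error count. This is only a sketch: the helper name count_error_lines is hypothetical, and matching the word "error" case-insensitively is an assumption about how the application logs failures:

```shell
# Hypothetical helper: counts stdin lines containing "error" (any case).
# grep exits non-zero when nothing matches, so "|| true" keeps the
# function's exit status at 0 while still printing the count (0).
count_error_lines() {
  grep -ci 'error' || true
}
```

For example: docker logs --since 10m test 2>&1 | count_error_lines (the 2>&1 is there because docker logs replays the container's stderr on stderr).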
For production environments, consider using the --log-opt flag when starting containers to implement log rotation:
docker run --name test \
--log-opt max-size=10m \
--log-opt max-file=3 \
-d test:1.23
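These two options bound how much disk a container's logs can use: roughly max-size multiplied by max-file. A tiny sketch of that arithmetic (log_cap_mb is a hypothetical helper, and both arguments are assumed to be whole numbers, with the size in megabytes):

```shell
# Hypothetical helper: approximate upper bound, in MB, on the disk space
# used by one container's rotated logs.
#   $1 = max-size in megabytes, $2 = max-file
log_cap_mb() {
  echo $(( $1 * $2 ))
}
```

With the settings above, log_cap_mb 10 3 prints 30, so each container started this way keeps at most about 30 MB of logs.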
While Docker's built-in tools are useful for quick checks, a comprehensive monitoring strategy requires dedicated tools.
Google's cAdvisor (Container Advisor) is purpose-built for container monitoring. Here's how to deploy it:
docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:ro \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/dev/disk/:/dev/disk:ro \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
gcr.io/cadvisor/cadvisor:v0.47.0
Once deployed, access cAdvisor's web interface at http://localhost:8080 to view detailed container metrics, including CPU, memory, network, and filesystem usage.
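cAdvisor also serves the same data in Prometheus text format at /metrics, which is convenient for quick command-line checks. The following sketch (container_memory_mb is a hypothetical helper name) extracts per-container memory usage in MiB from that output, assuming the container_memory_usage_bytes metric and its name label as cAdvisor emits them for Docker containers:

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# prints "<container-name> <memory in MiB>" for every sample of
# container_memory_usage_bytes that has a non-empty name label.
container_memory_mb() {
  awk '
    /^container_memory_usage_bytes\{/ {
      # Pull out the name label; skip samples where it is missing or empty.
      if (match($0, /[,{]name="[^"]+"/)) {
        label = substr($0, RSTART, RLENGTH)
        sub(/^[,{]name="/, "", label)
        sub(/"$/, "", label)
        # The sample value is the first field after the closing brace.
        rest = $0
        sub(/^[^}]*\} */, "", rest)
        split(rest, parts, " ")
        printf "%s %.1f\n", label, parts[1] / 1048576
      }
    }'
}
```

For example: curl -s http://localhost:8080/metrics | container_memory_mb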
For more robust monitoring, the Prometheus and Grafana stack is the industry standard:
Create a docker-compose.yml file:
version: '3.8'
services:
  prometheus:
    image: prom/prometheus:v2.45.0
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
    ports:
      - 9090:9090
    restart: unless-stopped
  node-exporter:
    image: prom/node-exporter:v1.6.0
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    ports:
      - 9100:9100
    restart: unless-stopped
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.47.0
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    ports:
      - 8080:8080
    restart: unless-stopped
  grafana:
    image: grafana/grafana:10.1.0
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - prometheus
    ports:
      - 3000:3000
    restart: unless-stopped
volumes:
  prometheus_data: {}
  grafana_data: {}
Create a prometheus.yml configuration file:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
  - job_name: 'docker'
    metrics_path: /metrics
    static_configs:
      - targets: ['172.17.0.1:9323']
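Enable the Docker daemon's metrics endpoint by adding the following to /etc/docker/daemon.json:
{
  "metrics-addr": "0.0.0.0:9323",
  "experimental": true
}
Restart Docker so the change takes effect:
sudo systemctl restart docker
Start the monitoring stack:
docker-compose up -d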
Access Prometheus at http://localhost:9090 and Grafana at http://localhost:3000 (default credentials: admin/admin). In Grafana, import a Docker dashboard template, for example dashboard ID 893 for a comprehensive Docker monitoring dashboard.
For comprehensive log management, the ELK (Elasticsearch, Logstash, Kibana) stack is invaluable:
Create a docker-compose-elk.yml file:
version: '3.8'
services:
  elasticsearch:
    image: elasticsearch:8.10.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - 9200:9200
    volumes:
      - elasticsearch_data:/usr/share/elasticsearch/data
    restart: unless-stopped
  logstash:
    image: logstash:8.10.0
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    ports:
      - 5044:5044
      - 5000:5000/tcp
      - 5000:5000/udp
      # GELF input defined in logstash.conf; must be published so the
      # Docker daemon on the host can reach it at udp://localhost:12201
      - 12201:12201/udp
    depends_on:
      - elasticsearch
    restart: unless-stopped
  kibana:
    image: kibana:8.10.0
    ports:
      - 5601:5601
    depends_on:
      - elasticsearch
    restart: unless-stopped
volumes:
  elasticsearch_data: {}
Create a logstash.conf file:
input {
  gelf {
    port => 12201
    type => docker
  }
}
filter {
  if [type] == "docker" {
    mutate {
      add_field => {
        "container_name" => "%{[container][name]}"
        "container_id" => "%{[container][id]}"
        "image_name" => "%{[image][name]}"
      }
    }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "docker-logs-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
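Configure Docker to send container logs to Logstash by updating /etc/docker/daemon.json:
{
  "metrics-addr": "0.0.0.0:9323",
  "experimental": true,
  "log-driver": "gelf",
  "log-opts": {
    "gelf-address": "udp://localhost:12201",
    "tag": "{{.Name}}/{{.ID}}"
  }
}
Restart Docker so the new log driver applies to newly created containers, then start the ELK stack:
sudo systemctl restart docker
docker-compose -f docker-compose-elk.yml up -d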
Access Kibana at http://localhost:5601 to search, visualize, and analyze your container logs.
Sometimes, you need custom monitoring solutions. Here's a simple Bash script that monitors container health and sends alerts:
#!/bin/bash
# Container to monitor
CONTAINER_NAME="critical-app"
# Slack webhook for notifications
SLACK_WEBHOOK="https://hooks.slack.com/services/XXXXX/YYYYY/ZZZZZ"

# Post a message to the Slack webhook
notify() {
  curl -X POST -H 'Content-type: application/json' \
    --data "{\"text\":\"$1\"}" "$SLACK_WEBHOOK"
}

# Check container status; docker inspect fails if the container does not exist
if ! STATUS=$(docker inspect --format='{{.State.Status}}' "$CONTAINER_NAME" 2>/dev/null); then
  notify "CRITICAL: Container $CONTAINER_NAME not found!"
  exit 1
fi

if [ "$STATUS" != "running" ]; then
  # Get exit code if container stopped
  EXIT_CODE=$(docker inspect --format='{{.State.ExitCode}}' "$CONTAINER_NAME")
  notify "CRITICAL: Container $CONTAINER_NAME is $STATUS (Exit Code: $EXIT_CODE). Attempting restart..."
  # Attempt to restart
  docker start "$CONTAINER_NAME"
else
  # Container is running, check health if a health check is defined
  HEALTH=$(docker inspect --format='{{if .State.Health}}{{.State.Health.Status}}{{else}}no-healthcheck{{end}}' "$CONTAINER_NAME")
  if [ "$HEALTH" = "unhealthy" ]; then
    notify "WARNING: Container $CONTAINER_NAME is unhealthy!"
  fi
fi

# Check resource usage (strip the trailing % so bc can compare the numbers)
CPU_USAGE=$(docker stats "$CONTAINER_NAME" --no-stream --format "{{.CPUPerc}}" | sed 's/%//')
MEM_USAGE=$(docker stats "$CONTAINER_NAME" --no-stream --format "{{.MemPerc}}" | sed 's/%//')
if (( $(echo "$CPU_USAGE > 80" | bc -l) )); then
  notify "WARNING: Container $CONTAINER_NAME CPU usage at $CPU_USAGE%"
fi
if (( $(echo "$MEM_USAGE > 85" | bc -l) )); then
  notify "WARNING: Container $CONTAINER_NAME memory usage at $MEM_USAGE%"
fi
Schedule this script to run via cron:
# Add to crontab to run every 5 minutes
*/5 * * * * /path/to/monitor-containers.sh
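A few container-level practices make all of the above more effective. Define a health check in the Dockerfile so Docker can report whether the application is responding:
FROM nginx:1.23
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost/ || exit 1
Label containers so they can be filtered and grouped:
docker run --name webapp \
--label environment=production \
--label app=frontend \
-d nginx:1.23
Set resource limits:
docker run --name api-service \
--memory=512m \
--memory-swap=1g \
--cpus=0.5 \
-d my-api-image:latest
Configure log rotation:
docker run --name webapp \
--log-driver=json-file \
--log-opt max-size=10m \
--log-opt max-file=3 \
-d nginx:1.23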
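Labels such as environment=production pay off when selecting which containers to inspect. As a sketch (containers_with_label is a hypothetical helper name), the label filter of docker ps can be wrapped like this:

```shell
# Hypothetical helper: prints the names of running containers that carry
# the label given as $1 (either "key" or "key=value").
containers_with_label() {
  docker ps --filter "label=$1" --format '{{.Names}}'
}
```

For example, containers_with_label environment=production lists the production containers, and docker stats --no-stream $(containers_with_label environment=production) shows resource usage for just that group.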
Effective Docker container monitoring requires a multi-layered approach. While Docker's built-in tools provide basic visibility, comprehensive monitoring demands dedicated solutions like Prometheus, Grafana, and the ELK stack.
By implementing the monitoring strategies and commands outlined in this guide, you'll gain much clearer visibility into your containerized applications. This improved observability will help you identify issues before they impact users, optimize resource utilization, and maintain a healthy container environment.
Remember that container monitoring is not a set-and-forget task but an ongoing process that should evolve with your infrastructure. As containerization continues to advance, so too should your monitoring practices.