Container Analyst Playbook

Externalized detailed command sequences from the agent definition per the few-shot-examples rule (#1

Container Analyst — Detection Playbook

Externalized detailed command sequences from the agent definition per the few-shot-examples rule (#1587, #1600). The agent references this on demand; it is not loaded into the system-prompt budget.

The agent definition keeps the enumerated detection targets, escape-vector criteria, and gates inline. This file carries the exact command sequences per runtime/tool.


1. Container Inventory

Document every container — running, stopped, and exited. Attackers frequently leave behind stopped containers, and image history reveals what was installed.

# All containers including stopped and exited
docker ps -a --format "table {{.ID}}\	{{.Image}}\	{{.Status}}\	{{.Names}}\	{{.CreatedAt}}"

# Container creation timestamps (detect unauthorized container launches)
docker ps -a --format "{{.CreatedAt}}\	{{.Names}}\	{{.Image}}\	{{.Status}}" | sort

# Images present on the host
docker images --format "table {{.Repository}}\	{{.Tag}}\	{{.ID}}\	{{.CreatedAt}}\	{{.Size}}"

# Dangling images (may contain attacker-built images)
docker images --filter "dangling=true"

# Docker volumes
docker volume ls
docker volume inspect $(docker volume ls -q) 2>/dev/null

# Docker networks
docker network ls
docker network inspect $(docker network ls -q) 2>/dev/null

containerd / CRI-O environments: If the host uses containerd or CRI-O directly (common in Kubernetes nodes without Docker), use `crictl` from the CRI-O project. The `crictl` CLI speaks the Container Runtime Interface and works against both runtimes.

# Check which runtime is in use
ctr version 2>/dev/null       # containerd
crictl version 2>/dev/null    # CRI-compatible runtime (containerd or CRI-O)

# List all pods and containers (crictl equivalents of docker ps)
crictl pods
crictl ps -a

# Inspect a specific container (replace <id> with container ID from crictl ps)
crictl inspect <id>

# List images on the node
crictl images

# Inspect an image
crictl inspecti <image-id>

# Pull container logs
crictl logs <container-id>

# Container stats
crictl stats -a

Flag any container that is exited and has a creation time coinciding with the incident window. Exited containers retain their filesystem layers — evidence that would be lost if the container were removed.

2. Privilege Escalation Vector Detection

Container misconfigurations are a primary attack vector. Identify every configuration that weakens container isolation.

# Privileged containers — full host access
docker inspect $(docker ps -aq) 2>/dev/null | \
  python3 -c "import sys,json; data=json.load(sys.stdin); \
  [print(c['Name'], 'PRIVILEGED') for c in data if c['HostConfig']['Privileged']]"

# Alternatively, per-container inspection
for id in $(docker ps -aq); do
  name=$(docker inspect "$id" --format '{{.Name}}')
  priv=$(docker inspect "$id" --format '{{.HostConfig.Privileged}}')
  user=$(docker inspect "$id" --format '{{.Config.User}}')
  echo "$name | Privileged: $priv | User: $user"
done

# Containers with host namespace sharing
docker inspect $(docker ps -aq) --format \
  '{{.Name}} PID:{{.HostConfig.PidMode}} Net:{{.HostConfig.NetworkMode}} IPC:{{.HostConfig.IpcMode}}' \
  2>/dev/null | grep -E "host|pid"

# Containers with host filesystem mounts
docker inspect $(docker ps -aq) --format \
  '{{.Name}} Mounts:{{range .Mounts}}{{.Source}}:{{.Destination}}:{{.RW}} {{end}}' \
  2>/dev/null | grep -v "Mounts: $"

# Dangerous capability additions
docker inspect $(docker ps -aq) --format \
  '{{.Name}} CapAdd:{{.HostConfig.CapAdd}}' 2>/dev/null | grep -v "CapAdd:\[\]"

# SYS_ADMIN is effectively privileged
docker inspect $(docker ps -aq) --format \
  '{{.Name}} CapAdd:{{.HostConfig.CapAdd}}' 2>/dev/null | grep -i "SYS_ADMIN\|SYS_PTRACE\|NET_ADMIN"

A privileged container, a container with `--pid=host`, or a container mounting `/` from the host — any of these is a confirmed container escape vector. Treat as critical.

3. Image Integrity Verification

Verify that running images match their expected digests. Image tampering is a supply chain attack vector.

# Image digests for all local images
docker images --digests --format "table {{.Repository}}\	{{.Tag}}\	{{.Digest}}\	{{.ID}}"

# History of each image (reveals what was installed layer by layer)
for img in $(docker images -q); do
  echo "=== Image: $img ==="
  docker history "$img" --no-trunc
done

# Inspect image labels (may contain provenance information)
docker inspect $(docker images -q) --format \
  '{{.RepoTags}} Labels:{{.Config.Labels}}' 2>/dev/null

# Check for images built locally (no registry digest — higher risk)
docker images --digests | grep "<none>" | grep -v REPOSITORY

# Image build history — look for unusual RUN commands
for img in $(docker images -q); do
  name=$(docker inspect "$img" --format '{{index .RepoTags 0}}')
  unusual=$(docker history "$img" --no-trunc --format "{{.CreatedBy}}" | \
    grep -E "curl|wget|pip install|apt-get.*install" | head -5)
  [ -n "$unusual" ] && echo "=== $name ===" && echo "$unusual"
done

An image without a registry digest was built locally. Local builds without a Dockerfile in version control are suspicious. Layer commands that download scripts from the internet without pinned versions are a supply chain risk.

4. Image Layer Analysis with dive

`dive` provides interactive layer-by-layer inspection of image filesystems, making it easier to spot files added in unexpected layers or layers that attempt to hide evidence by deleting files from prior layers.

# Install dive if not present (forensic read-only — does not modify images)
# https://github.com/wagoodman/dive — verify checksum before use
dive --version 2>/dev/null || echo "dive not installed — install from https://github.com/wagoodman/dive/releases"

# Analyze a specific image non-interactively (CI mode outputs a pass/fail summary)
dive <image-id-or-tag> --ci

# Inspect layer-by-layer file changes for a pulled image
dive docker://<repository>:<tag>

# Identify layers that remove files (potential evidence wiping in supply chain attacks)
# dive --ci exits non-zero if efficiency thresholds fail — useful for scripted checks
for img in $(docker images -q); do
  tag=$(docker inspect "$img" --format '{{index .RepoTags 0}}' 2>/dev/null)
  echo "=== Analyzing: ${tag:-$img} ==="
  dive "$img" --ci --lowestEfficiency=0.9 2>&1 | grep -E "FAIL|efficiency|wasted" || true
done

Look for:

  • Layers that `rm -rf` files immediately after downloading them — an attempt to hide tooling while keeping its effects
  • Layers that install tools (`curl`, `nmap`, `nc`, `socat`) not present in the image's declared purpose
  • Unexpectedly large layers that do not correspond to documented application dependencies
  • Files with world-writable permissions set in a layer after a base image was pulled

5. Volume and Mount Analysis

Volumes persist data outside container lifecycles. Attackers use volumes to store tools, exfiltrate data, or maintain persistence across container restarts.

# All volume mounts across containers
docker inspect $(docker ps -aq) --format \
  '{{.Name}}{{range .Mounts}} | {{.Type}}:{{.Source}}->{{.Destination}} RW:{{.RW}}{{end}}' \
  2>/dev/null

# Sensitive host paths mounted into containers
docker inspect $(docker ps -aq) --format \
  '{{.Name}}{{range .Mounts}}{{if .Source}} {{.Source}}{{end}}{{end}}' \
  2>/dev/null | grep -E "/etc|/root|/home|/var/run/docker.sock|/proc|/sys"

# Docker socket mounted inside containers (critical — allows container escape)
docker inspect $(docker ps -aq) --format \
  '{{.Name}}{{range .Mounts}}{{if eq .Source "/var/run/docker.sock"}} DOCKER-SOCKET-MOUNTED{{end}}{{end}}' \
  2>/dev/null | grep SOCKET

# Named volume contents — examine for attacker tools
for vol in $(docker volume ls -q); do
  echo "=== Volume: $vol ==="
  docker run --rm -v "$vol:/vol" alpine ls -la /vol 2>/dev/null
done

The Docker socket mounted inside a container is a full host escape. Any container with `/var/run/docker.sock` mounted can control the Docker daemon and create privileged containers with host filesystem access.

6. Container Network Analysis

Container networking can be used to pivot between services that are not exposed to the host network.

# All container networks and their connected containers
docker network inspect bridge host none \
  $(docker network ls -q --filter "driver=overlay") 2>/dev/null | \
  python3 -c "import sys,json; \
  data=json.load(sys.stdin); \
  [print(n['Name'], list(n.get('Containers',{}).keys())) for n in data]"

# Exposed and published ports
docker inspect $(docker ps -aq) --format \
  '{{.Name}} Ports:{{.HostConfig.PortBindings}}' 2>/dev/null | grep -v "Ports:map\[\]"

# Container IP addresses
docker inspect $(docker ps -aq) --format \
  '{{.Name}} IP:{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' 2>/dev/null

# Inter-container communication — identify which containers can reach which
docker network inspect $(docker network ls -q) --format \
  '{{.Name}} Containers:{{range .Containers}}{{.Name}}({{.IPv4Address}}) {{end}}' 2>/dev/null

7. eBPF Runtime Monitoring

eBPF-based security tools — Tetragon, Falco, and Tracee — capture runtime behavior inside containers at the kernel level. If any of these tools were active during the incident, their logs are high-fidelity forensic evidence. If they were not active, consider deploying them against still-running suspicious containers.

Falco

Falco monitors system calls and produces alerts for policy violations. Check for existing Falco alert logs first.

# Falco service logs (systemd)
journalctl -u falco --since "2024-01-01" --no-pager | grep -E "Warning|Error|Notice" | tail -50

# Falco log file (common locations)
ls -la /var/log/falco* 2>/dev/null
cat /var/log/falco.log 2>/dev/null | grep -E "shell.*container|write.*bin|network.*unexpected" | tail -30

# If Falco is running, query active rules
falco --list 2>/dev/null | grep -E "shell|escape|privilege|network"

# Trigger a manual check against a running container's syscalls (requires Falco running)
# Falco will automatically alert — review output for violations
docker exec <suspicious-container> sh -c "id; hostname; cat /etc/passwd" 2>/dev/null

# Check Falco rule configuration for gaps (rules not covering lateral movement)
ls /etc/falco/rules.d/ 2>/dev/null
grep -r "shell_in_container\|write_below_root\|outbound_conn" /etc/falco/rules.d/ 2>/dev/null

Tetragon

Tetragon (Cilium's eBPF security enforcement) provides execution and network tracing at the process level.

# Tetragon logs via kubectl (Kubernetes deployment)
kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout 2>/dev/null | \
  python3 -c "
import sys, json
for line in sys.stdin:
  try:
    ev = json.loads(line.strip())
    t = ev.get('process_exec') or ev.get('process_kprobe') or ev.get('process_exit')
    if t:
      proc = t.get('process', {})
      print(proc.get('start_time',''), proc.get('pod',{}).get('name',''), proc.get('binary',''), ' '.join(proc.get('arguments','').split()[:5]))
  except Exception:
    pass
" 2>/dev/null | tail -50

# Tetragon CLI — trace all process executions in a namespace
tetra getevents --namespace default 2>/dev/null | head -30

# Tetragon CLI — filter for specific suspicious binaries
tetra getevents 2>/dev/null | grep -E "bash|sh|wget|curl|nc|ncat|python" | tail -20

# Tetragon TracingPolicy — check what policies are active
kubectl get tracingpolicies -A 2>/dev/null

Tracee

Tracee (Aqua Security) uses eBPF to capture system call events and can detect container escape attempts in real time.

# Tracee logs if deployed as a container
docker logs tracee 2>/dev/null | grep -E "SUSPICIOUS|container_escape|privilege_escalation" | tail -30

# Run Tracee in event-filter mode against a specific container (forensic read — does not modify container)
# Replace <container-id> with the target container
docker run --rm --pid=host --cgroupns=host --privileged \
  -v /etc/os-release:/etc/os-release-host:ro \
  -v /var/run/docker.sock:/var/run/docker.sock \
  aquasec/tracee:latest trace \
  --filter container=<container-id> \
  --filter event=execve,connect,open,write \
  --output json 2>/dev/null | head -100

# Tracee event filtering for escape-relevant syscalls
docker run --rm --privileged --pid=host aquasec/tracee:latest trace \
  --filter event=ptrace,process_vm_writev,memfd_create,mount,unshare \
  --output json 2>/dev/null | head -50

When eBPF tools were active, their logs may be the most reliable record of attacker behavior — they are harder to tamper with from within a container than application logs. Absence of eBPF tooling on a production cluster is itself a gap to document.

8. Kubernetes-Specific Checks

If the host is a Kubernetes node, apply these additional checks.

# Kubernetes version and node info
kubectl version --short 2>/dev/null
kubectl get nodes -o wide 2>/dev/null

# Pods in all namespaces
kubectl get pods --all-namespaces -o wide 2>/dev/null

# Pods with host namespaces or privileged containers
kubectl get pods --all-namespaces -o json 2>/dev/null | \
  python3 -c "
import sys, json
data = json.load(sys.stdin)
for item in data['items']:
  name = item['metadata']['name']
  ns = item['metadata']['namespace']
  spec = item['spec']
  if spec.get('hostPID') or spec.get('hostNetwork') or spec.get('hostIPC'):
    print(f'{ns}/{name}: hostNamespace=True')
  for c in spec.get('containers', []):
    sc = c.get('securityContext', {})
    if sc.get('privileged'):
      print(f'{ns}/{name}/{c[\"name\"]}: privileged=True')
"

# ClusterRoleBindings — check for over-privileged service accounts
kubectl get clusterrolebindings -o wide 2>/dev/null | grep -v "^NAME"

# Secrets accessible to potentially compromised service accounts
kubectl get secrets --all-namespaces 2>/dev/null | head -30

# Recent events (shows pod creations, failures, evictions)
kubectl get events --all-namespaces --sort-by='.lastTimestamp' 2>/dev/null | tail -30

etcd Security Audit

etcd holds the entire cluster state, including Secrets in plaintext (unless encryption-at-rest is configured). Direct access to etcd bypasses Kubernetes RBAC entirely — any attacker who reaches etcd owns the cluster.

# Check if etcd is running and on which address
ps aux | grep etcd | grep -v grep
# Look for --listen-client-urls — should be 127.0.0.1, not 0.0.0.0

# Verify encryption-at-rest is configured (EncryptionConfiguration)
# On control plane nodes, check the API server manifest for --encryption-provider-config
cat /etc/kubernetes/manifests/kube-apiserver.yaml 2>/dev/null | \
  grep -E "encryption-provider-config|EncryptionConfig"

# Check if etcd requires client certificate authentication
# (--client-cert-auth=true should be set)
ps aux | grep etcd | grep -v grep | grep -o "\-\-client-cert-auth=[^ ]*"

# Enumerate etcd client certificates — identify any unauthorized certs
ls -la /etc/kubernetes/pki/etcd/ 2>/dev/null

# Verify TLS peer communication (peer-cert-file and peer-key-file set)
ps aux | grep etcd | grep -v grep | grep -o "\-\-peer-cert-file=[^ ]*"

# Take a read-only snapshot for offline analysis (requires etcdctl and valid certs)
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /tmp/etcd-snapshot-$(date +%Y%m%d%H%M%S).db 2>/dev/null

# Examine snapshot for sensitive key paths (does not decrypt — shows key names only)
# Useful for detecting attacker-written keys or confirming secret enumeration
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get / --prefix --keys-only 2>/dev/null | grep -E "secrets|serviceaccount|token" | head -30

Encryption-at-rest absent combined with etcd exposure on a non-loopback address is a critical finding. Snapshot all etcd data for offline analysis before any remediation steps.

K8s API Server Audit Log Analysis

The API server audit log is the authoritative record of all cluster operations. It captures who made each request, what resource they accessed, and whether it was allowed. Audit logs are the cluster equivalent of host authentication logs.

# Locate the audit log — typically configured via --audit-log-path in the API server manifest
cat /etc/kubernetes/manifests/kube-apiserver.yaml 2>/dev/null | grep "audit-log-path"

# Common audit log location (varies by distribution)
ls -lh /var/log/kubernetes/audit/ 2>/dev/null
ls -lh /var/log/audit/ 2>/dev/null

# Parse JSON audit log — summarize by user and verb
cat /var/log/kubernetes/audit/audit.log 2>/dev/null | \
  python3 -c "
import sys, json
from collections import Counter
counts = Counter()
for line in sys.stdin:
  try:
    ev = json.loads(line.strip())
    user = ev.get('user', {}).get('username', 'unknown')
    verb = ev.get('verb', '')
    res = ev.get('objectRef', {}).get('resource', '')
    counts[(user, verb, res)] += 1
  except Exception:
    pass
for (u, v, r), c in counts.most_common(30):
  print(f'{c:5d}  {u:<40} {v:<10} {r}')
" 2>/dev/null

# Detect anonymous or unauthenticated API calls
grep '"username":"system:anonymous"' /var/log/kubernetes/audit/audit.log 2>/dev/null | \
  python3 -c "import sys,json; [print(json.loads(l).get('requestURI','')) for l in sys.stdin]" | \
  sort | uniq -c | sort -rn | head -20

# ServiceAccount token abuse — calls from service accounts outside expected namespaces
cat /var/log/kubernetes/audit/audit.log 2>/dev/null | \
  python3 -c "
import sys, json
for line in sys.stdin:
  try:
    ev = json.loads(line.strip())
    user = ev.get('user', {}).get('username', '')
    if 'system:serviceaccount' in user and ev.get('verb') in ('get','list','create','delete','patch'):
      ns = ev.get('objectRef', {}).get('namespace', '')
      sa_ns = user.split(':')[2] if ':' in user else ''
      if ns and sa_ns and ns != sa_ns and ns not in ('','kube-system'):
        print(f'{user} -> {ev[\"verb\"]} {ns}/{ev.get(\"objectRef\",{}).get(\"resource\",\"\")}')
  except Exception:
    pass
" 2>/dev/null | sort | uniq -c | sort -rn | head -20

# Detect secrets enumeration — list or get on secrets resources
grep '"resource":"secrets"' /var/log/kubernetes/audit/audit.log 2>/dev/null | \
  python3 -c "
import sys, json
for line in sys.stdin:
  try:
    ev = json.loads(line.strip())
    print(ev.get('requestReceivedTimestamp',''), ev.get('user',{}).get('username',''), ev.get('verb',''), ev.get('objectRef',{}).get('name',''))
  except Exception:
    pass
" 2>/dev/null | head -30

# Exec into pods — T1609 Container Administration Command
grep '"subresource":"exec"' /var/log/kubernetes/audit/audit.log 2>/dev/null | \
  python3 -c "
import sys, json
for line in sys.stdin:
  try:
    ev = json.loads(line.strip())
    print(ev.get('requestReceivedTimestamp',''), ev.get('user',{}).get('username',''), ev.get('objectRef',{}).get('namespace',''), ev.get('objectRef',{}).get('name',''))
  except Exception:
    pass
" 2>/dev/null | head -20

Patterns to flag as suspicious:

  • Requests from `system:anonymous` to anything beyond unauthenticated discovery endpoints
  • ServiceAccount tokens used from outside their home namespace
  • Bulk `list` or `get` on `secrets` resources — indicates credential harvesting
  • `exec` subresource calls from non-operator users during the incident window
  • Rapid sequences of `create` and `delete` on the same resource — attacker covering tracks