Collect OpenShift Logs

Stream logs from all pods and namespaces in your single-node OpenShift (SNO) cluster to S3, enriched with structured metadata.

Pipeline

input:
  subprocess:
    name: oc
    args:
      - logs
      - --all-containers=true
      - --prefix=true
      - --follow
      - --all-namespaces
      - --since=10m
    codec: lines
    restart_on_exit: true

pipeline:
  processors:
    # Parse oc log prefix: [namespace/pod/container] message
    - mapping: |
        # Each message is a raw text line (not JSON), so read it with content()
        let line = content().string()
        root.raw_log = $line
        root.timestamp = now()

        # Extract metadata from prefix (the submatch variant returns capture groups)
        let parts = $line.re_find_all_submatch("^\\[([^/]+)/([^/]+)/([^\\]]+)\\] (.*)$")
        root.namespace = $parts.0.1
        root.pod = $parts.0.2
        root.container = $parts.0.3
        root.message = $parts.0.4

        # Add SNO cluster context
        root.node_name = env("NODE_NAME")
        root.cluster_name = env("CLUSTER_NAME")
        root.location = env("LOCATION")
        root.deployment_type = "single-node-openshift"

output:
  aws_s3:
    bucket: edge-openshift-logs
    path: 'sno/${! env("CLUSTER_NAME") }/${! now().ts_format("2006-01-02") }/${! json("namespace") }.jsonl'
    batching:
      count: 1000
      period: 5m
      processors:
        # Join the batch with newlines so the object is valid JSON Lines
        - archive:
            format: lines

What This Does

  • Follows logs: Streams from all containers in all namespaces using oc logs --follow
  • Parses metadata: Extracts namespace, pod, and container from the log prefix
  • Adds SNO context: Includes node name, cluster identifier, and physical location
  • Batches logs: Writes a batch once 1000 log lines accumulate or 5 minutes elapse, whichever comes first
  • Organizes by namespace: S3 path includes namespace for easy filtering
  • Auto-restarts: If the oc process exits, it is restarted automatically
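
The "Organizes by namespace" step depends on each written object holding lines from a single namespace, and on the path still being able to read the namespace once the batch has been joined into one payload. The following is a hedged sketch of one way to make that explicit. It assumes Benthos-style group_by_value and archive processors are available, and that an archived batch keeps the metadata of its first message; neither is confirmed by this page, so check both against your Expanso version. The uuid_v4() suffix is an illustrative addition, not part of the original path; it keeps later batches from overwriting earlier objects written to the same key.

```yaml
pipeline:
  processors:
    # Runs after the parsing mapping, once the message is structured JSON
    - mapping: |
        root = this
        # Copy the namespace into metadata so it stays readable after archiving
        meta namespace = this.namespace.or("unknown")

output:
  aws_s3:
    bucket: edge-openshift-logs
    # uuid_v4() suffix is an assumption added here to avoid key collisions
    path: 'sno/${! env("CLUSTER_NAME") }/${! now().ts_format("2006-01-02") }/${! meta("namespace") }-${! uuid_v4() }.jsonl'
    batching:
      count: 1000
      period: 5m
      processors:
        # Split the flushed batch into one batch per namespace
        - group_by_value:
            value: '${! meta("namespace") }'
        # Join each group into newline-delimited JSON
        - archive:
            format: lines
```

With this layout, filtering by namespace becomes a prefix listing on sno/<cluster>/<date>/<namespace>-.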

Example Output

Input (oc log line):

[production/web-app-7d8f9c/app] Request processed in 45ms

Output (structured JSON):

{
  "raw_log": "[production/web-app-7d8f9c/app] Request processed in 45ms",
  "namespace": "production",
  "pod": "web-app-7d8f9c",
  "container": "app",
  "message": "Request processed in 45ms",
  "node_name": "sno-retail-001",
  "cluster_name": "sno-retail-001",
  "location": "store-chicago-north",
  "deployment_type": "single-node-openshift",
  "timestamp": "2024-11-12T10:30:45Z"
}
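
The example above covers a line that carries the expected prefix. Lines that do not match the pattern would leave the metadata fields unset or fail the mapping. The following is a hedged sketch of one way to guard against that. It assumes re_find_all_submatch returns an empty array when nothing matches; the "unknown" placeholder is a hypothetical choice, not something this page prescribes.

```yaml
pipeline:
  processors:
    - mapping: |
        let line = content().string()
        let found = $line.re_find_all_submatch("^\\[([^/]+)/([^/]+)/([^\\]]+)\\] (.*)$")

        # Use the first match, or placeholders plus the whole line as the message
        let m = if $found.length() > 0 { $found.index(0) } else { ["", "unknown", "unknown", "unknown", $line] }

        root.raw_log = $line
        root.namespace = $m.index(1)
        root.pod = $m.index(2)
        root.container = $m.index(3)
        root.message = $m.index(4)
```

Keeping unmatched lines under a placeholder namespace, rather than dropping them, means nothing is lost silently and malformed input is easy to find later.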

Key Arguments

--all-containers=true: Includes logs from all containers in each pod

--prefix=true: Adds [namespace/pod/container] prefix for parsing

--follow: Continuously streams new logs (like tail -f)

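--since=10m: Collects only logs from the last 10 minutes at startup (reduces initial load). Because restart_on_exit re-runs the same command, each restart replays that 10-minute window, so expect some duplicate lines after a restart.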

Environment Variables

Set these where Expanso runs:

  • NODE_NAME: Name of the node. Kubernetes does not set this automatically; inject it from the pod's spec.nodeName field
  • CLUSTER_NAME: Unique SNO cluster identifier
  • LOCATION: Physical location identifier
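
If Expanso runs as a pod, these variables can be supplied in the container spec. The following is a minimal sketch that uses the Kubernetes Downward API for the node name. The CLUSTER_NAME and LOCATION values are the illustrative ones from the example output, not required names.

```yaml
# Fragment of a pod or Deployment container spec (container name and image omitted)
env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName   # Downward API: node the pod is scheduled on
  - name: CLUSTER_NAME
    value: sno-retail-001          # illustrative value
  - name: LOCATION
    value: store-chicago-north     # illustrative value
```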

Next Steps