
Parse K3s Log Metadata

Extract structured metadata from kubectl log prefixes to enable filtering and searching by namespace, pod, and container.

Pipeline

input:
  subprocess:
    name: kubectl
    args:
      - logs
      - --all-containers=true
      - --prefix=true
      - --follow
      - --all-namespaces
    codec: lines
    restart_on_exit: true

pipeline:
  processors:
    # Parse kubectl log prefix: [namespace/pod/container] message
    - mapping: |
        # Each line arrives as raw text, not JSON, so read it with content()
        let line = content().string()
        root.raw_log = $line
        root.timestamp = now()

        # Extract metadata from prefix.
        # re_find_all_submatch returns one array per match: index 0 is the
        # full match, indexes 1-4 are the capture groups.
        let parts = $line.re_find_all_submatch("^\\[([^/]+)/([^/]+)/([^\\]]+)\\] (.*)$")
        root.namespace = $parts.0.1
        root.pod = $parts.0.2
        root.container = $parts.0.3
        root.message = $parts.0.4

        # Add context
        root.node_id = env("NODE_ID")
        root.location = env("LOCATION")
        root.cluster = env("CLUSTER_NAME")

output:
  aws_s3:
    bucket: edge-k3s-logs
    path: 'logs/${! env("NODE_ID") }/${! now().ts_format("2006-01-02") }/${! json("namespace") }.jsonl'
    batching:
      count: 1000
      period: 1m

What This Does

  • Parses kubectl prefix: Extracts namespace, pod, and container from [namespace/pod/container] format
  • Separates message: Stores the log message separately from the metadata and keeps the original line in raw_log
  • Adds location context: Includes node ID, location, and cluster name
  • Organizes by namespace: The S3 key is logs/<node_id>/<date>/<namespace>.jsonl, so each namespace's logs can be fetched per node and day

Example Output

Input (kubectl log line):

[production/web-app-7d8f9c/app] Request processed in 45ms

Output (structured JSON):

{
  "raw_log": "[production/web-app-7d8f9c/app] Request processed in 45ms",
  "namespace": "production",
  "pod": "web-app-7d8f9c",
  "container": "app",
  "message": "Request processed in 45ms",
  "node_id": "edge-site-42",
  "location": "chicago",
  "cluster": "k3s-chicago",
  "timestamp": "2024-11-09T10:30:45Z"
}

Regex Breakdown

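^\\[([^/]+)/([^/]+)/([^\\]]+)\\] (.*)$

The doubled backslashes are string escaping inside the Bloblang double-quoted literal; the regex engine itself sees \[ and \].
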
  • ^\\[ - Match opening bracket at start
  • ([^/]+) - Capture namespace (everything before first /)
  • / - Match separator
  • ([^/]+) - Capture pod name (everything before second /)
  • / - Match separator
  • ([^\\]]+) - Capture container name (everything before ])
  • \\] followed by a space - Match closing bracket and the single space that separates the prefix from the message
  • (.*)$ - Capture message (everything after bracket and space)
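
The breakdown above can be checked outside the pipeline with a small script. This is a minimal sketch in Python rather than Bloblang, and the function name parse_prefixed_line is hypothetical; it only assumes the same pattern and the sample line from Example Output.

```python
import re

# Same pattern as the pipeline, written as a Python raw string,
# so the backslashes are single instead of doubled.
PREFIX_RE = re.compile(r"^\[([^/]+)/([^/]+)/([^\]]+)\] (.*)$")


def parse_prefixed_line(line):
    """Split '[namespace/pod/container] message' into fields.

    Returns None when the line does not start with the expected prefix.
    (Hypothetical helper for illustration, not part of the pipeline.)
    """
    match = PREFIX_RE.match(line)
    if match is None:
        return None
    namespace, pod, container, message = match.groups()
    return {
        "namespace": namespace,
        "pod": pod,
        "container": container,
        "message": message,
    }


if __name__ == "__main__":
    print(parse_prefixed_line(
        "[production/web-app-7d8f9c/app] Request processed in 45ms"
    ))
```

Lines without a matching prefix return None here, which is a convenient place to decide how unparsed lines should be handled before relying on the namespace field in the S3 path.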

Use Cases

Search by namespace: Fetch the production.jsonl objects under each node and date prefix to get all logs from the production namespace

Filter by pod: Find all logs from a specific pod across time

Container-level debugging: Isolate logs from sidecar containers

Multi-cluster aggregation: Compare logs from the same namespace across different edge locations
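
As a rough illustration of these use cases, the sketch below builds the object key that the output path produces and filters downloaded records by field. It assumes each object holds one JSON record per line, as the .jsonl extension suggests; the helper names object_key and filter_records are hypothetical, and fetching the object from S3 is left out.

```python
import json
from datetime import date


def object_key(node_id, day, namespace):
    """Build the key layout used by the output path:
    logs/<node_id>/<YYYY-MM-DD>/<namespace>.jsonl
    """
    return f"logs/{node_id}/{day.isoformat()}/{namespace}.jsonl"


def filter_records(jsonl_text, **wanted):
    """Yield records from JSON Lines text whose fields equal every
    keyword argument, e.g. pod="web-app-7d8f9c", container="app".
    """
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if all(record.get(key) == value for key, value in wanted.items()):
            yield record


if __name__ == "__main__":
    print(object_key("edge-site-42", date(2024, 11, 9), "production"))
    sample = "\n".join([
        json.dumps({"pod": "web-app-7d8f9c", "container": "app", "message": "ok"}),
        json.dumps({"pod": "web-app-7d8f9c", "container": "proxy", "message": "up"}),
    ])
    for rec in filter_records(sample, container="proxy"):
        print(rec["message"])
```

Because the namespace is the last segment of the key, a namespace search means checking one object per node and day rather than a single prefix listing.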

Next Steps