Parse K3s Log Metadata
Extract structured metadata from kubectl log prefixes to enable filtering and searching by namespace, pod, and container.
Pipeline
input:
  subprocess:
    name: kubectl
    args:
      - logs
      - --all-containers=true
      - --prefix=true
      - --follow
      - --all-namespaces
    codec: lines
    restart_on_exit: true

pipeline:
  processors:
    # Parse kubectl log prefix: [namespace/pod/container] message
    - mapping: |
        # Log lines are plain text, not JSON, so read the raw content
        let line = content().string()
        root.raw_log = $line
        root.timestamp = now()

        # Extract metadata from prefix.
        # re_find_all_submatch returns one array per match: index 0 is the
        # full match, indexes 1-4 are the capture groups.
        let parts = $line.re_find_all_submatch("^\\[([^/]+)/([^/]+)/([^\\]]+)\\] (.*)$")
        root.namespace = $parts.0.1
        root.pod = $parts.0.2
        root.container = $parts.0.3
        root.message = $parts.0.4

        # Add context
        root.node_id = env("NODE_ID")
        root.location = env("LOCATION")
        root.cluster = env("CLUSTER_NAME")

output:
  aws_s3:
    bucket: edge-k3s-logs
    # timestamp_unix() takes no layout argument, so format now() instead
    path: 'logs/${! env("NODE_ID") }/${! now().ts_format("2006-01-02") }/${! json("namespace") }.jsonl'
    batching:
      count: 1000
      period: 1m
What This Does
- Parses kubectl prefix: Extracts namespace, pod, and container from the [namespace/pod/container] format
- Separates message: Stores the actual log message separately from metadata
- Adds location context: Includes node ID, location, and cluster name
- Organizes by namespace: S3 path includes namespace for easy filtering
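The key layout described above can be sketched in Python to see where a given record would land. This is only an illustration, not part of the pipeline: the helper names `s3_key` and `namespace_of` are hypothetical, and it assumes the date segment renders as YYYY-MM-DD in UTC, as the 2006-01-02 layout in the path suggests.

```python
from datetime import datetime, timezone


def s3_key(node_id: str, namespace: str, when: datetime) -> str:
    """Build an object key shaped like the pipeline's `path` template.

    Assumption: the date segment is the UTC calendar day.
    """
    day = when.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"logs/{node_id}/{day}/{namespace}.jsonl"


def namespace_of(key: str) -> str:
    """Recover the namespace from a key produced by s3_key."""
    filename = key.rsplit("/", 1)[-1]
    suffix = ".jsonl"
    return filename[: -len(suffix)] if filename.endswith(suffix) else filename


key = s3_key(
    "edge-site-42",
    "production",
    datetime(2024, 11, 9, 10, 30, 45, tzinfo=timezone.utc),
)
print(key)                # logs/edge-site-42/2024-11-09/production.jsonl
print(namespace_of(key))  # production
```

Because the namespace is the last path segment, listing one node's objects for a day and matching on the file name is enough to narrow a search to a single namespace.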
Example Output
Input (kubectl log line):
[production/web-app-7d8f9c/app] Request processed in 45ms
Output (structured JSON):
{
  "raw_log": "[production/web-app-7d8f9c/app] Request processed in 45ms",
  "namespace": "production",
  "pod": "web-app-7d8f9c",
  "container": "app",
  "message": "Request processed in 45ms",
  "node_id": "edge-site-42",
  "location": "chicago",
  "cluster": "k3s-chicago",
  "timestamp": "2024-11-09T10:30:45Z"
}
Regex Breakdown
^\\[([^/]+)/([^/]+)/([^\\]]+)\\] (.*)$
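- ^\\[ : Match opening bracket at start
- ([^/]+) : Capture namespace (everything before first /)
- / : Match separator
- ([^/]+) : Capture pod name (everything before second /)
- / : Match separator
- ([^\\]]+) : Capture container name (everything before ])
- \\] : Match closing bracket
- (a single space) : Match the space between the prefix and the message
- (.*)$ : Capture message (everything after bracket and space)
The doubled backslashes are string escapes in the mapping; the regex engine itself sees \[ and \].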
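To check the pattern against sample lines before deploying, the same regex can be tried in Python. This is a minimal sketch: `parse_line` is a hypothetical helper, and the pattern is written as a raw string with single backslashes because no mapping-level string escaping is involved here.

```python
import re

# Same pattern as the pipeline, as a Python raw string.
PREFIX_RE = re.compile(r"^\[([^/]+)/([^/]+)/([^\]]+)\] (.*)$")


def parse_line(line: str):
    """Return (namespace, pod, container, message), or None if the prefix is absent."""
    match = PREFIX_RE.match(line)
    return match.groups() if match else None


print(parse_line("[production/web-app-7d8f9c/app] Request processed in 45ms"))
# ('production', 'web-app-7d8f9c', 'app', 'Request processed in 45ms')

# Brackets inside the message are kept, since only the first "] " ends the prefix.
print(parse_line("[a/b/c] got [x] done"))
# ('a', 'b', 'c', 'got [x] done')

print(parse_line("no prefix here"))
# None
```

Lines without the expected prefix do not match at all, so a production mapping may want to decide how such lines are handled.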
Use Cases
- Search by namespace: Query S3 for all logs from the production namespace
- Filter by pod: Find all logs from a specific pod across time
- Container-level debugging: Isolate logs from sidecar containers
- Multi-cluster aggregation: Compare logs from the same namespace across different edge locations
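Once a .jsonl object has been downloaded, these use cases come down to filtering on the structured fields. The following is a rough sketch under stated assumptions: `filter_records` is a hypothetical helper, the sample records are made up for illustration, and `istio-proxy` stands in for any sidecar container name.

```python
import json


def filter_records(jsonl_text: str, **criteria: str):
    """Yield parsed records whose fields equal every given criterion."""
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if all(record.get(field) == value for field, value in criteria.items()):
            yield record


# Hypothetical sample data in the shape the pipeline emits.
sample = "\n".join([
    json.dumps({"namespace": "production", "pod": "web-app-7d8f9c",
                "container": "app", "message": "Request processed in 45ms"}),
    json.dumps({"namespace": "production", "pod": "web-app-7d8f9c",
                "container": "istio-proxy", "message": "upstream connect"}),
])

# Container-level debugging: isolate the sidecar's lines.
sidecar = list(filter_records(sample, container="istio-proxy"))
print(len(sidecar))  # 1

# Filter by pod: every record from one pod, regardless of container.
by_pod = list(filter_records(sample, pod="web-app-7d8f9c"))
print(len(by_pod))  # 2
```

Criteria are combined with a logical AND, so passing both `namespace` and `container` narrows the result to one container within one namespace.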
Next Steps
- Multiple Destinations: Send parsed logs to Elasticsearch for real-time search
- Filter by Log Level: Combine with log level filtering
- Best Practices: Learn about efficient log handling