# K3s Log Collection Best Practices
Configuration recommendations for reliable and efficient K3s log collection.
## Always Add Node Identifiers
Include node context in every log to identify the source edge location:
```yaml
pipeline:
  processors:
    - mapping: |
        root.node_id = env("NODE_ID")
        root.location = env("LOCATION")
        root.cluster = env("CLUSTER_NAME")
```
Why: Essential for filtering logs by location when managing 100+ edge sites.
Set environment variables:
```bash
export NODE_ID="edge-site-42"
export LOCATION="chicago"
export CLUSTER_NAME="k3s-chicago"
```
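If any identifier is missing, logs from that site become hard to attribute. A minimal pre-flight check you could run before starting the pipeline (the `require_env` helper is hypothetical, not part of Expanso):

```shell
# Hypothetical pre-flight check: fail fast if a required identifier is unset.
require_env() {
  for var in "$@"; do
    eval "val=\${$var}"
    if [ -z "$val" ]; then
      echo "ERROR: $var is not set" >&2
      return 1
    fi
  done
}

export NODE_ID="edge-site-42" LOCATION="chicago" CLUSTER_NAME="k3s-chicago"
require_env NODE_ID LOCATION CLUSTER_NAME && echo "node identifiers OK"
```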
## Use Batching for Cloud Destinations
Batch logs before sending to S3, Elasticsearch, or HTTP endpoints:
```yaml
output:
  aws_s3:
    bucket: logs
    batching:
      count: 1000   # Batch size
      period: 1m    # Max wait time
```
Why: Reduces API calls by up to 1000x, lowering costs and improving performance.
Recommended batch sizes:
- S3: 1000-5000 logs or 1-5 minutes
- Elasticsearch: 100-500 logs or 10-30 seconds
- HTTP: 100-1000 logs or 30-60 seconds
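The savings are easy to estimate with back-of-envelope arithmetic (the 100 logs/sec rate below is illustrative, not from the original examples):

```shell
# Requests per day at 100 logs/sec, with and without batching
logs_per_day=$((100 * 86400))           # 8,640,000 logs
unbatched=$logs_per_day                 # one upload request per log
batched=$((logs_per_day / 1000))        # one request per 1000-log batch
echo "unbatched: $unbatched  batched: $batched  reduction: $((unbatched / batched))x"
```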
## Set `restart_on_exit: true`
Always enable auto-restart for the kubectl subprocess:
```yaml
input:
  subprocess:
    name: kubectl
    restart_on_exit: true  # Auto-restart if kubectl exits
```
Why: Ensures logs keep flowing if the kubectl process crashes or exits unexpectedly.
## Handle Large Log Messages
Set maximum buffer size to prevent memory issues:
```yaml
input:
  subprocess:
    name: kubectl
    max_buffer: 1048576  # 1MB max per log line
```
Why: Some applications generate very large log messages (stack traces, JSON payloads). Without a limit, these can cause memory issues.
Recommended sizes:
- Standard logs: 524288 (512KB)
- Large logs: 1048576 (1MB)
- Very large: 2097152 (2MB)
## Configure RBAC Permissions
Create a service account with minimal required permissions:
```bash
# Create service account
kubectl create serviceaccount expanso-logs

# Create role with log read permissions
kubectl create clusterrole log-reader \
  --verb=get,list,watch \
  --resource=pods,pods/log

# Bind role to service account
kubectl create clusterrolebinding expanso-logs \
  --clusterrole=log-reader \
  --serviceaccount=default:expanso-logs
```
Why: Follows principle of least privilege. Expanso only needs read access to logs, not write access to cluster resources.
Use the service account:

```bash
# Confirm the service account can read logs
kubectl auth can-i get pods/log --as=system:serviceaccount:default:expanso-logs

# Tail a pod's logs as the service account (a pod name is required)
kubectl --as=system:serviceaccount:default:expanso-logs logs <pod-name> --follow
```
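The same permissions can also be managed declaratively, which is easier to version-control across many edge sites. A sketch of the equivalent manifests (names and namespace match the commands above):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: expanso-logs
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: log-reader
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: expanso-logs
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: log-reader
subjects:
  - kind: ServiceAccount
    name: expanso-logs
    namespace: default
```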
## Filter Before Sending
Apply filters early in the pipeline to reduce downstream processing:
```yaml
pipeline:
  processors:
    # Parse and filter FIRST
    - mapping: |
        root = this.parse_json().catch(deleted())

    # Only keep errors; everything else is dropped
    - mapping: |
        root = if this.level != "error" { deleted() }

    # Then add metadata (only for filtered logs)
    - mapping: |
        root.node_id = env("NODE_ID")
```
Why: Filtering early reduces CPU, memory, and network usage for logs that will be discarded anyway.
## Monitor Log Pipeline Health
Add a metrics output to track pipeline performance:
```yaml
output:
  broker:
    pattern: fan_out
    outputs:
      - aws_s3:
          bucket: logs
      - http_client:
          url: https://metrics.company.com
          verb: POST
        processors:
          - metric:
              type: counter
              name: logs_processed
              labels:
                node_id: ${NODE_ID}
```
Why: Detect issues like log collection stopping, high error rates, or performance degradation.
## Handle High-Volume Namespaces
For namespaces with very high log volume, use separate pipelines:
```bash
# High-volume namespace: aggressive filtering
expanso-edge run --config production-errors-only.yaml &

# Low-volume namespaces: collect everything
expanso-edge run --config staging-all-logs.yaml &
```
Why: Prevents high-volume namespaces from overwhelming the pipeline or hitting rate limits.
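Inside each config, the kubectl subprocess can be scoped to a single namespace so the pipelines never overlap. A sketch of what `production-errors-only.yaml` might contain (the `app=web` selector and exact flag set are illustrative assumptions):

```yaml
# production-errors-only.yaml (sketch): tail only the production namespace
input:
  subprocess:
    name: kubectl
    args:
      - logs
      - --follow
      - --prefix
      - --namespace=production
      - -l
      - app=web          # hypothetical label selector
    restart_on_exit: true
```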
## Use Connection Pooling for HTTP Outputs
Configure connection pooling for HTTP destinations:
```yaml
output:
  http_client:
    url: https://logs.company.com/ingest
    max_in_flight: 64  # Parallel requests
    batching:
      count: 500
      period: 30s
```
Why: Improves throughput for HTTP-based log ingestion endpoints.
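How much `max_in_flight` buys you depends on request latency. A rough ceiling using the settings above and an assumed 200 ms round trip:

```shell
# Upper bound on logs/sec = in-flight requests * batch size / request latency
max_in_flight=64
batch_count=500
request_ms=200
echo "$(( max_in_flight * batch_count * 1000 / request_ms )) logs/sec ceiling"
```

Real throughput will be lower once endpoint rate limits and retries kick in; treat this as a sizing sketch, not a benchmark.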
## Troubleshooting Tips
**Logs not appearing:**

```bash
# Verify kubectl works
kubectl get pods --all-namespaces

# Check Expanso logs
expanso-edge run --config k3s-logs.yaml --log.level=debug
```
**kubectl process exits:**

- Check `restart_on_exit: true` is set
- Verify kubeconfig is valid
- Check RBAC permissions

**High memory usage:**

- Reduce `max_buffer` size
- Add filtering to reduce log volume
- Increase batching period
**Performance issues:**
- Increase batch sizes
- Add filtering earlier in the pipeline
- Use multiple parallel pipelines for different namespaces
## Next Steps
- Subprocess Input: Full component reference
- Batching Guide: Deep dive into batching strategies
- Error Handling: Handle pipeline failures