OpenShift Single-Node at the Edge
Monitor and manage Single-Node OpenShift (SNO) deployments at edge locations with Expanso. Deploy Expanso directly on the OpenShift node to collect logs, monitor cluster health, and automate operations—all without requiring external infrastructure.
What is Single-Node OpenShift?
Single-Node OpenShift (SNO) is Red Hat's solution for running OpenShift in constrained edge environments where both control plane and worker capabilities run on a single physical or virtual machine.
Ideal for edge scenarios:
- Confined physical spaces (retail stores, factories, remote sites)
- Intermittent network connectivity to central data centers
- Resource-constrained environments
- Locations requiring zero-touch operations
OpenShift SNO minimum requirements:
- vCPU: 8
- RAM: 16 GB
- Storage: 120 GB
Expanso Edge runs as a lightweight container on your SNO node, requiring only 0.5 CPU, 64MB RAM, and 150MB disk—a tiny fraction of the node's total resources.
Why Use Expanso with Single-Node OpenShift?
Challenge: SNO deployments at edge locations need monitoring and log collection, but network connectivity may be intermittent.
Solution: Deploy Expanso on the SNO node itself to collect logs and metrics locally, then batch and send to central storage when connectivity is available.
Benefits:
- Minimal footprint: Uses only a small fraction of the SNO node's CPU, memory, and storage
- Offline capable: Queues data when network is down
- Automatic batching: Optimizes network usage
- No external dependencies: Self-contained operation
- Local deployment: Runs directly on the OpenShift node
Deploy Expanso on Single-Node OpenShift
Deploy the Expanso Edge agent as a DaemonSet on your SNO cluster:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: expanso-edge
namespace: expanso-system
spec:
selector:
matchLabels:
app: expanso-edge
template:
metadata:
labels:
app: expanso-edge
spec:
serviceAccountName: expanso-edge
hostNetwork: true
containers:
- name: expanso-edge
image: ghcr.io/expanso-io/expanso-edge:nightly
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: CLUSTER_NAME
value: "sno-retail-001"
- name: LOCATION
value: "store-chicago-north"
volumeMounts:
- name: config
mountPath: /etc/expanso/pipeline.yaml
subPath: pipeline.yaml
- name: kubeconfig
mountPath: /root/.kube/config
subPath: config
volumes:
- name: config
configMap:
name: expanso-pipeline
- name: kubeconfig
secret:
secretName: expanso-kubeconfig
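Create the target namespace (project), apply the manifest, and confirm the agent pod is running. The service account, pipeline ConfigMap, and kubeconfig Secret referenced above must exist first (see Service Account Setup below); the manifest filename here is illustrative:
oc new-project expanso-system
oc apply -f expanso-daemonset.yaml
oc -n expanso-system get pods -l app=expanso-edge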
Collect OpenShift Logs
Stream logs from all pods in the SNO cluster to S3:
input:
subprocess:
name: oc
args:
- logs
- --all-containers=true
- --prefix=true
- --follow
- --all-namespaces
- --since=10m
codec: lines
restart_on_exit: true
pipeline:
processors:
# Parse oc log prefix: [namespace/pod/container] message
- mapping: |
root.raw_log = content().string()
root.timestamp = now()
# Extract metadata from prefix
let parts = content().string().re_find_all_submatch("^\\[([^/]+)/([^/]+)/([^\\]]+)\\] (.*)$")
root.namespace = $parts.0.1.or("unknown")
root.pod = $parts.0.2.or("unknown")
root.container = $parts.0.3.or("unknown")
root.message = $parts.0.4.or(root.raw_log)
# Add SNO cluster context
root.node_name = env("NODE_NAME")
root.cluster_name = env("CLUSTER_NAME")
root.location = env("LOCATION")
root.deployment_type = "single-node-openshift"
output:
aws_s3:
bucket: edge-openshift-logs
path: 'sno/${! env("CLUSTER_NAME") }/${! now().ts_format("2006-01-02") }/${! json("namespace") }.jsonl'
batching:
count: 1000
period: 5m
processors:
- archive:
format: concatenate
What this does:
- Follows logs from all pods and containers
- Parses namespace, pod, container metadata
- Adds SNO-specific context (node, cluster, location)
- Batches logs to minimize network usage
- Writes to S3 organized by cluster and date
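For example, given the prefix format noted in the mapping above, a log line such as [openshift-monitoring/prometheus-k8s-0/prometheus] Server is ready would produce a record roughly like this (node name and timestamps are illustrative):
{
  "raw_log": "[openshift-monitoring/prometheus-k8s-0/prometheus] Server is ready",
  "timestamp": "2024-05-01T12:00:01Z",
  "namespace": "openshift-monitoring",
  "pod": "prometheus-k8s-0",
  "container": "prometheus",
  "message": "Server is ready",
  "node_name": "sno-node-01",
  "cluster_name": "sno-retail-001",
  "location": "store-chicago-north",
  "deployment_type": "single-node-openshift"
}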
Monitor Cluster Health
Check SNO cluster health and send metrics to central monitoring:
input:
generate:
interval: 60s
mapping: |
root = {}
# Stash context in metadata so it survives the command processors below
meta check_time = now()
meta cluster = env("CLUSTER_NAME")
pipeline:
processors:
# Check node status
- command:
name: oc
args_mapping: '["get", "nodes", "-o", "json"]'
- mapping: |
# Store results in metadata so later command processors don't overwrite them
let nodes = content().parse_json().items
meta node_ready = $nodes.all(n ->
n.status.conditions.any(c -> c.type == "Ready" && c.status == "True")
)
meta node_name = $nodes.index(0).metadata.name
# Check cluster operators
- command:
name: oc
args_mapping: '["get", "clusteroperators", "-o", "json"]'
- mapping: |
let operators = content().parse_json().items
let degraded = $operators.filter(op ->
op.status.conditions.any(c -> c.type == "Degraded" && c.status == "True")
).map_each(op -> op.metadata.name)
meta degraded_operators = $degraded
meta all_operators_healthy = $degraded.length() == 0
# Check pod status across namespaces
- command:
name: oc
args_mapping: '["get", "pods", "--all-namespaces", "-o", "json"]'
- mapping: |
let pods = content().parse_json().items
meta total_pods = $pods.length()
meta running_pods = $pods.filter(p -> p.status.phase == "Running").length()
# CrashLoopBackOff is a container state rather than a pod phase, so check both
meta failed_pods = $pods.filter(p ->
p.status.phase == "Failed" ||
p.status.containerStatuses.or([]).any(cs -> cs.state.waiting.reason.or("") == "CrashLoopBackOff")
).map_each(p -> {
"namespace": p.metadata.namespace,
"name": p.metadata.name,
"phase": p.status.phase
})
# Aggregate health status
- mapping: |
root.health_report = {
"cluster": @cluster,
"location": env("LOCATION"),
"timestamp": @check_time,
"node_ready": @node_ready,
"operators_healthy": @all_operators_healthy,
"degraded_operators": @degraded_operators,
"total_pods": @total_pods,
"running_pods": @running_pods,
"failed_pods": @failed_pods,
"cluster_healthy": @node_ready && @all_operators_healthy && @failed_pods.length() == 0
}
output:
switch:
cases:
# Alert if unhealthy
- check: '!this.health_report.cluster_healthy'
output:
broker:
pattern: fan_out
outputs:
# Send alert
- http_client:
url: https://alerts.company.com/sno-health
verb: POST
headers:
Content-Type: application/json
# Log alert
- aws_s3:
bucket: sno-health-alerts
path: 'alerts/${! env("CLUSTER_NAME") }/${! timestamp_unix() }.json'
# Normal health metrics
- output:
http_client:
url: https://metrics.company.com/sno-health
verb: POST
batching:
count: 10
period: 5m
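When the cluster is healthy, the report posted to the metrics endpoint looks roughly like this (counts and timestamps are illustrative):
{
  "health_report": {
    "cluster": "sno-retail-001",
    "location": "store-chicago-north",
    "timestamp": "2024-05-01T12:00:00Z",
    "node_ready": true,
    "operators_healthy": true,
    "degraded_operators": [],
    "total_pods": 87,
    "running_pods": 85,
    "failed_pods": [],
    "cluster_healthy": true
  }
}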
Monitor Resource Usage
Track CPU, memory, and storage on the SNO node:
input:
generate:
interval: 60s
mapping: 'root = {}'
pipeline:
processors:
# Get node resource usage
- command:
name: oc
args_mapping: '["adm", "top", "node", "--no-headers"]'
- mapping: |
# Parse: node-name CPU(cores) CPU% MEMORY(bytes) MEMORY%
# Store in metadata so the values survive the next command processor
let parts = content().string().re_find_all("\\S+")
meta node_name = $parts.0
meta cpu_cores = $parts.1
meta cpu_percent = $parts.2.trim("%").number()
meta memory_bytes = $parts.3
meta memory_percent = $parts.4.trim("%").number()
meta cluster = env("CLUSTER_NAME")
meta timestamp = now()
# Get pod resource usage
- command:
name: oc
args_mapping: '["adm", "top", "pods", "--all-namespaces", "--no-headers"]'
- mapping: |
# Parse pod metrics: NAMESPACE POD CPU(cores) MEMORY(bytes)
root.pod_metrics = content().string().split("\n").filter(l -> l.trim() != "").map_each(
line -> line.re_find_all("\\S+")
).map_each(parts -> {
"namespace": parts.index(0),
"pod": parts.index(1),
"cpu": parts.index(2),
"memory": parts.index(3)
})
# Count pods per namespace
let namespaces = root.pod_metrics.map_each(p -> p.namespace)
root.namespace_usage = $namespaces.unique().map_each(ns -> {
"namespace": ns,
"pod_count": $namespaces.filter(n -> n == ns).length()
})
# Check for resource pressure
- mapping: |
# Merge node metrics (stored in metadata) back in and flag resource pressure
root = this
root.cluster = @cluster
root.timestamp = @timestamp
root.node_cpu_percent = @cpu_percent
root.node_memory_percent = @memory_percent
root.resource_alert = {
"high_cpu": @cpu_percent > 80,
"high_memory": @memory_percent > 85,
"cluster": @cluster,
"timestamp": @timestamp
}
output:
broker:
pattern: fan_out
outputs:
# Send metrics
- http_client:
url: https://metrics.company.com/sno-resources
verb: POST
batching:
count: 20
period: 5m
# Alert on high resource usage
- switch:
cases:
- check: 'this.resource_alert.high_cpu || this.resource_alert.high_memory'
output:
http_client:
url: https://alerts.company.com/sno-resources
verb: POST
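Each interval produces a metrics document roughly like the following; values are illustrative and the pod list is truncated:
{
  "cluster": "sno-retail-001",
  "timestamp": "2024-05-01T12:00:00Z",
  "node_cpu_percent": 42.0,
  "node_memory_percent": 61.0,
  "pod_metrics": [
    { "namespace": "production", "pod": "point-of-sale-7d9f", "cpu": "120m", "memory": "256Mi" }
  ],
  "namespace_usage": [
    { "namespace": "production", "pod_count": 1 }
  ],
  "resource_alert": {
    "high_cpu": false,
    "high_memory": false,
    "cluster": "sno-retail-001",
    "timestamp": "2024-05-01T12:00:00Z"
  }
}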
Collect Specific Application Logs
Focus on logs from specific namespaces (e.g., production apps):
input:
subprocess:
name: oc
args:
- logs
- --namespace=production
- --all-containers=true
- --prefix=true
- --follow
- --selector=app=point-of-sale
codec: lines
restart_on_exit: true
pipeline:
processors:
- mapping: |
# Parse and structure logs
root = content().parse_json().catch({
"message": content().string(),
"level": "info"
})
root.cluster = env("CLUSTER_NAME")
root.location = env("LOCATION")
root.app = "point-of-sale"
root.timestamp = now()
output:
broker:
pattern: fan_out
outputs:
# Real-time to Elasticsearch
- elasticsearch_v2:
urls: ['https://elasticsearch.company.com:9200']
index: 'sno-pos-logs-${! now().ts_format("2006-01-02") }'
batching:
count: 100
period: 10s
# Archive to S3
- aws_s3:
bucket: sno-app-logs
path: 'pos/${! env("CLUSTER_NAME") }/${! timestamp_unix() }.jsonl'
batching:
count: 5000
period: 10m
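Before deploying, confirm that the label selector matches the intended pods:
oc get pods --namespace=production --selector=app=point-of-sale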
Offline-Resilient Configuration
Handle intermittent connectivity with retry and buffering:
input:
subprocess:
name: oc
args: [logs, --all-containers, --prefix, --follow, --all-namespaces]
codec: lines
restart_on_exit: true
pipeline:
processors:
- mapping: |
root = this
root.cluster = env("CLUSTER_NAME")
root.timestamp = now()
# Buffer for offline periods
buffer:
system_window:
timestamp_mapping: 'root = this.timestamp'
size: 1h
output:
retry:
max_retries: 10
backoff:
initial_interval: 30s
max_interval: 10m
output:
aws_s3:
bucket: sno-logs
path: 'logs/${! env("CLUSTER_NAME") }/${! timestamp_unix() }.jsonl'
batching:
count: 5000
period: 5m
What this does:
- Buffers up to 1 hour of logs locally
- Retries S3 writes up to 10 times
- Uses exponential backoff (30s → 10m)
- Queues logs during network outages
- Automatically catches up when connectivity returns
Service Account Setup
Create RBAC for Expanso to read cluster data.
Service Account:
apiVersion: v1
kind: ServiceAccount
metadata:
name: expanso-edge
namespace: expanso-system
ClusterRole:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: expanso-edge-reader
rules:
- apiGroups: [""]
resources: ["pods", "pods/log", "nodes", "namespaces"]
verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
resources: ["deployments", "daemonsets", "statefulsets"]
verbs: ["get", "list"]
- apiGroups: ["config.openshift.io"]
resources: ["clusteroperators"]
verbs: ["get", "list"]
ClusterRoleBinding:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: expanso-edge-reader
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: expanso-edge-reader
subjects:
- kind: ServiceAccount
name: expanso-edge
namespace: expanso-system
Apply all three with:
oc apply -f expanso-serviceaccount.yaml
oc apply -f expanso-clusterrole.yaml
oc apply -f expanso-clusterrolebinding.yaml
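To confirm the permissions are in place, check access as the Expanso service account (the same check appears under Troubleshooting below):
oc auth can-i get pods --all-namespaces --as=system:serviceaccount:expanso-system:expanso-edge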
Best Practices for SNO
1. Resource Allocation
# Expanso uses minimal resources on your SNO node
resources:
requests:
cpu: 100m # 0.1 CPU cores
memory: 128Mi # 128 MB RAM
limits:
cpu: 500m # 0.5 CPU cores max
memory: 512Mi # 512 MB RAM max
These conservative limits ensure Expanso doesn't impact your application workloads.
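These settings go under the container entry of the DaemonSet manifest shown earlier, for example:
containers:
  - name: expanso-edge
    image: ghcr.io/expanso-io/expanso-edge:nightly
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 512Mi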
2. Use Batch Processing
output:
aws_s3:
batching:
count: 5000 # Larger batches for SNO
period: 10m # Longer periods to reduce network
Reduces network overhead, which is critical for bandwidth-constrained edge deployments.
3. Filter Logs Early
pipeline:
processors:
# Only send WARN and ERROR logs
- mapping: |
root = if this.level.or("info").lowercase().re_match("warn|error|fatal") { this } else { deleted() }
Saves bandwidth and storage costs.
4. Add Location Context
processors:
- mapping: |
root.cluster_name = env("CLUSTER_NAME")
root.location = env("LOCATION")
root.deployment_type = "single-node-openshift"
Essential for multi-site deployments.
Troubleshooting
oc Command Not Found
Solution: Use the full path to the oc binary, or install the OpenShift CLI in the Expanso container image:
# Add to Dockerfile
RUN curl -LO https://mirror.openshift.com/pub/openshift-v4/clients/ocp/stable/openshift-client-linux.tar.gz && \
tar -xzf openshift-client-linux.tar.gz -C /usr/local/bin oc
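Build and push the customized image to a registry your cluster can pull from, then point the DaemonSet's image: field at it. The image name below is illustrative:
podman build -t registry.example.com/expanso/expanso-edge-oc:latest .
podman push registry.example.com/expanso/expanso-edge-oc:latest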
Permission Denied
Solution: Verify service account permissions:
oc auth can-i get pods --all-namespaces --as=system:serviceaccount:expanso-system:expanso-edge
High Resource Usage
Solution: Reduce log collection frequency and increase batching:
input:
subprocess:
args:
- logs
- --since=5m # Only last 5 minutes instead of all logs
output:
batching:
count: 10000 # Larger batches
period: 15m # Less frequent writes
Integration with OpenShift Logging
Expanso can complement OpenShift's built-in logging:
# Collect from OpenShift logging stack
input:
subprocess:
name: oc
args:
- logs
- --namespace=openshift-logging
- deployment/cluster-logging-operator
- --follow
codec: lines
restart_on_exit: true
You can also forward logs to external systems that the OpenShift logging stack doesn't support, as shown in the sketch below.
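For example, a minimal sketch that forwards the parsed log stream to a Kafka topic; the broker address and topic name are placeholders:
output:
  kafka:
    addresses:
      - kafka.example.com:9092  # placeholder broker address
    topic: sno-edge-logs  # placeholder topic
    batching:
      count: 1000
      period: 5m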
Next Steps
- K3s Logs: Similar patterns for K3s clusters
- Kubernetes Deployments: Deploy manifests to SNO
- Docker Compose: Manage containers alongside OpenShift
- Subprocess Input: Component reference