OpenShift Single-Node at the Edge

Monitor and manage Single-Node OpenShift (SNO) deployments at edge locations with Expanso. Deploy Expanso directly on the OpenShift node to collect logs, monitor cluster health, and automate operations—all without requiring external infrastructure.

What is Single-Node OpenShift?

Single-Node OpenShift (SNO) is Red Hat's solution for running OpenShift in constrained edge environments where both control plane and worker capabilities run on a single physical or virtual machine.

Ideal for edge scenarios:

  • Confined physical spaces (retail stores, factories, remote sites)
  • Intermittent network connectivity to central data centers
  • Resource-constrained environments
  • Locations requiring zero-touch operations

OpenShift SNO minimum requirements:

  • vCPU: 8
  • RAM: 16 GB
  • Storage: 120 GB
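
To see what a node actually reports against these minimums, one quick check is:

# List the node and inspect its reported capacity (CPU, memory, storage)
oc get nodes
oc describe node | grep -A 8 Capacity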

Expanso Resource Usage

Expanso Edge runs as a lightweight container on your SNO node, requiring only 0.5 CPU, 64MB RAM, and 150MB disk—a tiny fraction of the node's total resources.


Why Use Expanso with Single-Node OpenShift?

Challenge: SNO deployments at edge locations need monitoring and log collection, but network connectivity may be intermittent.

Solution: Deploy Expanso on the SNO node itself to collect logs and metrics locally, then batch and send to central storage when connectivity is available.

Benefits:

  • Minimal footprint: Uses <1% of SNO node resources
  • Offline capable: Queues data when network is down
  • Automatic batching: Optimizes network usage
  • No external dependencies: Self-contained operation
  • Local deployment: Runs directly on the OpenShift node

Deploy Expanso on Single-Node OpenShift

Deploy the Expanso Edge agent as a DaemonSet on your SNO cluster:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: expanso-edge
  namespace: expanso-system
spec:
  selector:
    matchLabels:
      app: expanso-edge
  template:
    metadata:
      labels:
        app: expanso-edge
    spec:
      serviceAccountName: expanso-edge
      hostNetwork: true
      containers:
        - name: expanso-edge
          image: ghcr.io/expanso-io/expanso-edge:nightly
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: CLUSTER_NAME
              value: "sno-retail-001"
            - name: LOCATION
              value: "store-chicago-north"
          volumeMounts:
            - name: config
              mountPath: /etc/expanso/pipeline.yaml
              subPath: pipeline.yaml
            - name: kubeconfig
              mountPath: /root/.kube/config
              subPath: config
      volumes:
        - name: config
          configMap:
            name: expanso-pipeline
        - name: kubeconfig
          secret:
            secretName: expanso-kubeconfig
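
The DaemonSet above expects the expanso-system namespace, a ConfigMap holding the pipeline configuration, and a Secret containing a kubeconfig. One way to create them before applying the manifest (the local file names below are placeholders for your own files):

# Create the namespace and the objects the DaemonSet mounts
oc new-project expanso-system

# Pipeline configuration, mounted at /etc/expanso/pipeline.yaml
oc create configmap expanso-pipeline \
  --from-file=pipeline.yaml=./pipeline.yaml -n expanso-system

# Kubeconfig the agent uses to run oc commands
oc create secret generic expanso-kubeconfig \
  --from-file=config=./expanso-kubeconfig -n expanso-system

# Deploy the agent
oc apply -f expanso-daemonset.yaml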

Collect OpenShift Logs

Stream logs from all pods in the SNO cluster to S3:

input:
  subprocess:
    name: oc
    args:
      - logs
      - --all-containers=true
      - --prefix=true
      - --follow
      - --all-namespaces
      - --since=10m
    codec: lines
    restart_on_exit: true

pipeline:
  processors:
    # Parse the oc log prefix: [namespace/pod/container] message
    - mapping: |
        let line = content().string()
        root.raw_log = $line
        root.timestamp = now()

        # Extract metadata from the prefix
        let parts = $line.re_find_all_submatch("^\\[([^/]+)/([^/]+)/([^\\]]+)\\] (.*)$")
        root.namespace = $parts.0.1
        root.pod = $parts.0.2
        root.container = $parts.0.3
        root.message = $parts.0.4

        # Add SNO cluster context
        root.node_name = env("NODE_NAME")
        root.cluster_name = env("CLUSTER_NAME")
        root.location = env("LOCATION")
        root.deployment_type = "single-node-openshift"

output:
  aws_s3:
    bucket: edge-openshift-logs
    path: 'sno/${! env("CLUSTER_NAME") }/${! now().ts_format("2006-01-02") }/${! json("namespace") }.jsonl'
    batching:
      count: 1000
      period: 5m
      processors:
        - archive:
            format: concatenate

What this does:

  • Follows logs from all pods and containers
  • Parses namespace, pod, container metadata
  • Adds SNO-specific context (node, cluster, location)
  • Batches logs to minimize network usage
  • Writes to S3 organized by cluster and date
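
As a concrete example (pod name and log message are illustrative), a prefixed line such as

[openshift-dns/dns-default-x7k2p/dns] shutting down idle connection

is mapped to a record along these lines:

{
  "raw_log": "[openshift-dns/dns-default-x7k2p/dns] shutting down idle connection",
  "timestamp": "2025-06-01T10:30:00Z",
  "namespace": "openshift-dns",
  "pod": "dns-default-x7k2p",
  "container": "dns",
  "message": "shutting down idle connection",
  "node_name": "sno-node-1",
  "cluster_name": "sno-retail-001",
  "location": "store-chicago-north",
  "deployment_type": "single-node-openshift"
}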

Monitor Cluster Health

Check SNO cluster health and send metrics to central monitoring:

input:
  generate:
    interval: 60s
    # Stamp run context as metadata so it survives the command processors below
    mapping: |
      meta check_time = now()
      meta cluster = env("CLUSTER_NAME")
      root = {}

pipeline:
  processors:
    # Check node status
    - command:
        name: oc
        args_mapping: '["get", "nodes", "-o", "json"]'

    # The command output replaces the message body, so keep results in metadata
    - mapping: |
        let nodes = content().parse_json().items
        meta node_name = $nodes.index(0).metadata.name
        meta node_ready = $nodes.all(n ->
          n.status.conditions.any(c -> c.type == "Ready" && c.status == "True")
        )

    # Check cluster operators
    - command:
        name: oc
        args_mapping: '["get", "clusteroperators", "-o", "json"]'

    - mapping: |
        let degraded = content().parse_json().items.filter(op ->
          op.status.conditions.any(c -> c.type == "Degraded" && c.status == "True")
        ).map_each(op -> op.metadata.name)
        meta degraded_operators = $degraded
        meta all_operators_healthy = $degraded.length() == 0

    # Check pod status across namespaces
    - command:
        name: oc
        args_mapping: '["get", "pods", "--all-namespaces", "-o", "json"]'

    - mapping: |
        let pods = content().parse_json().items
        # CrashLoopBackOff surfaces as a container waiting reason, not a pod phase
        let failed = $pods.filter(p ->
          p.status.phase == "Failed" ||
          p.status.containerStatuses.or([]).any(c -> c.state.waiting.reason.or("") == "CrashLoopBackOff")
        ).map_each(p -> {
          "namespace": p.metadata.namespace,
          "name": p.metadata.name,
          "phase": p.status.phase
        })
        meta total_pods = $pods.length()
        meta running_pods = $pods.filter(p -> p.status.phase == "Running").length()
        meta failed_pods = $failed

    # Aggregate health status
    - mapping: |
        root.health_report = {
          "cluster": @cluster,
          "location": env("LOCATION"),
          "timestamp": @check_time,
          "node_name": @node_name,
          "node_ready": @node_ready,
          "operators_healthy": @all_operators_healthy,
          "degraded_operators": @degraded_operators,
          "total_pods": @total_pods,
          "running_pods": @running_pods,
          "failed_pods": @failed_pods,
          "cluster_healthy": @node_ready && @all_operators_healthy && @failed_pods.length() == 0
        }

output:
  switch:
    cases:
      # Alert if unhealthy
      - check: '!this.health_report.cluster_healthy'
        output:
          broker:
            pattern: fan_out
            outputs:
              # Send alert
              - http_client:
                  url: https://alerts.company.com/sno-health
                  verb: POST
                  headers:
                    Content-Type: application/json
              # Log alert
              - aws_s3:
                  bucket: sno-health-alerts
                  path: 'alerts/${! env("CLUSTER_NAME") }/${! timestamp_unix() }.json'

      # Normal health metrics
      - output:
          http_client:
            url: https://metrics.company.com/sno-health
            verb: POST
            batching:
              count: 10
              period: 5m
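
The health_report document posted to the metrics or alert endpoint looks roughly like this (values are illustrative):

{
  "health_report": {
    "cluster": "sno-retail-001",
    "location": "store-chicago-north",
    "timestamp": "2025-06-01T10:30:00Z",
    "node_name": "sno-node-1",
    "node_ready": true,
    "operators_healthy": true,
    "degraded_operators": [],
    "total_pods": 87,
    "running_pods": 87,
    "failed_pods": [],
    "cluster_healthy": true
  }
}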

Monitor Resource Usage

Track CPU and memory usage on the SNO node:

input:
  generate:
    interval: 60s
    mapping: 'root = {}'

pipeline:
  processors:
    # Get node resource usage
    - command:
        name: oc
        args_mapping: '["adm", "top", "node", "--no-headers"]'

    # Parse: node-name CPU(cores) CPU% MEMORY(bytes) MEMORY%
    # Keep the results in metadata so they survive the next command processor
    - mapping: |
        let parts = content().string().trim().re_find_all("\\S+")
        meta node_name = $parts.0
        meta cpu_cores = $parts.1
        meta cpu_percent = $parts.2.trim("%").number()
        meta memory_bytes = $parts.3
        meta memory_percent = $parts.4.trim("%").number()

    # Get pod resource usage
    - command:
        name: oc
        args_mapping: '["adm", "top", "pods", "--all-namespaces", "--no-headers"]'

    # Parse per-pod rows: namespace pod CPU(cores) MEMORY(bytes)
    - mapping: |
        let rows = content().string().split("\n").filter(l -> l.trim() != "").map_each(
          l -> l.re_find_all("\\S+")
        )

        root.pod_metrics = $rows.map_each(parts -> {
          "namespace": parts.0,
          "pod": parts.1,
          "cpu": parts.2,
          "memory": parts.3
        })

        # Aggregate pod counts by namespace
        root.namespace_usage = $rows.map_each(parts -> parts.0).unique().map_each(ns -> {
          "namespace": ns,
          "pods": $rows.filter(parts -> parts.0 == ns).length()
        })

    # Check for resource pressure
    - mapping: |
        root = this
        root.node_name = @node_name
        root.cpu_cores = @cpu_cores
        root.cpu_percent = @cpu_percent
        root.memory_bytes = @memory_bytes
        root.memory_percent = @memory_percent
        root.cluster = env("CLUSTER_NAME")
        root.timestamp = now()
        root.resource_alert = {
          "high_cpu": @cpu_percent > 80,
          "high_memory": @memory_percent > 85,
          "cluster": env("CLUSTER_NAME"),
          "timestamp": now()
        }

output:
  broker:
    pattern: fan_out
    outputs:
      # Send metrics
      - http_client:
          url: https://metrics.company.com/sno-resources
          verb: POST
          batching:
            count: 20
            period: 5m

      # Alert on high resource usage
      - switch:
          cases:
            - check: 'this.resource_alert.high_cpu || this.resource_alert.high_memory'
              output:
                http_client:
                  url: https://alerts.company.com/sno-resources
                  verb: POST
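
For context, the two oc adm top commands return whitespace-separated rows like the following (values are illustrative), which is what the parsing mappings above split apart:

# oc adm top node --no-headers
sno-node-1   2250m   28%   9875Mi   61%

# oc adm top pods --all-namespaces --no-headers
openshift-dns   dns-default-x7k2p     12m   45Mi
production      point-of-sale-7d9fq   85m   210Mi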

Collect Specific Application Logs

Focus on logs from specific namespaces (e.g., production apps):

input:
  subprocess:
    name: oc
    args:
      - logs
      - --namespace=production
      - --all-containers=true
      - --prefix=true
      - --follow
      - --selector=app=point-of-sale
    codec: lines
    restart_on_exit: true

pipeline:
  processors:
    - mapping: |
        # Parse JSON logs where possible, otherwise wrap the raw line
        root = content().string().parse_json().catch({
          "message": content().string(),
          "level": "info"
        })
        root.cluster = env("CLUSTER_NAME")
        root.location = env("LOCATION")
        root.app = "point-of-sale"
        root.timestamp = now()

output:
  broker:
    pattern: fan_out
    outputs:
      # Real-time to Elasticsearch
      - elasticsearch_v2:
          urls: ['https://elasticsearch.company.com:9200']
          index: 'sno-pos-logs-${! now().ts_format("2006-01-02") }'
          batching:
            count: 100
            period: 10s

      # Archive to S3
      - aws_s3:
          bucket: sno-app-logs
          path: 'pos/${! env("CLUSTER_NAME") }/${! timestamp_unix() }.jsonl'
          batching:
            count: 5000
            period: 10m

Offline-Resilient Configuration

Handle intermittent connectivity with retry and buffering:

input:
  subprocess:
    name: oc
    args: [logs, --all-containers, --prefix, --follow, --all-namespaces]
    codec: lines
    restart_on_exit: true

pipeline:
  processors:
    - mapping: |
        root.message = content().string()
        root.cluster = env("CLUSTER_NAME")
        root.timestamp = now()

# Buffer for offline periods
buffer:
  system_window:
    timestamp_mapping: 'root = this.timestamp'
    size: 1h

output:
  retry:
    max_retries: 10
    backoff:
      initial_interval: 30s
      max_interval: 10m
    output:
      aws_s3:
        bucket: sno-logs
        path: 'logs/${! env("CLUSTER_NAME") }/${! timestamp_unix() }.jsonl'
        batching:
          count: 5000
          period: 5m

What this does:

  • Buffers up to 1 hour of logs locally
  • Retries S3 writes up to 10 times
  • Uses exponential backoff (30s → 10m)
  • Queues logs during network outages
  • Automatically catches up when connectivity returns
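
For outages longer than the buffer window, one option is to replace the retry output with a fallback that spills batches to a local file and replays them later. This is a minimal sketch assuming a Benthos-style fallback output is available in your Expanso Edge build and that the spill path is writable (both are assumptions, not part of the configuration above):

output:
  fallback:
    # Try S3 first
    - aws_s3:
        bucket: sno-logs
        path: 'logs/${! env("CLUSTER_NAME") }/${! timestamp_unix() }.jsonl'
        batching:
          count: 5000
          period: 5m
    # If S3 stays unreachable, append to a local spill file
    - file:
        path: /var/lib/expanso/spill/${! env("CLUSTER_NAME") }.jsonl
        codec: lines

Replaying the spill file after connectivity returns is a separate step (for example, a second pipeline reading it with a file input).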

Service Account Setup

Create RBAC for Expanso to read cluster data.

Service Account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: expanso-edge
  namespace: expanso-system

ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: expanso-edge-reader
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log", "nodes", "namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "daemonsets", "statefulsets"]
    verbs: ["get", "list"]
  - apiGroups: ["config.openshift.io"]
    resources: ["clusteroperators"]
    verbs: ["get", "list"]
  # Needed for the oc adm top examples above
  - apiGroups: ["metrics.k8s.io"]
    resources: ["nodes", "pods"]
    verbs: ["get", "list"]

ClusterRoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: expanso-edge-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: expanso-edge-reader
subjects:
  - kind: ServiceAccount
    name: expanso-edge
    namespace: expanso-system

Apply all three with:

oc apply -f expanso-serviceaccount.yaml
oc apply -f expanso-clusterrole.yaml
oc apply -f expanso-clusterrolebinding.yaml

Best Practices for SNO

1. Resource Allocation

# Expanso uses minimal resources on your SNO node
resources:
  requests:
    cpu: 100m        # 0.1 CPU cores
    memory: 128Mi    # 128 MB RAM
  limits:
    cpu: 500m        # 0.5 CPU cores max
    memory: 512Mi    # 512 MB RAM max

These conservative limits ensure Expanso doesn't impact your application workloads.
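
Once the agent is running, you can confirm that its actual consumption stays well below these limits (assuming the cluster metrics API is available):

oc adm top pods -n expanso-system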

2. Use Batch Processing

output:
  aws_s3:
    batching:
      count: 5000   # Larger batches for SNO
      period: 10m   # Longer periods to reduce network

This reduces network overhead, which is critical for edge deployments with intermittent connectivity.

3. Filter Logs Early

pipeline:
  processors:
    # Only send WARN, ERROR, and FATAL logs; drop everything else
    - mapping: |
        root = if !["warn", "error", "fatal"].contains(this.level.or("info").lowercase()) { deleted() }

Saves bandwidth and storage costs.

4. Add Location Context

processors:
  - mapping: |
      root.cluster_name = env("CLUSTER_NAME")
      root.location = env("LOCATION")
      root.deployment_type = "single-node-openshift"

Essential for multi-site deployments.


Troubleshooting

oc Command Not Found

Solution: Use the full path to the oc binary, or install the OpenShift CLI in the Expanso container:

# Add to Dockerfile
RUN curl -LO https://mirror.openshift.com/pub/openshift-v4/clients/ocp/stable/openshift-client-linux.tar.gz && \
    tar -xzf openshift-client-linux.tar.gz -C /usr/local/bin oc

Permission Denied

Solution: Verify service account permissions:

oc auth can-i get pods --all-namespaces --as=system:serviceaccount:expanso-system:expanso-edge

High Resource Usage

Solution: Reduce log collection frequency and increase batching:

input:
  subprocess:
    args:
      - logs
      - --since=5m    # Only last 5 minutes instead of all logs

output:
  batching:
    count: 10000   # Larger batches
    period: 15m    # Less frequent writes

Integration with OpenShift Logging

Expanso can complement OpenShift's built-in logging:

# Collect from OpenShift logging stack
input:
  subprocess:
    name: oc
    args:
      - logs
      - --namespace=openshift-logging
      - deployment/cluster-logging-operator
      - --follow
    codec: lines
    restart_on_exit: true

Or forward to external systems that OpenShift logging doesn't support.


Next Steps


Additional Resources