Deploy to Your First Edge Node
In the Getting Started tutorial, you ran both the orchestrator and edge node on your local machine. That's perfect for learning the basics, but production edge computing is different—you'll deploy jobs to remote machines with real network challenges, firewall rules, and intermittent connectivity.
In this tutorial, you'll set up a real edge node on a separate Linux server or VM, configure it to connect to your orchestrator across a network, and deploy a data processing job that continues working even when network connectivity is disrupted. By the end, you'll understand how Expanso's edge architecture handles the realities of distributed computing.
This tutorial takes about 30-40 minutes to complete.
Deploying as a Kubernetes Sidecar
If you're running workloads in Kubernetes, you can deploy Expanso Edge as a sidecar container alongside your existing application pods. This approach leverages standard Kubernetes patterns to collect logs, metrics, or other telemetry without managing separate edge servers.
Replace the placeholder values:
- YOUR_BOOTSTRAP_TOKEN: the bootstrap token for your Expanso network.
- your-app-image: your application container image.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: your-app-image:latest
        - name: expanso-edge
          image: ghcr.io/expanso-io/expanso-edge:latest
          args:
            - bootstrap
            - --token
            - YOUR_BOOTSTRAP_TOKEN
            - run
          volumeMounts:
            - name: expanso-cache
              mountPath: /var/lib/expanso
      volumes:
        - name: expanso-cache
          emptyDir: {}
- Prefer stdout/stderr: Kubernetes captures container stdout/stderr logs by default. Emitting logs there integrates with cluster logging drivers.
- Use a shared volume for file logs: If your app writes logs to files, mount them on an emptyDir (or hostPath for host-level logs) and point Expanso at those paths.
What You'll Learn
- How to prepare a remote Linux machine as an edge node
- How to install and configure the Expanso edge binary on a remote system
- How to set up network connectivity and authentication between nodes and orchestrator
- How to use node labels for targeted job deployment
- How to monitor edge node health and connectivity
- How to test and verify autonomous operation during network issues
- How to troubleshoot common connectivity problems
Prerequisites
Before starting, make sure you have:
- A running Expanso orchestrator (from the Getting Started tutorial)
- A separate Linux machine or VM for the edge node (Ubuntu 20.04+, Debian 11+, or similar)
- SSH access to the remote machine
- Network connectivity between the orchestrator and edge node
- Basic familiarity with Linux command line
- Firewall configuration access (if applicable)
You can use a cloud VM (AWS EC2, GCP Compute, Azure VM) or a physical server. The steps are identical. Expanso Edge has a minimal footprint (0.5 CPU, 64MB RAM, 150MB disk) and runs on virtually any modern hardware.
Step 1: Prepare Your Edge Machine
First, let's prepare your remote machine for Expanso.
SSH into your edge machine:
Verify system compatibility:
# Verify Linux kernel version (3.10+ required)
uname -r
# Check available disk space for buffering
df -h /var/lib
# Verify network connectivity to orchestrator
ping -c 4 orchestrator.example.com
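If you prepare many machines, the kernel and disk checks above can be scripted. The following is a minimal sketch, not part of Expanso: the function names `kernel_at_least` and `free_gib` are hypothetical, and the 3.10 minimum is taken from the comment above.

```python
import re
import shutil


def kernel_at_least(release: str, minimum: tuple = (3, 10)) -> bool:
    """Return True if a `uname -r` style string meets the minimum version."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


def free_gib(path: str) -> float:
    """Free space on the filesystem that contains `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30
```

On the edge machine, `platform.release()` from the standard library returns the same string as `uname -r`, so `kernel_at_least(platform.release())` and `free_gib("/var/lib")` cover the first two checks.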
Create the Expanso data directory:
sudo mkdir -p /var/lib/expanso
sudo mkdir -p /etc/expanso
sudo chown $USER:$USER /var/lib/expanso /etc/expanso
This directory will store pipeline configurations, state, and buffered data when the node operates offline.
Following Linux filesystem hierarchy standards, /var/lib/expanso stores variable application data that persists across reboots. This is where buffered messages, local state, and temporary pipeline data live during normal operation and network outages.
Step 2: Install the Expanso Edge Binary
Now let's install the edge node software on your remote machine.
Download the latest edge binary:
curl -L https://github.com/expanso-io/expanso/releases/latest/download/expanso-edge-linux-amd64 \
-o /tmp/expanso-edge
chmod +x /tmp/expanso-edge
sudo mv /tmp/expanso-edge /usr/local/bin/expanso-edge
Verify the installation:
expanso-edge version
You should see output like:
Expanso Edge v0.8.0
Build: abc123def456
Go: 1.21.5
Create a systemd service (optional but recommended for production):
Download the service file:
sudo curl -o /etc/systemd/system/expanso-edge.service https://docs.expanso.io/examples/deployment/expanso-edge.service
sudo systemctl daemon-reload
We'll configure and start this service shortly. Defining it now means that once you enable it in Step 6, your edge node will start automatically when the machine boots.
Step 3: Configure Network Connectivity
Edge nodes connect to the orchestrator via gRPC on port 9090. Let's confirm that the orchestrator is listening and that the edge node can reach it.
On your orchestrator machine, verify the gRPC port is accessible:
# Check that the orchestrator is listening
sudo netstat -tulpn | grep 9090
You should see:
tcp6 0 0 :::9090 :::* LISTEN 12345/expanso-orch
Configure firewall rules on the orchestrator:
If you're running a firewall, you'll need to allow inbound connections on port 9090:
# For ufw (Ubuntu/Debian)
sudo ufw allow 9090/tcp
sudo ufw status
# For firewalld (RHEL/CentOS)
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --reload
Test connectivity from the edge node:
Back on your edge machine, verify you can reach the orchestrator:
# Test basic connectivity
nc -zv orchestrator.example.com 9090
# You should see:
# Connection to orchestrator.example.com 9090 port [tcp/*] succeeded!
If you're using cloud VMs, remember to configure security groups or network ACLs to allow traffic between the orchestrator and edge nodes. The edge node needs to initiate outbound connections to the orchestrator on port 9090.
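If nc is not installed on the edge machine, the same outbound TCP check can be sketched with Python's standard library. This only verifies that the port accepts a TCP connection; it does not speak gRPC or validate TLS. The function name `can_connect` is illustrative.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_connect("orchestrator.example.com", 9090)` returning False points to a firewall, security group, or DNS problem rather than an Expanso configuration issue.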
Step 4: Generate a Bootstrap Token
Edge nodes authenticate to the orchestrator using bootstrap tokens during initial registration. Let's generate one.
On your local machine (where you run the Expanso CLI):
expanso token create \
--type=bootstrap \
--expires-in=1h \
--description="Production edge node in Seattle datacenter"
You'll get output like:
Bootstrap Token: ebt_v1_a4b5c6d7e8f9g0h1i2j3k4l5m6n7o8p9
Expires: 2025-10-20T16:30:00Z (in 1 hour)
Description: Production edge node in Seattle datacenter
Keep this token secure. It can only be used once to register a new node.
Copy this token—you'll need it in the next step.
Bootstrap tokens are short-lived (1 hour by default) and single-use. Once a node registers successfully, it receives long-term credentials (API keys or mTLS certificates) and no longer needs the bootstrap token. This limits the blast radius if a token is compromised.
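To make the short-lived, single-use semantics concrete, here is a toy in-memory model. It is an assumption-laden sketch and not how the orchestrator is implemented: the class `BootstrapTokenStore` is hypothetical, and the `ebt_v1_` prefix merely mirrors the sample output above.

```python
import secrets
from datetime import datetime, timedelta, timezone


class BootstrapTokenStore:
    """Toy model of short-lived, single-use tokens (illustrative only)."""

    def __init__(self) -> None:
        self._expiry = {}  # token -> expiry time; entry is removed when redeemed

    def create(self, ttl: timedelta = timedelta(hours=1), now=None) -> str:
        now = now or datetime.now(timezone.utc)
        token = "ebt_v1_" + secrets.token_hex(16)
        self._expiry[token] = now + ttl
        return token

    def redeem(self, token: str, now=None) -> bool:
        """Return True at most once per token, and only before it expires."""
        now = now or datetime.now(timezone.utc)
        expires_at = self._expiry.pop(token, None)  # pop enforces single use
        return expires_at is not None and now < expires_at
```

Because `redeem` removes the token on the first attempt, a leaked token is useless once the intended node has registered, and an unused one stops working after its TTL.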
Step 5: Configure the Edge Node
Now let's create a configuration file for your edge node with the bootstrap token and connection details.
Create the edge configuration file:
Download the template configuration:
sudo curl -o /etc/expanso/edge-config.yaml https://docs.expanso.io/examples/deployment/edge-config.yaml
Important: Edit the file to customize:
- node.hostname: Your edge node's hostname
- node.labels: Labels for job targeting (region, datacenter, environment, etc.)
- orchestrator.endpoint: Your orchestrator's address
- orchestrator.bootstrap_token: The token you generated in Step 4
Let's break down the key sections:
Node identity: The hostname is human-friendly but doesn't need to be globally unique. The orchestrator will assign a unique UUID during registration. The labels are crucial—you'll use these to target jobs to specific nodes.
Orchestrator connection: Your edge node connects to the orchestrator at the specified endpoint using the bootstrap token for initial authentication. After registration, the token is replaced with long-term credentials.
Local configuration: The data_dir stores pipeline state and buffers messages during network outages. The max_buffer_size prevents the node from filling the disk if it's offline for extended periods.
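Before starting the node, it can help to confirm that the four fields you were asked to customize are actually filled in. The sketch below assumes the YAML file has already been loaded into a nested dict (for example with a YAML library of your choice); the helper `missing_fields` is hypothetical and not part of the Expanso CLI.

```python
REQUIRED_FIELDS = (
    "node.hostname",
    "node.labels",
    "orchestrator.endpoint",
    "orchestrator.bootstrap_token",
)


def missing_fields(config: dict) -> list:
    """Return the dotted paths from REQUIRED_FIELDS that are absent or empty."""
    missing = []
    for path in REQUIRED_FIELDS:
        value = config
        for key in path.split("."):
            # Walk one level down; stop descending if the parent is not a mapping.
            value = value.get(key) if isinstance(value, dict) else None
        if not value:
            missing.append(path)
    return missing
```

An empty result means all four fields are present; anything else lists what still needs editing.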
Step 6: Start the Edge Node
With configuration in place, let's start the edge node and verify it connects successfully.
Start the edge node:
expanso-edge run --config=/etc/expanso/edge-config.yaml
Watch the output carefully. You should see a sequence like this:
INFO Starting Expanso Edge Node
INFO Node hostname: edge-seattle-01
INFO Labels: datacenter=seattle, environment=production, hardware=cpu, region=us-west
INFO Connecting to orchestrator at orchestrator.example.com:9090
INFO Registering with orchestrator using bootstrap token
INFO Registration successful, assigned node ID: node-a1b2c3d4e5f6
INFO Bootstrap token consumed, long-term credentials stored
INFO Credential file: /var/lib/expanso/.credentials
INFO Starting session: session-node-a1b2c3d4e5f6-1729438800
INFO Heartbeat: healthy
INFO Connected and ready to accept jobs
Perfect! Your edge node is now registered and connected. Let's verify from the orchestrator side.
In a new terminal on your local machine, check the orchestrator:
expanso node list
You should see your new edge node:
NODE ID HOSTNAME STATUS LABELS LAST SEEN
node-a1b2c3d4e5f6 edge-seattle-01 healthy datacenter=seattle,environment=production,... 2s ago
Get detailed node information:
expanso node describe node-a1b2c3d4e5f6
You'll see comprehensive details:
Node: node-a1b2c3d4e5f6
Hostname: edge-seattle-01
Status: healthy
Labels:
  datacenter: seattle
  environment: production
  hardware: cpu
  region: us-west
Connection:
  Session: session-node-a1b2c3d4e5f6-1729438800
  Last Heartbeat: 5 seconds ago
  Uptime: 2 minutes
Capabilities:
  OS: linux
  Architecture: amd64
  Agent Version: v0.8.0
Resources:
  CPU Cores: 4
  Memory: 8192 MB
  Max Pipelines: 10
Jobs: 0 running
Excellent! Your edge node is fully operational.
For production deployments, use the systemd service we created earlier. Stop the foreground process (Ctrl+C) and start the service:
sudo systemctl enable expanso-edge
sudo systemctl start expanso-edge
sudo systemctl status expanso-edge
This ensures the edge node starts automatically on boot and restarts if it crashes.
Step 7: Deploy a Job to the Edge Node
Now let's deploy a real data processing job that targets your specific edge node using labels.
Create a job that processes syslog messages:
Download the example job configuration:
curl -o edge-syslog-processor.yaml https://docs.expanso.io/examples/deployment/edge-syslog-processor.yaml
Important: Edit the file to customize:
- spec.selector.match_labels: Adjust to match your node labels
- config.output.http_client.url: Your logging ingestion endpoint
- config.output.http_client.headers.Authorization: Your authentication token
This job does several important things:
- Selective targeting: Only deploys to nodes with labels region=us-west AND environment=production
- Resilient processing: Reads local syslog, parses structured data, and filters for important messages
- Dual output: Sends to central logging but also buffers locally if the network is down
- Edge enrichment: Adds node identity to messages, which is crucial for multi-site deployments
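The AND semantics of label selectors can be sketched in a few lines. This is a generic illustration of equality-based label matching, not Expanso's scheduler; `matches` and `select_nodes` are hypothetical names.

```python
def matches(selector: dict, node_labels: dict) -> bool:
    """True when every selector key/value pair is present on the node (logical AND)."""
    return all(node_labels.get(key) == value for key, value in selector.items())


def select_nodes(selector: dict, nodes: dict) -> list:
    """Return the IDs of nodes whose labels satisfy the selector, sorted."""
    return sorted(node_id for node_id, labels in nodes.items() if matches(selector, labels))
```

A node may carry extra labels (such as datacenter or hardware) and still match; it is excluded only when a selector key is missing or has a different value.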
Deploy the job:
expanso job deploy edge-syslog-processor.yaml
You'll see:
✓ Job 'syslog-processor' created
✓ Evaluation triggered
✓ Scheduling to matching nodes...
✓ Deployed to 1 node (node-a1b2c3d4e5f6)
Verify the job is running on your edge node:
expanso job executions syslog-processor
You should see:
Job: syslog-processor
Type: pipeline
Executions: 1
Node: node-a1b2c3d4e5f6 (edge-seattle-01)
Execution ID: exec-xyz789
Status: running
Version: 1
Started: 15 seconds ago
Health: healthy
The pipeline is now processing syslog messages on your edge node in real-time!
Step 8: Monitor Edge Node Connectivity
One of Expanso's key features is handling network disruptions gracefully. Let's monitor how the system tracks connectivity.
Watch real-time heartbeats:
# Stream node events
expanso node events node-a1b2c3d4e5f6 --follow
You'll see periodic heartbeat confirmations:
2025-10-20T15:45:00Z HEARTBEAT seq=120 status=healthy jobs=1
2025-10-20T15:45:30Z HEARTBEAT seq=121 status=healthy jobs=1
2025-10-20T15:46:00Z HEARTBEAT seq=122 status=healthy jobs=1
The seq number increments with each heartbeat. A gap in sequence numbers indicates missed heartbeats due to network issues.
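If you save the event stream to a file, gaps in the sequence can be found mechanically. The sketch below assumes the line format shown in the sample output above (`HEARTBEAT seq=<n>`); the real event format may differ, and `find_gaps` is a hypothetical helper.

```python
import re

SEQ_PATTERN = re.compile(r"\bHEARTBEAT seq=(\d+)")


def find_gaps(lines):
    """Return (last_seen, next_seen) pairs where sequence numbers skip ahead."""
    gaps = []
    previous = None
    for line in lines:
        match = SEQ_PATTERN.search(line)
        if match is None:
            continue  # ignore non-heartbeat events
        seq = int(match.group(1))
        if previous is not None and seq > previous + 1:
            gaps.append((previous, seq))
        previous = seq
    return gaps
```

A sequence that drops back to a lower number is not reported as a gap here, since that pattern indicates a new session rather than missed heartbeats.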
Check connection details:
expanso node connection node-a1b2c3d4e5f6
You'll see connection statistics:
Node: node-a1b2c3d4e5f6
Status: connected
Current Session:
  Session ID: session-node-a1b2c3d4e5f6-1729438800
  Started: 10 minutes ago
  Heartbeats Sent: 20
  Heartbeats Missed: 0
  Last Seen: 5 seconds ago
Connection Quality:
  Latency (avg): 12ms
  Latency (p95): 18ms
  Packet Loss: 0.0%
Lifetime Statistics:
  Total Sessions: 1
  Total Uptime: 10 minutes
  Total Downtime: 0 seconds
This gives you insight into network quality and node reliability.
In production, export these metrics to Prometheus or your monitoring system. Expanso exposes a /metrics endpoint on the orchestrator that includes per-node connectivity metrics.
Step 9: Test Network Partition Scenarios
Now let's simulate real-world network issues and verify that your edge node continues operating autonomously.
Scenario 1: Brief Network Interruption
Temporarily block network connectivity from your edge node:
# On the edge machine
sudo iptables -A OUTPUT -p tcp --dport 9090 -j DROP
Watch what happens:
On the orchestrator, monitor the node:
expanso node events node-a1b2c3d4e5f6 --follow
After 3 missed heartbeats (about 90 seconds), you'll see:
2025-10-20T15:47:00Z HEARTBEAT seq=125 status=healthy jobs=1
2025-10-20T15:47:30Z MISSED_HEARTBEAT expected_seq=126 status=degraded
2025-10-20T15:48:00Z MISSED_HEARTBEAT expected_seq=127 status=degraded
2025-10-20T15:48:30Z NODE_DISCONNECTED missed=3 status=offline
The node is marked offline, but here's the critical part: the pipeline keeps running on the edge node. The local buffering mechanism activates, storing processed syslog messages to /var/lib/expanso/buffer/ until connectivity returns.
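The transition from healthy to degraded to offline can be modeled as a function of time since the last heartbeat. This is a simplified sketch built on the figures in this step (30-second heartbeats, offline after 3 misses); the orchestrator's actual logic is not shown in this tutorial and may include grace periods.

```python
HEARTBEAT_INTERVAL_S = 30  # default interval used in this tutorial
OFFLINE_AFTER_MISSED = 3   # assumed threshold, taken from the text above


def missed_heartbeats(seconds_since_last: float) -> int:
    """Number of whole heartbeat intervals that elapsed without a heartbeat."""
    return max(0, int(seconds_since_last // HEARTBEAT_INTERVAL_S))


def classify(seconds_since_last: float) -> str:
    """Map elapsed time since the last heartbeat to a node status."""
    missed = missed_heartbeats(seconds_since_last)
    if missed == 0:
        return "healthy"
    if missed < OFFLINE_AFTER_MISSED:
        return "degraded"
    return "offline"
```

Under these assumptions a node stays degraded between 30 and 89 seconds of silence and is marked offline at 90 seconds.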
Restore connectivity:
# On the edge machine
sudo iptables -D OUTPUT -p tcp --dport 9090 -j DROP
Within seconds, you'll see the node reconnect:
2025-10-20T15:51:00Z RECONNECTED session=session-node-a1b2c3d4e5f6-1729438800 seq=130
2025-10-20T15:51:00Z HEARTBEAT seq=130 status=healthy jobs=1
Notice it's the same session—the node didn't restart, it reconnected. The buffered messages now flush to the central logging system.
Scenario 2: Extended Network Outage
For longer outages (>5 minutes), the edge node creates a new session when connectivity returns. Let's test this:
# On the edge machine, block connectivity for 6 minutes
sudo iptables -A OUTPUT -p tcp --dport 9090 -j DROP
sleep 360
sudo iptables -D OUTPUT -p tcp --dport 9090 -j DROP
When the node reconnects, you'll see:
2025-10-20T16:00:00Z NEW_SESSION session=session-node-a1b2c3d4e5f6-1729439200 reason=prolonged_disconnection
2025-10-20T16:00:00Z HEARTBEAT seq=0 status=healthy jobs=1
The sequence number reset to 0, indicating a new session. The orchestrator synchronizes state and verifies that the job is still running correctly.
Edge nodes maintain their identity (node ID) across sessions. Sessions are logical operational periods, not tied to process lifetime. This design allows the system to track and correlate events while handling network realities.
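The difference between the two scenarios comes down to one decision at reconnect time. The sketch below models it with the 5-minute threshold mentioned above; the `Session` type, the `on_reconnect` function, and the session ID format (copied from the sample logs) are illustrative assumptions.

```python
from dataclasses import dataclass

RESUME_WINDOW_S = 5 * 60  # outages longer than this start a new session


@dataclass
class Session:
    node_id: str
    session_id: str
    seq: int


def on_reconnect(session: Session, outage_seconds: float, now_epoch: int) -> Session:
    """Resume the session after a brief outage; otherwise start a fresh one."""
    if outage_seconds <= RESUME_WINDOW_S:
        return session  # same session ID, sequence numbers continue
    # New session: the node ID is unchanged, the sequence counter restarts at 0.
    new_id = f"session-{session.node_id}-{now_epoch}"
    return Session(node_id=session.node_id, session_id=new_id, seq=0)
```

Keeping the node ID stable while rotating the session ID is what lets events from before and after a long outage be correlated to the same machine.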
Step 10: Verify Autonomous Operation
Let's verify that your edge node truly operates independently during network outages.
Check local buffer during outage:
While connectivity is blocked, SSH to your edge node and check the local buffer:
# On the edge machine
ls -lh /var/lib/expanso/buffer/
# You should see files like:
# -rw-r--r-- 1 expanso expanso 2.1M Oct 20 16:05 syslog-1729439100.jsonl
# -rw-r--r-- 1 expanso expanso 1.8M Oct 20 16:06 syslog-1729439160.jsonl
These are the messages buffered during the outage. View their contents:
tail -5 /var/lib/expanso/buffer/syslog-*.jsonl
You'll see properly formatted, processed syslog entries:
{"timestamp":"Oct 20 16:05:45","hostname":"edge-seattle-01","program":"systemd","message":"Started Daily apt download activities.","node_id":"node-a1b2c3d4e5f6","node_hostname":"edge-seattle-01","ingested_at":"2025-10-20T16:05:45Z"}
{"timestamp":"Oct 20 16:06:02","hostname":"edge-seattle-01","program":"kernel","message":"warning: CPU throttling detected","node_id":"node-a1b2c3d4e5f6","node_hostname":"edge-seattle-01","ingested_at":"2025-10-20T16:06:02Z"}
The pipeline continued processing data locally, even without orchestrator connectivity!
Monitor buffer flush after reconnection:
When connectivity returns, watch the buffer directory:
watch -n 2 'ls -lh /var/lib/expanso/buffer/ | tail -5'
You'll see files disappearing as messages are sent to the central logging system:
total 8.2M
-rw-r--r-- 1 expanso expanso 2.1M Oct 20 16:05 syslog-1729439100.jsonl
-rw-r--r-- 1 expanso expanso 1.8M Oct 20 16:06 syslog-1729439160.jsonl
# ... files gradually delete as they're flushed ...
total 0
This demonstrates Expanso's edge-first architecture: process data locally, buffer it on disk during outages (up to max_buffer_size), and sync when connectivity returns.
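The buffer-then-flush behavior you just observed follows a common store-and-forward pattern. The sketch below is a generic version of that pattern, not Expanso's implementation: `DiskBuffer` is a hypothetical class, the size cap stands in for max_buffer_size, and dropping new records when the cap is reached is an assumption about overflow behavior.

```python
import json
from pathlib import Path
from typing import Callable


class DiskBuffer:
    """Append records to JSONL files while offline; replay oldest-first on flush."""

    def __init__(self, directory: Path, max_bytes: int) -> None:
        self.directory = directory
        self.max_bytes = max_bytes
        directory.mkdir(parents=True, exist_ok=True)

    def _files(self) -> list:
        # Lexicographic order equals chronological order when names end in
        # equal-width timestamps, as in the sample listing above.
        return sorted(self.directory.glob("*.jsonl"))

    def size(self) -> int:
        return sum(f.stat().st_size for f in self._files())

    def append(self, name: str, record: dict) -> bool:
        """Buffer one record; return False (dropping it) if the cap would be exceeded."""
        line = (json.dumps(record) + "\n").encode("utf-8")
        if self.size() + len(line) > self.max_bytes:
            return False
        with open(self.directory / f"{name}.jsonl", "ab") as handle:
            handle.write(line)
        return True

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Send buffered records; delete a file only after all its records succeed."""
        sent = 0
        for path in self._files():
            for line in path.read_text(encoding="utf-8").splitlines():
                if not send(json.loads(line)):
                    return sent  # destination still failing; keep remaining data
                sent += 1
            path.unlink()
        return sent
```

Deleting a file only after every record in it is sent gives at-least-once delivery: if a flush is interrupted partway through a file, that file's records are sent again on the next attempt, so the destination should tolerate duplicates.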
Verification Checklist
Let's verify everything is working correctly:
- ✅ Edge node is installed on a separate Linux machine
- ✅ Network connectivity to orchestrator is configured (firewall, security groups)
- ✅ Edge node successfully registered using bootstrap token
- ✅ Long-term credentials are stored and bootstrap token is consumed
- ✅ Node appears as "healthy" in expanso node list
- ✅ Labels are correctly configured and visible in node details
- ✅ Job deployed successfully to edge node based on label selectors
- ✅ Pipeline is processing data (syslog messages)
- ✅ Heartbeats are consistently received (check sequence numbers)
- ✅ Network interruptions are handled gracefully
- ✅ Data buffers locally during outages
- ✅ Buffered data flushes when connectivity returns
- ✅ Node reconnects automatically after network issues
If all items are checked, congratulations! You have a production-ready edge deployment.
What You Learned
You've accomplished a lot in this tutorial:
- ✅ Set up a production edge node on a remote Linux machine
- ✅ Configured network connectivity and firewall rules between orchestrator and edge
- ✅ Used bootstrap tokens for secure initial registration
- ✅ Configured node labels for targeted job deployment
- ✅ Deployed a real-world data processing pipeline (syslog processing)
- ✅ Monitored node health and connection quality
- ✅ Tested network partition scenarios and verified autonomous operation
- ✅ Confirmed local buffering during outages and synchronization on recovery
Key Concepts
Bootstrap Tokens: Short-lived, single-use credentials for initial node registration. After registration, nodes receive long-term credentials (API keys or mTLS certificates) that don't expire.
Node Labels: Key-value pairs attached to nodes (like region=us-west, environment=production) used by job selectors to control where jobs run. Labels are metadata, not security boundaries.
Heartbeats: Periodic health reports from edge nodes to the orchestrator (every 30 seconds by default). The orchestrator uses heartbeats to track connectivity, session continuity, and job health.
Session Continuity: Edge nodes maintain logical sessions that persist across brief network interruptions. New sessions start after extended outages or process restarts, but node identity remains constant.
Local Buffering: When network connectivity is lost, edge nodes buffer processed data locally (up to max_buffer_size). When connectivity returns, buffered data automatically syncs to remote destinations.
Autonomous Operation: Edge nodes continue processing data during network outages without orchestrator connectivity. The orchestrator tracks desired state but doesn't need to be reachable for pipelines to function.
Troubleshooting
Edge Node Can't Connect to Orchestrator
Symptom: Edge node logs show connection errors or timeouts.
Diagnosis:
# On edge node, test connectivity
nc -zv orchestrator.example.com 9090
# Check DNS resolution
nslookup orchestrator.example.com
# Verify routing
traceroute orchestrator.example.com
Common Causes:
- Firewall blocking traffic: Verify firewall rules allow outbound TCP on port 9090
- Security groups (cloud): Check cloud security group rules
- DNS issues: Verify hostname resolves correctly
- NAT/routing: Ensure network route exists between edge and orchestrator
Solution:
# On orchestrator, verify listening
sudo netstat -tulpn | grep 9090
# Check firewall (Ubuntu/Debian)
sudo ufw status
sudo ufw allow 9090/tcp
# Check firewall (RHEL/CentOS)
sudo firewall-cmd --list-all
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --reload
Bootstrap Token Authentication Failed
Symptom: Node logs show "invalid bootstrap token" or "token expired".
Diagnosis:
# Check token expiration
expanso token list
# Verify token hasn't been used
expanso token describe <token-id>
Common Causes:
- Token expired: Bootstrap tokens have short lifespans (1 hour default)
- Token already used: Each token can only register one node
- Token typo: Copy-paste errors in configuration file
Solution:
# Generate new token
expanso token create --type=bootstrap --expires-in=1h
# Update edge configuration with new token
nano /etc/expanso/edge-config.yaml
# Restart edge node
sudo systemctl restart expanso-edge
Node Shows "Offline" Despite Running
Symptom: Edge node process is running but orchestrator shows status as "offline".
Diagnosis:
# Check edge node logs
sudo journalctl -u expanso-edge -f
# Verify heartbeat messages
expanso node events <node-id> --follow
# Check network connectivity
nc -zv orchestrator.example.com 9090
Common Causes:
- Heartbeat timeout: Network latency causing heartbeats to arrive late
- Clock skew: System clocks out of sync between edge and orchestrator
- gRPC connection issues: TLS handshake failures
Solution:
# Sync system clock (edge node)
sudo ntpdate -s time.nist.gov
# or
sudo timedatectl set-ntp true
# Check TLS certificate validity (if using custom CA)
openssl s_client -connect orchestrator.example.com:9090
# Increase heartbeat interval in config
# /etc/expanso/edge-config.yaml:
# local:
#   heartbeat_interval: 60s # Increase from 30s
sudo systemctl restart expanso-edge
Job Deployed But Not Running on Edge Node
Symptom: Job shows as deployed but execution status is "pending" or "failed".
Diagnosis:
# Check job executions
expanso job executions <job-name>
# Get detailed execution logs
expanso execution logs <execution-id>
# Verify node capabilities
expanso node describe <node-id>
Common Causes:
- Label mismatch: Job selector doesn't match node labels
- Resource constraints: Node doesn't have required CPU/memory
- Missing dependencies: Job requires unsupported inputs/outputs
- Configuration errors: Invalid pipeline configuration
Solution:
# Verify label matching
expanso node list --show-labels
expanso job describe <job-name> --show-selector
# Check execution details for errors
expanso execution describe <execution-id>
# View agent logs on edge node
sudo journalctl -u expanso-edge -f | grep <job-name>
# Validate job configuration locally
expanso job validate <job-file.yaml>
Buffered Data Not Flushing After Reconnection
Symptom: Buffer files remain on disk after connectivity returns.
Diagnosis:
# Check buffer directory (on edge node)
ls -lh /var/lib/expanso/buffer/
# Verify output destination is reachable
curl -I https://logs.example.com/ingest
# Check pipeline logs
sudo journalctl -u expanso-edge -f | grep fallback
Common Causes:
- Destination unreachable: Remote endpoint still down
- Authentication issues: Credentials expired or invalid
- Rate limiting: Remote service throttling requests
- Disk full: No space for temporary files during flush
Solution:
# Test destination manually
curl -X POST https://logs.example.com/ingest \
-H "Authorization: Bearer $TOKEN" \
-d '{"test": "message"}'
# Check disk space
df -h /var/lib/expanso
# Manually trigger flush (restart edge service)
sudo systemctl restart expanso-edge
# Increase flush rate in config if destination can handle it
# config.output.http_client.max_in_flight: 50
Need More Help?
If you're still experiencing issues:
- Check the logs: Detailed logs are your best diagnostic tool
  # Orchestrator logs
  expanso orchestrator logs --tail=100
  # Edge node logs
  sudo journalctl -u expanso-edge -n 100
- Enable debug logging: Temporarily increase verbosity
  # /etc/expanso/edge-config.yaml
  logging:
    level: debug
- Community support:
  - Search GitHub Issues
  - Ask on Discord
- File a bug report:
  - Include logs from both orchestrator and edge node
  - Describe steps to reproduce
  - Share configuration files (redact secrets)