Expanso Documentation

Expanso enables you to deploy and orchestrate data pipelines from edge to destination through a unified control plane. Process, filter, and transform data - and run AI models - before it reaches your backend platforms.

Deploy intelligent data pipelines anywhere

Process data and run inference where it's generated. Cut costs by filtering at the source. Deploy in minutes with 200+ pre-built components.

Cost Reduction

Filter and sample at source - only send valuable data downstream

Edge AI & ML

Run inference models directly within your data pipelines

Built-In Governance

Mask PII and enforce policies before data leaves the edge

Platform Agnostic

Works with Snowflake, Datadog, Splunk, S3 - whatever you use

What is Expanso?

Expanso is a managed platform for deploying intelligent data pipelines at the edge:

  • Edge-Native Architecture: Process data where it's generated - reduce bandwidth and latency.
  • AI/ML at the Source: Run inference models directly inside your pipelines, where the data is produced.
  • 200+ Components: Pre-built inputs, processors, and outputs for any pipeline need.
  • Visual Pipeline Builder: Drag-and-drop interface (YAML also available).
  • Managed SaaS: Central control plane with automatic agent updates.
  • Enterprise Governance: PII masking, policy enforcement, and compliance audit trails.

Quick Start

Get your first pipeline running in 15 minutes:

  1. Sign up for Expanso Cloud (free tier available)
  2. Install an agent on your infrastructure
  3. Build a pipeline using our visual builder or YAML
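
Once an agent is running, a pipeline is just a small YAML document. As an illustration only, a minimal "hello world" definition might look like the sketch below; the `generate` input and `stdout` output are assumptions based on the Benthos-style syntax used in the example pipeline further down this page:

```yaml
# Hypothetical minimal pipeline: emit one test message, stamp it, print it.
input:
  generate:
    count: 1
    mapping: root = {"message": "hello from the edge"}

pipeline:
  processors:
    # Add a timestamp to the event
    - mapping: root.received_at = now()

output:
  stdout: {}
```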

Core Features

Visual Pipeline Builder

Create sophisticated pipelines with drag-and-drop or YAML. Preview data transformations in real time as you build.

Edge AI & ML

Execute machine learning models directly on streaming data as a native pipeline step.

  • Low-Latency Inference: Get predictions in milliseconds without a round-trip to the cloud.
  • Enrich Data: Add model outputs (like a risk score) to your events before routing.
  • Flexible Integration: Integrate with your existing ML workflows and model formats.
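
In practice, an inference step sits between mappings like any other processor, attaching a score to each event before routing. This sketch reuses the `model`/`onnx` processor shape from the example pipeline later on this page; the model path and field names are illustrative:

```yaml
pipeline:
  processors:
    # Score each event in-stream (model path and columns are illustrative)
    - model:
        onnx:
          model_path: /models/anomaly_detector.onnx
          input_columns: ["latency_ms", "error_rate"]
          output_columns: ["anomaly_score"]
    # Enrich the event with the model's output before routing
    - mapping: root.is_anomaly = this.anomaly_score > 0.8
```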

Powerful Transformations

Leverage 200+ built-in components for ingesting, transforming, and routing data:

  • Parse logs (JSON, syslog, CSV, regex)
  • Filter and sample to reduce volume
  • Mask PII automatically
  • Aggregate metrics in time windows
  • Enrich with lookups, APIs, or ML models
  • Route to multiple destinations
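
Several of these steps can be combined in a single Bloblang mapping. The following sketch parses, filters, and masks in one processor; the field names are illustrative, and `redact_emails()` is taken from the example pipeline later on this page:

```yaml
pipeline:
  processors:
    - mapping: |
        # Parse the raw payload as JSON
        root = content().parse_json()

        # Drop debug-level events to reduce downstream volume
        root = if this.level == "debug" { deleted() }

        # Mask PII before the event leaves the edge
        root.user.email = this.user.email.redact_emails()
```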

Enterprise Ready

  • Governance: PII detection, masking, and policy enforcement
  • Monitoring: Built-in metrics and health checks
  • Security: RBAC, SSO, audit trails
  • High Availability: Automatic failover and recovery

Example Pipeline

Here's a pipeline that consumes transaction events from Kafka, scores them with a fraud-detection model, masks sensitive data, and routes them to multiple destinations:

```yaml
input:
  kafka:
    brokers: [kafka-broker:9092]
    topics: [transactions]

pipeline:
  processors:
    - mapping: |
        # Parse the raw message as JSON
        root = content().parse_json()

    - model:
        onnx:
          model_path: /models/fraud_detector.onnx
          input_columns: ["transaction_amount", "user_location_risk"]
          output_columns: ["is_fraud_probability"]

    - mapping: |
        # Add the risk score to the event
        root.fraud_score = this.is_fraud_probability

        # Mask sensitive user info
        root.user.email = this.user.email.redact_emails()
        root.user.ip = "REDACTED"

output:
  broker:
    pattern: fan_out
    outputs:
      # Send high-risk events to a security queue
      - switch:
          - check: this.fraud_score > 0.9
            output:
              kafka:
                brokers: [kafka-broker:9092]
                topic: high-risk-transactions

      # Send everything to S3 for archiving
      - aws_s3:
          bucket: transactions-archive
          path: "${! timestamp_unix() }-${! uuid_v4() }.json"
```

See more examples →

Documentation Guide

We've organized our documentation to help you at different stages of your journey:

🎓 Getting Started - Learn the Basics

New to Expanso? Start here! These hands-on tutorials will get you up and running quickly.

🔧 How-To Guides - Solve Specific Problems

Already familiar with the basics? Find step-by-step solutions to common problems.

📋 Component Reference - Browse Available Components

Need to look up a specific input, processor, or output? Browse our complete component catalog.

Frequently Asked Questions

What is Expanso and how does it work?

Expanso is a managed platform for deploying intelligent data pipelines at the edge. It processes data where it's generated - reducing bandwidth, latency, and costs. You deploy lightweight agents on your infrastructure, build pipelines using our visual builder or YAML, and control everything from a central SaaS platform.

Can I run AI/ML models directly in my data pipelines?

Yes! Expanso supports running ONNX, TensorFlow Lite, and other models as native pipeline steps. Execute low-latency inference on streaming data, enrich events with model outputs (like risk scores), and make decisions at the edge without cloud round-trips.

How many pre-built components are available?

Expanso provides 200+ pre-built components including inputs (Kafka, HTTP, files), processors (transformations, filtering, PII masking, aggregations), and outputs (S3, Snowflake, Datadog, Splunk). Browse the complete catalog in our Component Reference.

Do I need to write code to build pipelines?

No - use our drag-and-drop visual pipeline builder to create sophisticated pipelines without code. For advanced use cases, you can also write pipelines in YAML or use the Bloblang transformation language for complex data mappings.
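
For readers curious what Bloblang looks like, a mapping to rename and derive fields takes only a few lines. The field names below are illustrative:

```yaml
pipeline:
  processors:
    - mapping: |
        # Rename and derive fields (names are illustrative)
        root.order_id = this.id
        root.total_cents = (this.total * 100).floor()
        root.is_priority = this.tier == "gold"
```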

How does Expanso help with data governance and compliance?

Expanso includes built-in governance features: automatic PII detection and masking, policy enforcement at the edge, RBAC, SSO integration, and comprehensive audit trails. Mask sensitive data before it ever leaves your network.

Community & Support


Ready to get started? Jump to the Quick Start guide →