Lineage Configuration
The lineage block on an edge configuration enables native OpenLineage event emission. When enabled, the edge agent emits a signed OpenLineage RunEvent at every pipeline lifecycle transition (START, COMPLETE, FAIL, ABORT) to a configurable backend.
When lineage.enabled is false (or the block is omitted), the edge starts no goroutines, opens no connections, and incurs zero overhead.
For a walkthrough of enabling lineage end-to-end, see the lineage how-to guide. For the event format and signature contract, see Verify Lineage Events.
Minimal configuration
The smallest config that emits events to a Marquez instance on the same host:
lineage:
  enabled: true
  transport: http
  http:
    endpoint: http://localhost:5000/api/v1/lineage
Complete configuration
lineage:
  enabled: true
  transport: http            # "http" or "file"
  queue_size: 1024           # bounded buffer between emit and worker
  drain_timeout: 5s          # max wait at edge shutdown
  http:
    endpoint: https://lineage.example.com/api/v1/lineage
    timeout: 3s
    auth:
      type: bearer
      token_env: OPENLINEAGE_TOKEN
  file:
    path: /var/lib/expanso/lineage/events.jsonl
    rotation_size_mb: 64
Field reference
Top level
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable lineage emission. When false, no events are emitted and no transport is constructed. |
| `transport` | string | — | Required when enabled. One of `http` or `file`. |
| `queue_size` | int | `1024` | Bounded buffer between `Emit()` and the delivery worker. Events drop on overflow; `lineage_events_dropped_total` increments. Must be ≥ 0. |
| `drain_timeout` | duration | `5s` | Maximum time the edge waits for the worker to flush in-flight events at shutdown. Go duration string. |
| `http` | object | — | HTTP transport configuration. Used when `transport: http`. |
| `file` | object | — | File transport configuration. Used when `transport: file`. |
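
Duration fields such as `drain_timeout` use Go duration strings (`5s`, `500ms`, `1m30s`). For tooling that reads the same config outside the edge, the following is a rough sketch of a parser for a subset of that syntax. `parse_go_duration` is a hypothetical helper, not part of the edge; it assumes unsigned values with an explicit unit and does not cover everything Go's own parser accepts (for example signs, a bare `0`, or the `µs` spelling).

```python
import re

# Seconds per unit for the subset of Go duration units handled here.
_UNIT_SECONDS = {
    "ns": 1e-9,
    "us": 1e-6,
    "ms": 1e-3,
    "s": 1.0,
    "m": 60.0,
    "h": 3600.0,
}

# One "<number><unit>" token; two-letter units are listed before "s" and "m"
# so that "ms" is not read as "m" followed by a stray "s".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)")


def parse_go_duration(text: str) -> float:
    """Return the duration in seconds, or raise ValueError if it does not parse."""
    if not text:
        raise ValueError("empty duration")
    total = 0.0
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration {text!r} at offset {pos}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    return total


print(parse_go_duration("5s"))     # 5.0
print(parse_go_duration("1m30s"))  # 90.0
```

A value without a unit, such as `5`, is rejected by this sketch, which matches the intent of requiring a duration string rather than a bare number.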
http
| Field | Type | Default | Description |
|---|---|---|---|
| `endpoint` | string | — | Required. The OpenLineage-compatible URL to POST events to, typically `<marquez>/api/v1/lineage`. |
| `timeout` | duration | `3s` | Per-request HTTP timeout covering DNS, TCP, TLS, request body, and full response read. |
| `auth` | object | — | Optional. Authentication configuration. When omitted, no Authorization header is sent. |
http.auth
| Field | Type | Default | Description |
|---|---|---|---|
| `type` | string | — | Required when `auth` is set. Only `bearer` is supported. |
| `token_env` | string | — | Required when `type` is set. Name of the environment variable holding the bearer token. Read once at edge startup. |
file
| Field | Type | Default | Description |
|---|---|---|---|
| `path` | string | — | Required. Absolute path to the active events file. Each line is one JSON event. |
| `rotation_size_mb` | int | `64` | Rotate the file once it exceeds this size in MB. `0` disables rotation. Must be ≥ 0. |
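
Because the file transport writes one JSON event per line, the active file can be consumed with any JSON Lines reader. The sketch below is one way to do that in Python; `read_lineage_events` is a hypothetical helper, and it only reads the active file at the given path, since the naming of rotated files is not described here.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterator


def read_lineage_events(path: str) -> Iterator[dict]:
    """Yield one decoded event per non-empty line of a JSON Lines file."""
    with Path(path).open("r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: not valid JSON: {exc}") from exc


def count_by_event_type(path: str) -> Counter:
    """Tally events by their OpenLineage eventType (START, COMPLETE, FAIL, ABORT)."""
    return Counter(event.get("eventType") for event in read_lineage_events(path))
```

Calling `count_by_event_type("/var/lib/expanso/lineage/events.jsonl")` against the path from the complete configuration would give a quick per-transition tally of what has been written so far.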
Validation rules
The edge validates lineage config at startup. If enabled is false, all other fields are accepted as-is (validation is skipped entirely).
When enabled is true:
- `transport` must be exactly `http` or `file`.
- `queue_size` must be ≥ 0.
- If `transport: http`:
  - `http.endpoint` must be non-empty.
  - If `http.auth.type` is set, it must be `bearer`.
  - If `http.auth.type` is set, `http.endpoint` must start with `https://`. The edge rejects plain `http://` endpoints when auth is configured, to prevent sending a bearer token in cleartext.
  - If `http.auth.type` is set, `http.auth.token_env` must be non-empty.
- If `transport: file`:
  - `file.path` must be non-empty and absolute.
  - `file.rotation_size_mb` must be ≥ 0.
Invalid configuration causes startup failure with a message identifying the offending field.
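
The rules in this section can be mirrored in a pre-flight check that runs before a config is deployed. The sketch below is an approximation written from the rules as documented, not the edge's own validator. `validate_lineage` is a hypothetical function that takes the contents of the `lineage` block as an already-parsed dict (YAML parsing is left out) and assumes POSIX-style paths. Unlike the edge, which fails at startup, it collects every problem it finds.

```python
from pathlib import PurePosixPath


def validate_lineage(cfg: dict) -> list:
    """Return a list of problems; an empty list means the config passes."""
    if not cfg.get("enabled", False):
        return []  # validation is skipped entirely when lineage is disabled

    errors = []
    transport = cfg.get("transport")
    if transport not in ("http", "file"):
        errors.append("transport: must be exactly 'http' or 'file'")
    if cfg.get("queue_size", 1024) < 0:
        errors.append("queue_size: must be >= 0")

    if transport == "http":
        http = cfg.get("http") or {}
        endpoint = http.get("endpoint") or ""
        if not endpoint:
            errors.append("http.endpoint: must be non-empty")
        auth = http.get("auth") or {}
        if auth.get("type"):
            if auth["type"] != "bearer":
                errors.append("http.auth.type: only 'bearer' is supported")
            if not endpoint.startswith("https://"):
                errors.append("http.endpoint: must start with https:// when auth is set")
            if not auth.get("token_env"):
                errors.append("http.auth.token_env: must be non-empty")
    elif transport == "file":
        file_cfg = cfg.get("file") or {}
        path = file_cfg.get("path") or ""
        if not path or not PurePosixPath(path).is_absolute():
            errors.append("file.path: must be non-empty and absolute")
        if file_cfg.get("rotation_size_mb", 64) < 0:
            errors.append("file.rotation_size_mb: must be >= 0")

    return errors
```

For example, a config that sets `http.auth.type: bearer` with an `http://` endpoint produces a single `http.endpoint` error here, while the minimal configuration (no auth, `http://localhost`) passes.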
Authentication
The HTTP transport reads the bearer token once at edge startup, from the environment variable named in http.auth.token_env. Rotating the token requires restarting the edge.
For tokens stored in Vault, AWS Secrets Manager, or GCP Secret Manager, resolve the secret out-of-band before edge startup and export it under the configured env var name. With systemd, the standard pattern is an EnvironmentFile= directive populated by a fetch script run before the edge service unit starts.
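
As a rough illustration of the fetch-script half of that pattern, the sketch below writes a token into a file suitable for an `EnvironmentFile=` directive. `write_env_file` is a hypothetical helper; how the token is retrieved from the secret store is deliberately left to the caller, and the function assumes a POSIX system and a token with no whitespace, quotes, or backslashes, so that no quoting is needed in the file.

```python
import os
from pathlib import Path


def write_env_file(path: str, var_name: str, token: str) -> None:
    """Write a single VAR=token line to path, readable only by the owner."""
    if not token or any(ch.isspace() or ch in "\"'\\" for ch in token):
        raise ValueError("token must be non-empty with no whitespace, quotes, or backslashes")

    target = Path(path)
    tmp = target.with_name(target.name + ".tmp")

    # Create the temporary file with mode 0600 so the secret is never
    # written to a group- or world-readable file.
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(f"{var_name}={token}\n")
    os.chmod(tmp, 0o600)  # enforce the mode even if a stale temp file existed

    # Swap into place so a reader never sees a half-written file.
    os.replace(tmp, target)
```

A fetch script would call this with the name configured in `http.auth.token_env`, for example `write_env_file("/run/expanso/lineage.env", "OPENLINEAGE_TOKEN", token)`, where the `/run/expanso/lineage.env` path is only an example and `token` comes from whichever secret store is in use.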
Defaults summary
The edge supplies a default for every field that has a value in the Default column of the field reference tables above. The most important ones:
- `queue_size: 1024` — sized for typical lifecycle event throughput; raise it if pipelines transition faster than the worker can drain.
- `drain_timeout: 5s` — covers most graceful shutdowns; raise it for slow backends.
- `http.timeout: 3s` — short, because the pipeline does not wait on lineage emission; failed events drop and increment the counter rather than retrying.
- `file.rotation_size_mb: 64` — keeps single files small enough to ship through standard log-rotation tooling. Set to `0` to disable rotation entirely.
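
To make the `queue_size` trade-off concrete, here is a toy model of a bounded buffer that drops instead of blocking. It is only a sketch of the documented behaviour, not the edge's implementation: `BoundedEventQueue` and its methods are hypothetical names, `dropped_total` stands in for `lineage_events_dropped_total`, and the model requires a size of at least 1 because the meaning of `queue_size: 0` is not described here.

```python
from collections import deque


class BoundedEventQueue:
    """Toy bounded buffer: emit never blocks, overflow is dropped and counted."""

    def __init__(self, queue_size: int) -> None:
        if queue_size < 1:
            raise ValueError("this sketch only models queue_size >= 1")
        self.queue_size = queue_size
        self.pending = deque()
        self.dropped_total = 0  # stand-in for lineage_events_dropped_total

    def emit(self, event: dict) -> bool:
        """Enqueue without blocking; return False if the event was dropped."""
        if len(self.pending) >= self.queue_size:
            self.dropped_total += 1
            return False
        self.pending.append(event)
        return True

    def drain(self, max_events: int) -> list:
        """Simulate the worker delivering up to max_events queued events."""
        delivered = []
        while self.pending and len(delivered) < max_events:
            delivered.append(self.pending.popleft())
        return delivered


# Three transitions arrive before the worker drains a queue of size 2.
q = BoundedEventQueue(queue_size=2)
accepted = [q.emit({"eventType": t}) for t in ("START", "COMPLETE", "FAIL")]
print(accepted, q.dropped_total)  # [True, True, False] 1
```

In this model the third event is lost because the worker has not drained anything yet; a larger `queue_size` absorbs longer bursts at the cost of more buffered events in memory.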
Related
- Lineage how-to guide — end-to-end walkthrough including Marquez setup.
- Verify Lineage Events — signature verification recipe.
- Metadata processor — attach OpenLineage-aligned identity fields to message bodies.
- Data lineage use case — audit, compliance, and governance scenarios.