api.ListJobExecutionsResponse
items object[]
The actual items
EvalID is the ID of the evaluation that generated this execution
ID is the unique identifier of the execution (UUID)
JobID is the ID of the job that this execution is for
job_spec object
JobSpec is the parent job of the task being allocated. This is copied at execution time to avoid issues if the job definition is updated. TODO: Consider removing this field to reduce storage size, and populate it on-demand from the job store.
config object
Config contains type-specific configuration for the workload. The structure depends on the job type (e.g., pipeline config, query parameters).
Description is an optional human-readable description of the job.
labels object
Labels is used to associate arbitrary labels with this job. Labels can be used for filtering and selection.
meta object
Meta is used to associate arbitrary metadata with this job. Keys with the prefix "expanso.io/" are reserved for system use.
Name is the logical name of the job used to refer to it. Submitting a job with the same name as an existing job will result in an update to the existing job.
Namespace is the namespace this job is running in.
Priority defines the scheduling priority of this job. Higher values indicate higher priority.
RestartPolicy controls restart behavior when executions exit.
- "on-failure" (default): Restart on non-zero exit, complete on success
- "always": Restart on any exit (current daemon behavior)
- "never": No restart, one-shot execution
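The three policies above can be read as a decision on the exit code. The following is a minimal sketch, not the scheduler's actual implementation; the function name `should_restart` is hypothetical, and treating an empty policy string as the default is an assumption.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Decide whether an exited execution should be restarted.

    Assumption: an empty policy string falls back to the documented
    default, "on-failure".
    """
    if policy == "always":
        return True                # restart on any exit
    if policy == "never":
        return False               # one-shot execution
    if policy in ("on-failure", ""):
        return exit_code != 0      # restart only on non-zero exit
    raise ValueError(f"unknown restart policy: {policy!r}")
```

For example, `should_restart("on-failure", 0)` returns `False` because a zero exit code completes the execution.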
rollout object
Rollout defines how to roll out the job
Auto-promote canary rollouts
Canary-specific settings
Percentage of canary nodes
health_check object
HealthCheck defines health check configuration (required for rolling/canary, ignored for immediate)
Deadline is the maximum time to wait for an execution to become healthy (required)
Possible values: [-9223372036854776000, 9223372036854776000, 1, 1000, 1000000, 1000000000, 60000000000, 3600000000000, 10000000000]
FailureThreshold is the number of consecutive unhealthy intervals before the execution is considered unhealthy (optional, default: 3)
Interval is the duration of each health evaluation window (optional, default: 10s). The error rate is calculated per interval, not over the execution's lifetime.
Possible values: [-9223372036854776000, 9223372036854776000, 1, 1000, 1000000, 1000000000, 60000000000, 3600000000000, 10000000000]
MaxErrorRate is the maximum error rate allowed during health checks (optional, default: 0.10). The field is a pointer so that nil (use the default) can be distinguished from an explicit 0.0.
SuccessThreshold is the number of consecutive healthy intervals before the execution is considered healthy (optional, default: 2)
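One way to read how MaxErrorRate, SuccessThreshold, and FailureThreshold interact is as a pair of consecutive-interval counters. The sketch below is illustrative only: the function name `evaluate_health` and the return strings are hypothetical, Deadline handling is left out, and it assumes an interval whose error rate equals MaxErrorRate still counts as healthy.

```python
from typing import Iterable, Optional


def evaluate_health(
    interval_error_rates: Iterable[float],
    max_error_rate: Optional[float] = None,
    success_threshold: int = 2,
    failure_threshold: int = 3,
) -> str:
    """Classify an execution from its per-interval error rates.

    None means "use the documented default of 0.10", which keeps it
    distinct from an explicit 0.0.
    """
    limit = 0.10 if max_error_rate is None else max_error_rate
    healthy_streak = 0
    unhealthy_streak = 0
    for rate in interval_error_rates:
        if rate <= limit:  # assumption: a rate equal to the limit is allowed
            healthy_streak += 1
            unhealthy_streak = 0
        else:
            unhealthy_streak += 1
            healthy_streak = 0
        if healthy_streak >= success_threshold:
            return "healthy"
        if unhealthy_streak >= failure_threshold:
            return "unhealthy"
    return "undecided"  # neither threshold reached yet
```

Because each interval is judged separately, one bad interval resets the healthy streak but does not by itself mark the execution unhealthy.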
MaxFailedNodes is the maximum number of failed nodes before stopping (optional, default: 10)
MaxFailedNodesPercent is the maximum percentage of failed nodes before stopping (optional, default: 10.0)
MaxParallel is the maximum percentage of nodes to update in parallel (0-100).
- For the immediate strategy, this value is ignored (all nodes are updated simultaneously).
- For rolling/canary, it controls the wave size as a percentage of total nodes (default: 10 if not specified).
Examples: 10 = 10% of nodes per wave, 50 = 50% of nodes per wave, 100 = all nodes at once.
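To make the percentage-to-wave arithmetic concrete, here is a small sketch for the rolling/canary case. The function `wave_sizes` is hypothetical, and two details are assumptions the description does not state: fractional wave sizes are rounded up (with a minimum of one node per wave), and a value of 0 or None is treated as "not specified".

```python
from typing import List, Optional


def wave_sizes(total_nodes: int, max_parallel: Optional[int] = None) -> List[int]:
    """Split total_nodes into waves of max_parallel percent each."""
    percent = max_parallel or 10  # assumption: 0/None means "use default 10"
    if not 1 <= percent <= 100:
        raise ValueError("max_parallel must be within 0-100")
    # Integer ceiling of total_nodes * percent / 100, at least one node.
    per_wave = max(1, (total_nodes * percent + 99) // 100)
    sizes = []
    remaining = total_nodes
    while remaining > 0:
        size = min(per_wave, remaining)
        sizes.append(size)
        remaining -= size
    return sizes
```

With 20 nodes and the default of 10, this yields ten waves of two nodes; with `max_parallel=100` it yields a single wave containing every node.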
NoAutoRollback disables automatic rollback on rollout failure (default: false = auto-rollback enabled)
Strategy is the rollout strategy: immediate, rolling, or canary.
Possible values: [immediate, rolling, canary]
selector object
Selector defines which nodes to run the job on
MatchExpressions selects nodes using label selector expression strings. Each expression is evaluated independently and all must match (AND logic). Supported syntax:
- Equality: "key=value" or "key==value"
- Inequality: "key!=value"
- Set inclusion: "key in (value1,value2,...)"
- Set exclusion: "key notin (value1,value2,...)"
- Existence: "key"
- Non-existence: "!key"
Examples:
- "region=us-east"
- "tier in (premium,standard)"
- "environment!=prod"
- "gpu"
- "!debug"
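The expression syntax above can be evaluated against a node's labels roughly as follows. This is a sketch rather than the service's parser: `matches` and `matches_all` are hypothetical names, and it assumes that "!=" and "notin" also match nodes that lack the key entirely (the convention Kubernetes label selectors use), which the description does not confirm.

```python
import re
from typing import Iterable, Mapping

# Matches "key in (a,b)" and "key notin (a,b)".
_SET_RE = re.compile(r"^(\S+)\s+(in|notin)\s*\(([^)]*)\)$")


def matches(expression: str, labels: Mapping[str, str]) -> bool:
    """Evaluate one selector expression against a node's labels."""
    expr = expression.strip()
    m = _SET_RE.match(expr)
    if m:
        key, op, raw = m.groups()
        values = {v.strip() for v in raw.split(",")}
        found = labels.get(key) in values
        return found if op == "in" else not found
    # Check "!=" before "==" and "=" because it also contains "=".
    if "!=" in expr:
        key, value = expr.split("!=", 1)
        return labels.get(key.strip()) != value.strip()
    if "==" in expr:
        key, value = expr.split("==", 1)
        return labels.get(key.strip()) == value.strip()
    if "=" in expr:
        key, value = expr.split("=", 1)
        return labels.get(key.strip()) == value.strip()
    if expr.startswith("!"):
        return expr[1:].strip() not in labels  # non-existence
    return expr in labels  # existence


def matches_all(expressions: Iterable[str], labels: Mapping[str, str]) -> bool:
    """All expressions must match (AND logic)."""
    return all(matches(e, labels) for e in expressions)
```

A node labeled `{"region": "us-east", "gpu": "true"}` satisfies `["region=us-east", "gpu", "!debug"]` but fails as soon as any single expression does not match.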
MatchIDs selects specific nodes by their IDs. If specified, the job will only run on nodes whose ID is in this list.
match_labels object
MatchLabels selects nodes with labels that exactly match all specified key-value pairs. All labels must match (AND logic). Example: {"region": "us-east", "tier": "compute"}
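Exact-match semantics for MatchLabels amount to a subset check on key-value pairs. The helper below is a hypothetical illustration; how MatchLabels combines with MatchIDs and MatchExpressions is not covered here because the description does not specify it.

```python
from typing import Mapping


def matches_labels(match_labels: Mapping[str, str],
                   node_labels: Mapping[str, str]) -> bool:
    """True when every requested key-value pair is present on the node."""
    return all(node_labels.get(k) == v for k, v in match_labels.items())
```

A node may carry extra labels beyond those requested; only the listed pairs have to match.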
timeouts object
Timeouts defines timeout configurations for the job
ExecutionTimeout is the maximum time, in seconds, that a task is allowed to run. Zero means no timeout, such as for a daemon task.
QueueTimeout is the maximum time, in seconds, that a task is allowed to wait in the orchestrator queue before being scheduled. Zero means no timeout.
TotalTimeout is the maximum time, in seconds, that a task is allowed to take to complete. This includes the time spent in the queue, the time spent executing, and the time spent retrying. Zero means no timeout.
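The "zero means no timeout" rule is easy to get wrong when comparing elapsed times, so a short sketch may help. The `Timeouts` dataclass, its snake_case field names, and the `first_exceeded` helper are all hypothetical; the order in which the three limits are checked is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Timeouts:
    # All values are in seconds; 0 disables the corresponding limit.
    execution_timeout: int = 0
    queue_timeout: int = 0
    total_timeout: int = 0


def exceeded(limit_s: int, elapsed_s: float) -> bool:
    """A limit of zero never expires."""
    return limit_s > 0 and elapsed_s > limit_s


def first_exceeded(t: Timeouts, queued_s: float, executing_s: float,
                   total_s: float) -> Optional[str]:
    """Return the name of the first limit that has been exceeded, if any."""
    if exceeded(t.queue_timeout, queued_s):
        return "queue"
    if exceeded(t.execution_timeout, executing_s):
        return "execution"
    if exceeded(t.total_timeout, total_s):
        return "total"
    return None
```

Here `total_s` is expected to cover queueing, execution, and retries together, matching the TotalTimeout description.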
Type specifies what kind of workload this job runs (e.g. "pipeline", "query", "update", "config"). The scheduling behavior is derived from this type.
JobType is the type of job this execution is for
JobVersion is the version of the job when this execution was created
Namespace is the namespace the execution is created in
NodeID is the node this execution is placed on
RolloutWave tracks which wave this execution was created in (1-indexed). 0 means no wave (immediate rollout or no rollout in progress).
status object
Status contains system-managed fields for the execution
CreatedAt is when the execution was created
desired_state object
DesiredState is what state the execution should be in
details object
Details is a map of additional details about the state.
Message is a human readable message describing the state.
StateType is the current state of the object.
Possible values: [pending, running, stopped]
details object
Details contains structured metadata about the current execution state (e.g., error codes, hints, component info from lib/errors)
NextExecutionID is used for tracking and triggering the next execution, such as during an update
observed_state object
ObservedState is the actual state of the execution on the node
details object
Details is a map of additional details about the state.
Message is a human readable message describing the state.
StateType is the current state of the object.
Possible values: [pending, starting, validating, running, degraded, completed, failed, lost, stopping, stopped]
PreviousExecutionID is used for tracking the previous execution in a sequence
Revision is incremented on any change to the execution (state changes, status updates, etc.)
StableAt is when the execution passed deployment validation and became stable. Once set, this execution will not trigger autonomous rollback if it fails later (preventing infinite rollback loops when a previously-validated execution is resurrected). Zero value means the execution hasn't been validated yet (still in validation window).
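The StableAt rule can be sketched as a guard on rollback decisions. Everything named here is hypothetical (`GO_ZERO_TIME`, `is_stable`, `may_trigger_rollback`), and two points are assumptions: an unset StableAt may arrive either as a missing value or as the year-1 zero timestamp, and only the "failed" observed state is treated as a failure.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumption: an unset timestamp is serialized as 0001-01-01T00:00:00Z.
GO_ZERO_TIME = datetime(1, 1, 1, tzinfo=timezone.utc)


def is_stable(stable_at: Optional[datetime]) -> bool:
    """An execution is stable once StableAt holds a non-zero time."""
    return stable_at is not None and stable_at != GO_ZERO_TIME


def may_trigger_rollback(observed_state: str,
                         stable_at: Optional[datetime]) -> bool:
    """Only a failure inside the validation window can trigger rollback."""
    return observed_state == "failed" and not is_stable(stable_at)
```

An execution that was validated earlier and fails after being resurrected therefore does not start another rollback.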
UpdatedAt is when the execution was last updated
Token for next page
{
  "items": [
    {
      "id": "exec-789xyz456abc",
      "job_id": "job-abc123xyz",
      "job_type": "pipeline",
      "job_version": 2,
      "namespace": "production",
      "node_id": "node-edge-us-west-01",
      "status": {
        "created_at": "2025-01-15T10:35:00Z",
        "desired_state": {
          "message": "",
          "state_type": "run"
        },
        "observed_state": {
          "message": "Pipeline processing logs",
          "state_type": "running"
        },
        "revision": 5,
        "updated_at": "2025-01-15T10:36:15Z"
      }
    }
  ],
  "next_token": "string"
}
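A client typically follows `next_token` until no further page is offered. The sketch below keeps the transport out of the picture: `fetch_page` is a hypothetical callable that takes the previous token (or None for the first page) and returns a decoded response shaped like the example above. Treating an empty or missing `next_token` as the last page is an assumption.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_executions(
    fetch_page: Callable[[Optional[str]], Dict],
) -> Iterator[Dict]:
    """Yield every execution across all pages of a list response."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        # "items" may be absent or null on an empty page.
        yield from page.get("items") or []
        token = page.get("next_token")
        if not token:  # assumption: empty/missing token ends pagination
            break
```

Keeping the HTTP call behind `fetch_page` means the request path and the name of the token parameter, neither of which appears in this section, stay outside the sketch.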