Restart Events Not Tracked

When an edge process restarts, executions that were running at the time skip the normal failure flow (Running → Failed → Degraded → retry) and return directly to Running.

Impact

Because restarts bypass the failure flow:

  • Failure metrics aren't incremented
  • Error details aren't recorded
  • Retry counts don't update
  • Monitoring dashboards miss these events

Workarounds

  1. Monitor edge process health: track edge uptime and restart events separately from job metrics
  2. Add custom execution logging: write progress checkpoints to durable storage so they survive restarts
  3. Track execution duration anomalies: an execution that runs much longer than usual may have been silently restarted
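
The second and third workarounds can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the product: the JSON-lines checkpoint file, the per-process `PROCESS_INSTANCE_ID`, the function names, and the 3x-median threshold are all hypothetical choices. The idea is that each checkpoint records which process instance wrote it, so an execution whose checkpoints span more than one instance was probably restarted mid-run.

```python
import json
import os
import statistics
import time
import uuid

# Hypothetical: one ID per process lifetime. A new value in the log
# for the same execution suggests the edge process restarted.
PROCESS_INSTANCE_ID = uuid.uuid4().hex


def record_checkpoint(log_path, execution_id, step, instance_id=PROCESS_INSTANCE_ID):
    """Append one checkpoint as a JSON line and force it to disk."""
    entry = {
        "execution_id": execution_id,
        "step": step,
        "instance_id": instance_id,
        "ts": time.time(),
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
        f.flush()
        os.fsync(f.fileno())  # make the entry durable before continuing


def find_restarted_executions(log_path):
    """Return IDs of executions whose checkpoints span several process instances."""
    instances = {}
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            entry = json.loads(line)
            instances.setdefault(entry["execution_id"], set()).add(entry["instance_id"])
    return sorted(eid for eid, ids in instances.items() if len(ids) > 1)


def is_duration_anomalous(duration_s, history_s, factor=3.0):
    """Flag a duration longer than `factor` times the median of past durations."""
    if not history_s:
        return False  # no baseline yet, so nothing to compare against
    return duration_s > factor * statistics.median(history_s)
```

In this sketch, the job code would call `record_checkpoint` at meaningful progress points, and a separate monitoring task would periodically run `find_restarted_executions` and `is_duration_anomalous` and raise its own alerts. The median baseline is only one possible heuristic; a suitable threshold depends on how variable the workload's run times are.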