Restart Events Not Tracked
When an edge process restarts, executions that were running at the time skip the normal failure flow (Running → Failed → Degraded → retry) and are placed directly back into Running, so the restart is never recorded as a failure.
Impact
Because restarts bypass the failure flow:
- Failure metrics aren't incremented
- Error details aren't recorded
- Retry counts don't update
- Monitoring dashboards miss these events
Workarounds
- Monitor edge process health — Track edge uptime and restart events separately from job metrics
- Add custom execution logging — Log progress checkpoints that survive restarts
- Track execution duration anomalies — Unusually long execution times may indicate silent restarts
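As a rough illustration of the first and third workarounds combined, the sketch below flags executions that may have been silently restarted. It is not part of any platform API: the `Execution` record, the `suspected_silent_restarts` function, the `duration_factor` threshold, and the list of edge restart timestamps are all hypothetical, and it assumes you already export execution start/finish times and edge restart times from your own monitoring.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable, List, Optional


@dataclass
class Execution:
    """Hypothetical record exported from your own job metrics."""
    execution_id: str
    started_at: float                    # Unix time the execution entered Running
    finished_at: Optional[float] = None  # None while the execution is still running


def suspected_silent_restarts(
    executions: List[Execution],
    edge_restart_times: Iterable[float],
    now: float,
    duration_factor: float = 3.0,        # assumed threshold; tune for your workload
) -> Dict[str, str]:
    """Return {execution_id: reason} for executions that look silently restarted."""
    restarts = list(edge_restart_times)

    # Baseline: median duration of executions that have finished.
    finished = [e.finished_at - e.started_at
                for e in executions if e.finished_at is not None]
    typical = median(finished) if finished else None

    flagged: Dict[str, str] = {}
    for e in executions:
        end = e.finished_at if e.finished_at is not None else now
        if any(e.started_at < t < end for t in restarts):
            # An edge restart happened while this execution was in flight.
            flagged[e.execution_id] = "edge restart during execution"
        elif typical is not None and (end - e.started_at) > duration_factor * typical:
            # No recorded restart, but the duration is far above the baseline.
            flagged[e.execution_id] = "unusually long duration"
    return flagged


if __name__ == "__main__":
    sample = [
        Execution("a", 0, 10),
        Execution("b", 0, 12),
        Execution("c", 0, 11),
        Execution("d", 5, 50),   # spans the edge restart at t=20
        Execution("e", 100),     # still running at t=200
    ]
    for execution_id, reason in suspected_silent_restarts(sample, [20], now=200).items():
        print(execution_id, reason)
```

Checking restart timestamps first gives a definite signal when edge health data is available; the duration check is only a heuristic fallback, so treat its results as candidates for investigation rather than confirmed restarts.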