Elevated job queue latency and increased error rates

Incident Report for Buildkite

Resolved

System performance has returned to normal.
Posted Aug 03, 2023 - 02:31 UTC

Monitoring

Pausing the work had the desired effect and system performance has returned to normal. We're continuing to investigate why performance degraded in a workload that's been operating acceptably for some time.
Posted Aug 03, 2023 - 02:15 UTC

Identified

We have identified an asynchronous workload that was causing excessive database load on a key table, and we have paused that work.
Posted Aug 03, 2023 - 02:09 UTC

Investigating

We are currently investigating this issue.
Posted Aug 03, 2023 - 01:58 UTC
This incident affected: Agent API and Job Queue.