Slow Response
Incident Report for Buildkite
Postmortem

This was one of five incidents with a common root cause. The postmortem is available here.

Posted Jun 23, 2022 - 07:32 UTC

Resolved
We have been advised by AWS that the root cause has been fixed and performance has returned to normal. This incident is now resolved.
Posted Apr 22, 2022 - 04:52 UTC
Monitoring
Performance is returning to normal and builds are running again. We are confirming that the root cause has been fixed.
Posted Apr 22, 2022 - 04:19 UTC
Update
We are in constant communication with AWS. The internal service team has identified EBS storage as the source of the issue and is working to resolve it as fast as possible.
Posted Apr 22, 2022 - 03:38 UTC
Update
We are continuing to work with the vendor to mitigate the issue. We are also continuing to investigate other mitigation options.
Posted Apr 22, 2022 - 03:01 UTC
Update
We are continuing to work with the vendor to mitigate the issue. We are also continuing to investigate other mitigation options.
Posted Apr 22, 2022 - 02:24 UTC
Update
We are continuing to work with the vendor, and our case has been escalated to the team responsible. We are also pursuing other options to mitigate the problem.
Posted Apr 22, 2022 - 01:53 UTC
Update
We are continuing to investigate the underlying storage performance with the vendor. We are observing that job dispatch is succeeding but is delayed.
Posted Apr 22, 2022 - 01:23 UTC
Identified
We have traced the problem to the performance of the underlying storage. We are confirming this with the vendor and working on mitigations.
Posted Apr 22, 2022 - 01:03 UTC
Investigating
We are currently investigating this issue.
Posted Apr 22, 2022 - 00:48 UTC
This incident affected: Web, Agent API, REST API, and Job Queue.