Elevated error rates
Incident Report for Buildkite
Resolved
Latency has returned to normal levels and the issue is now resolved.
Posted Feb 28, 2023 - 22:54 UTC
Update
Latency is still returning to normal levels, and we continue to monitor the affected services.
Posted Feb 28, 2023 - 22:22 UTC
Monitoring
We are seeing latency return to normal levels and will continue to monitor the affected services.
Posted Feb 28, 2023 - 21:51 UTC
Update
Latency has improved but remains elevated. We are continuing to manage load while we investigate performance issues.
Posted Feb 28, 2023 - 21:35 UTC
Update
A customer was automatically generating a high volume of builds in error. We have cancelled those builds at that customer's request and have shipped a change to allow us to rate limit new builds on a per-customer basis, which we have enabled. This new rate limit is only applied to a single customer.

Job dispatch is still significantly delayed, and we are actively managing capacity to restore service to normal levels.
Posted Feb 28, 2023 - 21:10 UTC
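For readers curious what a per-customer rate limit on new builds might look like, the following is a minimal illustrative sketch using a token bucket per customer. It is not Buildkite's implementation, which this report does not describe. The names `TokenBucket`, `PerCustomerLimiter`, `allow_build`, and the customer ID `"noisy"` are hypothetical. Customers with no configured limit are never throttled, mirroring the statement above that the limit applies to a single customer.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/second up to `capacity`."""
    rate: float
    capacity: float
    tokens: float
    updated: float

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, never exceeding the burst capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerCustomerLimiter:
    """Hypothetical limiter: only customers listed in `limits` are throttled."""

    def __init__(
        self,
        limits: Dict[str, Tuple[float, float]],  # customer_id -> (rate, capacity)
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limits = limits
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow_build(self, customer_id: str) -> bool:
        limit = self._limits.get(customer_id)
        if limit is None:
            return True  # no limit configured for this customer
        now = self._clock()
        bucket = self._buckets.get(customer_id)
        if bucket is None:
            rate, capacity = limit
            bucket = TokenBucket(rate, capacity, tokens=capacity, updated=now)
            self._buckets[customer_id] = bucket
        # A False result is where a service would reject the request,
        # for example with an HTTP 429 response.
        return bucket.allow(now)
```

In this sketch, `PerCustomerLimiter({"noisy": (1.0, 2.0)})` would let the customer `"noisy"` burst two builds and then create roughly one build per second, while every other customer is unaffected.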
Update
A customer was automatically generating a high volume of builds in error. We have cancelled those builds at that customer's request and have shipped a change to allow us to rate limit new builds on a per-customer basis, which we have enabled. New build requests exceeding the rate limit will be served an HTTP 429 error.

Job dispatch is still significantly delayed, and we are actively managing capacity to restore service to normal levels.
Posted Feb 28, 2023 - 21:04 UTC
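For API clients that create builds automatically, the HTTP 429 response mentioned in the update above is a signal to slow down rather than retry immediately. The following is a minimal sketch of one way a client might handle it. It assumes the caller supplies a hypothetical `send` function returning a `(status_code, headers)` pair, and it assumes the server may include a numeric `Retry-After` header, which this report does not confirm. When the header is absent, the sketch falls back to capped exponential backoff.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status_code, headers).
Response = Tuple[int, Mapping[str, str]]


def retry_delay(
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Seconds to wait before retrying.

    Uses a numeric Retry-After header when present (assumed, not confirmed
    by the report); otherwise base * 2**attempt, capped at `cap`.
    The lookup is case-sensitive for plain dicts.
    """
    value = headers.get("Retry-After")
    if value is not None and value.strip().isdigit():
        return min(float(value), cap)
    return min(base * (2 ** attempt), cap)


def send_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` up to `max_attempts` times, waiting after each HTTP 429."""
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers = response
        if status != 429:
            return response
        sleep(retry_delay(headers, attempt))
        response = send()
    # Either a non-429 response or the last 429 after exhausting attempts.
    return response
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. A production client would likely also add jitter so that many throttled clients do not retry at the same moment.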
Update
We continue to investigate possible mitigations for the issue. We are implementing additional controls to reduce the amount of work in the system.
Posted Feb 28, 2023 - 19:55 UTC
Update
We continue to investigate possible mitigations for the issue. We are implementing additional controls to reduce the amount of work in the system.
Posted Feb 28, 2023 - 19:23 UTC
Update
We are activating additional controls to reduce the amount of work in the system.
Posted Feb 28, 2023 - 18:44 UTC
Identified
We have identified the cause of the high latency and are working to reduce the load.
Posted Feb 28, 2023 - 18:29 UTC
Update
We are continuing to investigate the issue and are observing an elevated error rate.
Posted Feb 28, 2023 - 18:09 UTC
Investigating
We are currently investigating this issue.
Posted Feb 28, 2023 - 17:54 UTC
This incident affected: Web, Agent API, and REST API.