Degraded performance and request timeouts

Incident Report for Buildkite

Resolved

We have resolved the issue in which bad query plans in our database caused inefficient queries, leading to increased latency and error rates. We are continuing to investigate the cause and the further mitigations needed to prevent the issue from recurring.
Posted May 14, 2025 - 00:04 UTC

Update

We are continuing to monitor for any further issues.
Posted May 14, 2025 - 00:04 UTC

Monitoring

We’re seeing improved response times and reduced error rates following the deployment of our change to improve query plan efficiency. We are continuing to monitor.
Posted May 13, 2025 - 23:53 UTC

Update

We are deploying a change to improve database performance and resolve the incorrect query plan on a single shard. The impact is limited to a subset of customers on the affected database. We will provide an update on our progress within the next 20 minutes.
Posted May 13, 2025 - 23:30 UTC

Identified

We've identified an incorrect database query plan that is affecting some customers. We're working to resolve it.
Posted May 13, 2025 - 22:43 UTC

Investigating

We're experiencing degraded performance and query timeouts for a subset of customers. We're currently investigating the cause.
Posted May 13, 2025 - 22:26 UTC
This incident affected: Web, Agent API, and REST API.