On 2025-08-27 from 04:28 UTC to 04:59 UTC, all customers were unable to create new builds.
During this period, only the "create build" functionality was affected. Running builds continued and completed without disruption. All other Buildkite features remained fully operational.
During a routine database schema migration, a required manual step to roll out the changes wasn't executed in the correct sequence. This caused new builds to fail due to a missing database field.
As a result, we observed a spike in 5XX responses from our application, and the number of created jobs dropped dramatically.
Our engineers identified the issue and resolved it by manually running the migration to create the missing field, then restarting the application servers. Rolling forward was found to be the faster resolution. The application recovered quickly, and new builds could be created again.
Most of the builds that were attempted during the outage were successfully recovered after the application restart.
We are implementing enhanced guardrails within our schema migration process to automate the required sequence of operations and prevent such process failures in the future.
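To illustrate the kind of guardrail described above, here is a minimal sketch of a rollout step that runs the migration first, verifies the schema, and refuses to deploy if a required field is still missing. It is not our actual tooling: SQLite stands in for the production database, and the `builds` table, the `source` column, and the `missing_columns` and `rollout` helpers are all hypothetical names chosen for the example.

```python
import sqlite3


def missing_columns(conn, table, required):
    """Return the required column names that the table does not have yet."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    present = {row[1] for row in rows}  # row[1] holds the column name
    return sorted(set(required) - present)


def rollout(conn, table, required, migrate, deploy):
    """Enforce the order: migrate, verify the schema, then deploy."""
    if missing_columns(conn, table, required):
        migrate(conn)
    still_missing = missing_columns(conn, table, required)
    if still_missing:
        # Block the deploy rather than ship code that expects absent fields.
        raise RuntimeError(f"refusing to deploy: {table} is missing {still_missing}")
    deploy()


# Hypothetical example: the new code expects a `source` column on `builds`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE builds (id INTEGER PRIMARY KEY)")
deployed = []
rollout(
    conn,
    "builds",
    ["id", "source"],
    migrate=lambda c: c.execute("ALTER TABLE builds ADD COLUMN source TEXT"),
    deploy=lambda: deployed.append(True),
)
```

The point of the design is that the sequence is encoded in one function instead of depending on a person running the steps in the right order: if the migration is skipped or fails to add the field, the deploy step is never reached.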