Elevated error rate on build creation

Incident Report for Buildkite

Resolved

A change to our database permissions caused creation of new builds to fail for some customers.
Affected customers trying to create new builds via the API would have received a 500 error.
Builds created via webhooks (e.g. from GitHub), trigger steps or schedules were delayed by up to 8 minutes.
Posted Oct 10, 2024 - 03:05 UTC

Monitoring

We have deployed a mitigation to fix the issue, and are now monitoring.

During the period of higher error rates, builds created via API may have failed outright. Builds created via webhooks, triggers or schedules are retried, though processing of these may currently be delayed due to retry back-off.
Posted Oct 10, 2024 - 02:41 UTC

Investigating

Our monitoring has detected an elevated error rate in creating builds. We're currently investigating the issue, and will provide an update soon.
Posted Oct 10, 2024 - 02:30 UTC
This incident affected: Job Queue.