On Monday, 3rd September at 01:37 UTC, the Buildkite Agent API experienced elevated error rates. The cause was excessive load on our backend RDS database, where high request load coincided with data migrations and a vacuum process. The migration and vacuum processes were cancelled, and service was restored by 01:39 UTC. Agent retry behaviour should have handled these failures without data loss. Some build pipeline uploads may have failed and needed to be retried.
The vacuums have been rescheduled to run on weekends during quiet periods, and the data migrations have been modified to pause more, leaving database capacity for normal operations.
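
As an illustration of what a migration that pauses between units of work can look like, here is a minimal sketch. It is not our actual migration code: the function name `run_throttled_migration`, the `migrate_batch` callback, and the default batch size and pause length are all hypothetical values chosen for the example.

```python
import time
from typing import Callable, Sequence


def run_throttled_migration(
    row_ids: Sequence[int],
    migrate_batch: Callable[[Sequence[int]], None],
    batch_size: int = 1000,       # hypothetical default, tune per workload
    pause_seconds: float = 0.5,   # hypothetical default, tune per workload
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Migrate rows in small batches, pausing between batches.

    The pause gives the database time to serve normal traffic between
    bursts of migration writes. Returns the number of batches processed.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    batches = 0
    for start in range(0, len(row_ids), batch_size):
        # Hand one slice of ids to the caller-supplied migration step.
        migrate_batch(row_ids[start:start + batch_size])
        batches += 1
        # Pause only if more batches remain; no need to wait after the last.
        if start + batch_size < len(row_ids):
            sleep(pause_seconds)
    return batches
```

Smaller batches and longer pauses slow the migration down but reduce its share of database load at any moment. The `sleep` parameter is injectable only so the pause can be replaced in tests.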