All Systems Operational

About This Site

Status updates for Buildkite’s services and components. You can also follow @buildkitestatus on Twitter for updates.

Web Operational
99.72% uptime over the past 90 days
Agent API Operational
99.79% uptime over the past 90 days
REST API Operational
99.84% uptime over the past 90 days
Job Queue Operational
100.0% uptime over the past 90 days
SCM Integrations Operational
100.0% uptime over the past 90 days
Notifications Operational
99.98% uptime over the past 90 days
GitHub Commit Status Notifications Operational
Email Notifications Operational
Slack Notifications Operational
Hipchat Notifications Operational
Webhook Notifications Operational
99.98% uptime over the past 90 days
SCM Providers Operational
GitHub Operational
GitHub API Requests Operational
Atlassian Bitbucket SSH Operational
Atlassian Bitbucket Website and API Operational
Atlassian Bitbucket Git via HTTPS Operational
Third Party Services Operational
AWS ec2-us-east-1 Operational
AWS elasticache-us-east-1 Operational
AWS elb-us-east-1 Operational
AWS rds-us-east-1 Operational
PagerDuty Notification Delivery Operational
Test Analytics Operational
99.71% uptime over the past 90 days
Web Operational
99.71% uptime over the past 90 days
Ingestion Operational
99.71% uptime over the past 90 days
Operational
Degraded Performance
Partial Outage
Major Outage
Maintenance
Web Response Time
Agent API Response Time
REST API Response Time
Agent Job Dispatch
HTTP Request Error Rate
Past Incidents
Dec 1, 2022

No incidents reported today.

Nov 30, 2022

No incidents reported.

Nov 29, 2022

No incidents reported.

Nov 28, 2022

No incidents reported.

Nov 27, 2022

No incidents reported.

Nov 26, 2022

No incidents reported.

Nov 25, 2022

No incidents reported.

Nov 24, 2022

No incidents reported.

Nov 23, 2022

No incidents reported.

Nov 22, 2022

No incidents reported.

Nov 21, 2022
Postmortem - Read details
Nov 25, 05:14 UTC
Resolved - Error rates have remained at normal levels for some time now. Our monitoring shows that lost agents have been reconnected and are successfully running jobs.
Nov 21, 06:09 UTC
Monitoring - We’ve seen error rates return to normal levels for some time now. Our monitoring shows that lost agents have been reconnected and are successfully running jobs.
Nov 21, 05:54 UTC
Update - Agents that were previously unresponsive should now be processing jobs again. Our advice to restart agent processes/instances no longer applies, as the issue has been fixed server-side.
Nov 21, 05:31 UTC
Update - We are still restoring previously connected agents. We recommend cycling your agents: spin up new ones and remove those that are having trouble running jobs.
Nov 21, 04:58 UTC
Update - We are working to restore the connection to agents that were previously connected. As an interim measure, we recommend that you restart your agent instances.
Nov 21, 04:33 UTC
Update - We believe connected agents may be stuck in a non-operational state. We are continuing to investigate the cause and work on a fix. As an interim measure, we recommend cycling your agent instances to produce new agents.
Nov 21, 04:09 UTC
Update - We have promoted the new Redis node to primary. We are now investigating agent connectivity and working to restore service.
Nov 21, 03:46 UTC
Update - We are in the process of promoting the new Redis cluster to primary.
Nov 21, 03:25 UTC
Identified - We have identified an issue with our Redis cluster. We are provisioning a new Redis cluster and preparing to roll over to it.
Nov 21, 03:11 UTC
Investigating - We are currently investigating an issue causing elevated error rates and high latency.
Nov 21, 02:54 UTC
Nov 20, 2022

No incidents reported.

Nov 19, 2022

No incidents reported.

Nov 18, 2022

No incidents reported.

Nov 17, 2022
Postmortem - Read details
Nov 22, 04:44 UTC
Resolved - The Notifications service is operating at normal levels and the queue backlog caused by the incident has been cleared.
Nov 17, 02:31 UTC
Monitoring - The Notifications service is now operating as expected. We continue to monitor the service to ensure there is no further degradation.
Nov 17, 02:05 UTC
Update - Queue latency is now down to 10 minutes. We continue to work on reducing it further.
Nov 17, 01:58 UTC
Update - We have prioritised service notifications over other background work. We expect this to further reduce latency in delivering notifications.
Nov 17, 01:42 UTC
Update - We are working on further scaling up the processing of background jobs in order to reduce the queue size caused by this outage.
Nov 17, 01:16 UTC
Update - We’ve identified an increased number of background jobs consuming capacity shared between notifications and other background jobs. We are provisioning increased dedicated capacity for notifications jobs.
Nov 17, 00:52 UTC
Identified - We’ve identified an increased number of background jobs consuming capacity shared between notifications and other background jobs. We are provisioning increased dedicated capacity for notifications jobs.
Nov 17, 00:52 UTC