Nov 25, 05:14 UTC
Error rates have remained at normal levels for some time now. Our monitoring shows that lost agents have been reconnected and are successfully running jobs.
Nov 21, 06:09 UTC
Error rates have returned to normal levels and held steady for some time now. Our monitoring shows that lost agents have been reconnected and are successfully running jobs.
Nov 21, 05:54 UTC
Agents that were previously unresponsive should now be processing jobs again. Our advice to restart agent processes/instances no longer applies, as the issue has been fixed server-side.
Nov 21, 05:31 UTC
We are still working to restore previously connected agents. In the meantime, we recommend cycling your agents: spin up new instances and remove any that are failing to run jobs.
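If your agents run in an autoscaling group, cycling can be done with a rolling instance refresh. A minimal sketch, assuming an AWS Auto Scaling group managed via boto3 (the group name "ci-agents" and the preference values are illustrative):

```python
# Minimal sketch: cycle agent instances via an Auto Scaling instance refresh.
# Assumes agents run in an AWS Auto Scaling group named "ci-agents" (illustrative).
import boto3

autoscaling = boto3.client("autoscaling")

# A rolling refresh terminates old (potentially stuck) instances in batches
# and replaces them with new ones, which register fresh agents on boot.
response = autoscaling.start_instance_refresh(
    AutoScalingGroupName="ci-agents",
    Preferences={
        "MinHealthyPercentage": 50,  # keep at least half the fleet available for jobs
        "InstanceWarmup": 120,       # seconds before a new instance counts as healthy
    },
)
print("Instance refresh started:", response["InstanceRefreshId"])
```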
Nov 21, 04:58 UTC
We are working to restore connections to previously connected agents. As an interim measure, we recommend restarting your agent instances.
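If your agents run as a systemd service, a plain restart is usually enough to drop the stale connection and force the agent to reconnect. A minimal sketch, assuming a systemd-managed agent (the service name "my-ci-agent" is illustrative):

```python
# Minimal sketch: restart a systemd-managed agent so it reconnects.
# The service name is an illustrative placeholder; substitute your own.
import subprocess

AGENT_SERVICE = "my-ci-agent"

# Restarting the service tears down the stuck connection; the agent
# re-registers with the backend on startup.
subprocess.run(["systemctl", "restart", AGENT_SERVICE], check=True)
```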
Nov 21, 04:33 UTC
We believe connected agents may be stuck in a non-operational state. We are continuing to investigate the cause and work on a fix. As an interim measure, we recommend cycling your agent instances to bring up fresh agents.
Nov 21, 04:09 UTC
We have promoted the new Redis node to primary. We are now investigating agent connectivity and working to restore service.
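For context, promoting a replica to primary essentially means telling it to stop replicating and begin accepting writes. A minimal sketch of that step, assuming a redis-py client pointed at the new node (host and port are illustrative):

```python
# Minimal sketch: promote a Redis replica to primary using redis-py.
# The host and port are illustrative placeholders.
import redis

replica = redis.Redis(host="redis-new.internal", port=6379)

# slaveof() with no arguments sends "SLAVEOF NO ONE", which stops
# replication and promotes this node to a standalone primary.
replica.slaveof()

# Confirm the node now reports itself as the primary.
assert replica.info("replication")["role"] == "master"
```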
Nov 21, 03:46 UTC
We are in the process of promoting the new Redis cluster to primary.
Nov 21, 03:25 UTC
We have identified an issue with our Redis cluster. We are provisioning a replacement cluster and preparing to roll over to it.
Nov 21, 03:11 UTC
We are currently investigating an issue causing elevated error rates and high latency.
Nov 21, 02:54 UTC