We have continued to observe stable workloads and the cluster remains healthy. Engineering has completed coordination and communication efforts with the third-party vendor and, as such, we are considering this issue resolved.
No further updates.
Posted Mar 08, 2023 - 13:19 UTC
Update
We're continuing to monitor our prod-us-central-0 cluster. The cluster is healthy and workloads are stable.
Posted Mar 07, 2023 - 00:18 UTC
Monitoring
Engineering has made a change to mitigate the issue. As of 16:22 UTC, pods in the prod-us-central-0 cluster have returned to a healthy state and are exhibiting normal behavior. Instances in the prod-us-central-0 cluster should no longer experience adverse behavior.
We will continue to monitor the cluster.
Posted Mar 03, 2023 - 16:30 UTC
Identified
As of 15:45 UTC, we have observed pods re-entering a poor state, only within the prod-us-central-0 cluster. The underlying issue stems from a third-party vendor, and Engineering is actively working to restore functionality.
Instances within the prod-us-central-0 cluster may be unable to initialize or start. We will continue to provide updates as we learn more and undertake mitigation efforts.
Posted Mar 03, 2023 - 16:01 UTC
Update
Engineering continues to engage with the third-party vendor while we await further details about the root cause.
We continue to observe normal pod behavior and will keep monitoring.
Posted Mar 02, 2023 - 20:36 UTC
Update
We continue to observe normal pod behavior and instance functionality for the prod-us-central-0 cluster.
We will continue to monitor the state of this issue, pending further details from the third-party vendor about the poor cluster behavior.
Posted Mar 01, 2023 - 22:58 UTC
Monitoring
Engineering has made a change to mitigate the issue. As of 21:40 UTC, pods in the prod-us-central-0 cluster have returned to a healthy state and are exhibiting normal behavior. Instances in the prod-us-central-0 cluster should no longer experience adverse behavior.
We will continue to monitor the state of this issue and relay any additional updates.
Posted Mar 01, 2023 - 21:48 UTC
Identified
As of 20:44 UTC, we were alerted to a number of pods entering a poor state, only within the prod-us-central-0 cluster. The underlying issue stems from a third-party vendor, and Engineering is actively working to restore functionality.
Instances within the prod-us-central-0 cluster may be unable to initialize or start. We will continue to provide updates as we learn more.
Posted Mar 01, 2023 - 20:56 UTC
This incident affected: Grafana Cloud: Hosted Grafana (GCP US Central - prod-us-central-0).