Outage in us-central1
Incident Report for Grafana Cloud
Resolved
All systems are operating normally.
Posted Nov 07, 2019 - 20:47 UTC
Update
Both the Prometheus and Graphite platforms are operating normally again.
Post-mortem to follow.
Posted Nov 07, 2019 - 19:37 UTC
Update
We are continuing to monitor for any further issues.
Posted Nov 07, 2019 - 19:10 UTC
Monitoring
A fix has been implemented and things are returning to normal. We're monitoring actively.
Posted Nov 07, 2019 - 19:06 UTC
Update
We're continuing to investigate and have reduced the number of failing requests by draining nodes and rolling back to older versions of the software.
Posted Nov 07, 2019 - 18:43 UTC
Update
We are continuing to investigate this issue.
Posted Nov 07, 2019 - 17:45 UTC
Update
We believe our issues may be caused by network problems affecting ~3 machines; we have begun mitigation steps.
Posted Nov 07, 2019 - 17:32 UTC
Update
We're seeing issues with our Kubernetes cluster in us-central1 affecting Prometheus reads and writes.
Posted Nov 07, 2019 - 17:16 UTC
Update
We are continuing to investigate this issue.
Posted Nov 07, 2019 - 17:14 UTC
Investigating
We are currently investigating this issue.
Posted Nov 07, 2019 - 17:13 UTC
This incident affected: Grafana.com, Grafana Cloud: Hosted Grafana (GCP US Central - prod-us-central-0), Grafana Cloud: Graphite (GCP US Central - prod-us-central-0: Querying, GCP US Central - prod-us-central-0: Ingestion), and Grafana Cloud: Prometheus (GCP US Central - prod-us-central-0: Querying, GCP US Central - prod-us-central-0: Ingestion).