Write path outage in us-central1 region
Incident Report for Grafana Cloud
Resolved
We were affected by the bug reported in https://github.com/kubernetes/kubernetes/issues/127370, which prevents Kubernetes (K8s) service endpoints from being updated when pods are stopped or started if more than 1,000 pods match the service.
This caused a temporary outage of the Mimir gossip services, which in turn caused metric ingestion and queries to fail for a short time.
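As an illustration of the failure mode described above, the following is a minimal sketch of how drift between a service's ready pods and its published endpoints could be detected. It is not the check used during this incident. The names `check_endpoint_drift` and `EndpointDrift` are hypothetical, the 1,000-pod threshold is taken from the description above, and the two IP lists are assumed to be collected by the caller (for example from the Kubernetes API); fetching them is not shown.

```python
from dataclasses import dataclass

# Threshold taken from the report ("more than 1k pods"); an assumption, not a Kubernetes constant.
POD_COUNT_THRESHOLD = 1000


@dataclass(frozen=True)
class EndpointDrift:
    """Difference between the ready pods and the endpoints published for a service."""
    missing: frozenset        # ready pod IPs that are absent from the endpoints
    stale: frozenset          # endpoint IPs that no longer have a ready pod behind them
    over_threshold: bool      # True if the service matches more pods than the threshold

    @property
    def in_sync(self):
        return not self.missing and not self.stale


def check_endpoint_drift(ready_pod_ips, endpoint_ips, threshold=POD_COUNT_THRESHOLD):
    """Compare ready pod IPs with endpoint IPs (both supplied by the caller)."""
    pods = frozenset(ready_pod_ips)
    endpoints = frozenset(endpoint_ips)
    return EndpointDrift(
        missing=pods - endpoints,
        stale=endpoints - pods,
        over_threshold=len(pods) > threshold,
    )


if __name__ == "__main__":
    # Hypothetical data: one pod was restarted and received a new IP,
    # but the endpoints still list the old address.
    drift = check_endpoint_drift(
        ready_pod_ips=["10.0.0.1", "10.0.0.4"],
        endpoint_ips=["10.0.0.1", "10.0.0.3"],
    )
    print(sorted(drift.missing), sorted(drift.stale), drift.in_sync)
```

A non-empty `stale` or `missing` set that persists after pods restart, on a service where `over_threshold` is true, would be consistent with the behavior described in the linked issue.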
This issue has been resolved.
Posted Nov 18, 2024 - 13:30 UTC