Incident History

February 2025

Logs read and write outage in prod-us-east-0 (including OTLP endpoint)
This incident has been resolved.
Feb 19, 21:30 - 22:21 UTC
OTEL Gateways Failing
This incident has been resolved.
Feb 18, 15:27 - 17:22 UTC

January 2025

Connection issues with data sources in prod-gb-south-0
We identified the issue and rolled back a new alerting feature, which will be re-released in the near future.
Jan 30, 18:33 - 19:19 UTC
Zabbix plugin issue with failing connections
This incident has been resolved.
Jan 30, 12:03 - 14:03 UTC
[Scheduled] Internal Migration for Cloud Logs in prod-eu-west-2
The scheduled maintenance has been completed.
Jan 29, 11:00 - 12:00 UTC

December 2024

Degraded performance of k6 Cloud Insights API
This incident has been resolved.
Dec 17, 12:08 - 13:54 UTC
Degraded Performance in Loki
Between 20:15 and 21:30 UTC, a small subset of customers in the prod-us-east-0 region may be missing recording rule samples for that period. This incident has been resolved.
Dec 11, 20:00 - 20:00 UTC
Availability issue for Grafana Cloud stacks in the prod-eu-west-2 region
The incident has been resolved. One database server was under heavy CPU load, which caused database queries from Grafana Cloud instances using that server to run slowly or fail altogether. As a result, some instances became unavailable and new instance startups failed.
Dec 11, 10:28 - 11:49 UTC