Resolved -
This incident has been resolved.
Feb 11, 21:47 UTC
Monitoring -
We are in the process of rolling out the fix.
Feb 11, 18:20 UTC
Identified -
We have identified the issue, and are working on a fix.
Feb 11, 16:22 UTC
Investigating -
We are aware of an issue that is preventing the installation of the Slack integration. We are currently investigating this, and will provide updates as they become available.
Feb 11, 14:21 UTC
Resolved -
We continue to observe a sustained period of recovery. At this time, we are considering this issue resolved.
Feb 10, 01:45 UTC
Investigating -
As of 00:10, we are experiencing write failures in a single cell, affecting customers in prod-us-central-0. Impacted customers may see failed or dropped writes.
Engineering is actively engaged and assessing the issue. We will provide updates accordingly.
Feb 10, 00:39 UTC
Resolved -
This incident has been resolved.
Feb 9, 11:21 UTC
Update -
We are continuing to monitor for any further issues.
Feb 9, 10:36 UTC
Monitoring -
Between 09:47 and 10:14 UTC, Grafana Cloud Logs within a single cell residing in the prod-ap-southeast-1 region experienced an issue affecting write ingestion only. During this time, some log writes may have failed or been delayed. Log reads were not impacted and remained fully available throughout the incident.
Our engineering team quickly identified the cause of the issue and is monitoring the service. The service has been operating normally since 10:14 UTC.
Feb 9, 10:32 UTC
Resolved -
Between 18:32 and 18:46 UTC, Grafana Cloud Metrics within a single cell residing in the prod-us-west-0 region experienced an issue affecting write ingestion only. During this time, some metric writes may have failed or been delayed. Metric reads were not impacted and remained fully available throughout the incident.
Our engineering team quickly identified the cause of the issue and implemented mitigation steps to restore normal write ingestion. The service has been operating normally since 18:46 UTC.
Feb 5, 18:30 UTC
Resolved -
From 17:43 UTC to 18:05 UTC, a subset of customers experienced elevated latency and a peak error rate of approximately 22% for trace ingestion.
Feb 5, 18:00 UTC
Resolved -
This incident has been resolved.
Feb 5, 17:41 UTC
Monitoring -
Services have recovered and there is no longer an active issue. We are still monitoring overall health.
Feb 5, 14:40 UTC
Investigating -
We're experiencing an issue with the Hosted Metrics offering in the us-central-0 region. The issue manifests as failing rule evaluations, and queries may return stale data. We're actively investigating the cause of the issue.
Feb 5, 14:14 UTC
Resolved -
This incident has been resolved.
Feb 5, 15:31 UTC
Monitoring -
The issue causing the incident has been identified, and the fix has been deployed. All new test runs work consistently.
Feb 5, 09:36 UTC
Update -
We are continuing to work on a fix for this issue.
Feb 4, 19:57 UTC
Identified -
We encountered a subtle bug in which a synchronization issue caused our test-run finalization process to read stale threshold status.
We have since resolved the bug, and new test runs will work properly. Impacted test runs will need to be fixed via further correction on our end. We will continue to provide updates on the progress of the fix for impacted test runs.
Feb 4, 17:20 UTC
Resolved -
The HTTP response time in the Performance trend overview was not shown for new test runs. After the fix, all data should be shown again.
Feb 4, 14:14 UTC
Resolved -
This incident has been resolved.
Feb 3, 17:56 UTC
Monitoring -
A fix was implemented, and we are seeing recovery throughout the rollout. We will continue to monitor results.
Feb 3, 17:38 UTC
Identified -
The issue has been identified and we are implementing a fix.
Feb 3, 17:29 UTC
Investigating -
As of 16:40 UTC, we are investigating an issue where IRM pages are not accessible. Users may experience errors or be unable to load IRM-related pages during this time.
Our team is actively working to identify the root cause and restore full functionality as quickly as possible. We will provide updates as more information becomes available.
Feb 3, 16:52 UTC