Resolved -
Between 18:32 and 18:46 UTC, Grafana Cloud Metrics within a single cell residing in the prod-us-west-0 region experienced an issue affecting write ingestion only. During this time, some metric writes may have failed or been delayed. Metric reads were not impacted and remained fully available throughout the incident.
Our engineering team quickly identified the cause of the issue and implemented mitigation steps to restore normal write ingestion. The service has been operating normally since 18:46 UTC.
Feb 5, 18:30 UTC
Resolved -
This incident has been resolved.
Feb 5, 17:41 UTC
Monitoring -
Services have recovered and there is no longer an active issue. We are still monitoring overall health.
Feb 5, 14:40 UTC
Investigating -
We're experiencing an issue in the us-central-0 region affecting the Hosted Metrics offering. The issue manifests as failing rule evaluations and the possibility of queries returning stale data. We're actively investigating the cause.
Feb 5, 14:14 UTC
Resolved -
This incident has been resolved.
Feb 5, 15:31 UTC
Monitoring -
The issue causing the incident has been identified, and the fix has been deployed. All new test runs work consistently.
Feb 5, 09:36 UTC
Update -
We are continuing to work on a fix for this issue.
Feb 4, 19:57 UTC
Identified -
We encountered a subtle bug in which a synchronization issue caused our test-run finalization process to read stale threshold status.
We have since resolved the bug, and new test runs will work properly. Impacted test runs will need to be fixed via further correction on our end. We will continue to provide updates on the progress of the fix for impacted test runs.
Feb 4, 17:20 UTC
Resolved -
The HTTP response time in the Performance trend overview was not shown for new test runs. After the fix, all data should show again.
Feb 4, 14:14 UTC
Resolved -
This incident has been resolved.
Feb 3, 17:56 UTC
Monitoring -
A fix has been implemented, and we are seeing recovery throughout the rollout. We will continue to monitor results.
Feb 3, 17:38 UTC
Identified -
The issue has been identified and we are implementing a fix.
Feb 3, 17:29 UTC
Investigating -
As of 16:40 UTC, we are currently investigating an issue where IRM pages are not accessible. Users may experience errors or be unable to load IRM-related pages during this time.
Our team is actively working to identify the root cause and restore full functionality as quickly as possible. We will provide updates as more information becomes available.
Feb 3, 16:52 UTC
Resolved -
This incident has been resolved.
Jan 28, 20:25 UTC
Monitoring -
A fix has been implemented, and we are monitoring the results.
Jan 28, 18:24 UTC
Investigating -
We are currently investigating an issue impacting dashboards for users in the prod-us-central-3 region. This is preventing impacted dashboards from loading as expected.
This is also impacting a very small subset of users in the prod-us-central-0 region.
We will provide more details regarding the scope as they become available.
Jan 28, 17:27 UTC
Resolved -
We continue to observe sustained recovery. At this time, we are considering this issue resolved. No further updates.
Jan 28, 00:22 UTC
Monitoring -
As of 22:55 UTC, we have observed marked improvement with the incident impacting IRM and OnCall. We are still investigating and will continue to monitor and provide updates.
Jan 27, 22:56 UTC
Investigating -
We are currently investigating an issue impacting some customers when accessing Grafana OnCall and IRM. Impacted customers may experience long load times or even timeouts when attempting to access these components. We'll provide more information as it becomes available.
Jan 27, 20:37 UTC
Resolved -
We experienced an increased write error rate for logs in prod-us-west-0 from 06:55 to 07:15 UTC. We have since observed continued stability and are marking this as resolved.
Jan 27, 07:49 UTC
Resolved -
Engineering has released a fix and as of 00:13 UTC, customers should no longer experience issues upgrading from Free to Pro subscriptions. At this time, we are considering this issue resolved. No further updates.
Jan 27, 00:13 UTC
Identified -
Engineering has identified the issue and is currently exploring remediation options. At this time, users remain unable to upgrade from Free to Pro subscriptions.
We will continue to provide updates as more information becomes available.
Jan 26, 21:52 UTC
Investigating -
As of 20:05 UTC, our engineering team became aware of an issue related to subscription plan upgrades. Users experiencing this issue will not be able to upgrade from a Free plan to a Pro subscription.
Engineering is actively engaged and assessing the issue. We will provide updates accordingly.
Jan 26, 20:53 UTC
Resolved -
This incident has been resolved.
Jan 23, 18:44 UTC
Monitoring -
We are noticing significant improvement, and things are stabilizing as expected. Our engineering teams will continue to monitor progress.
Jan 23, 16:55 UTC
Investigating -
We are currently investigating an issue impacting email delivery for some services, including Alert Notifications.
Jan 23, 15:37 UTC
Resolved -
The incident is resolved. We are in contact with customers affected by this change.
Jan 22, 22:29 UTC
Identified -
During the secrets migration in https://status.grafana.com/incidents/47d1q4sphrmj, secrets proxy URLs for some customers were updated in the following regions: prod-us-central-0, prod-us-east-0, and prod-eu-west-2. This was an unexpected breaking change affecting a subset of customers.
This will specifically affect customers who are using secrets on private probes behind a firewall.
We are investigating. If your private probes are impacted, we ask you to update firewall rules for the secrets proxy to allow outbound connections to the updated hosts:
Note that this URL change affects only a small subset of customers; the majority of customers will not need to update firewall rules. For affected customers, private probes will show an error like the following in probe logs: Error during test execution: failed to get secret: Get "https://gsm-proxy-prod-us-east-2.grafana.net/api/v1/secrets/.../decrypt": Forbidden undefined
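To check whether a private probe is affected, the probe logs can be scanned for the error quoted above. The following is a minimal sketch, not an official tool: it assumes log lines follow the same shape as that example, and the function name blocked_secret_hosts is hypothetical. It only reports the hostnames that appear in those errors, which may help identify the secrets proxy host a firewall is blocking.

```python
import re
from urllib.parse import urlsplit

# Assumption: affected probes log the error in the same shape as the example
# quoted in this update, with the request URL inside double quotes.
SECRET_ERROR = re.compile(r'failed to get secret: Get "(?P<url>https://[^"]+)"')


def blocked_secret_hosts(log_lines):
    """Return the sorted hostnames found in 'failed to get secret' errors."""
    hosts = set()
    for line in log_lines:
        match = SECRET_ERROR.search(line)
        if match:
            host = urlsplit(match.group("url")).hostname
            if host:
                hosts.add(host)
    return sorted(hosts)


if __name__ == "__main__":
    # Sample input: the error line from this update plus an unrelated line.
    sample = [
        'Error during test execution: failed to get secret: Get '
        '"https://gsm-proxy-prod-us-east-2.grafana.net/api/v1/secrets/.../decrypt": '
        'Forbidden undefined',
        "probe check completed",
    ]
    for host in blocked_secret_hosts(sample):
        print(host)
```

Passing the lines of a probe log file to blocked_secret_hosts yields the distinct hosts to compare against existing outbound firewall rules.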
Jan 21, 21:16 UTC