Completed -
The scheduled maintenance has been completed.
Jun 24, 16:07 UTC
In progress -
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Jun 24, 12:00 UTC
Scheduled -
We'll be performing scheduled maintenance on the Cloud Migration Assistant ("Migrate to Grafana Cloud" feature). During this window, the ability to start or run migrations to Grafana Cloud will be temporarily unavailable while we roll the update out.
What's affected: the Cloud Migration Assistant tooling only.
What's not affected: your existing Grafana Cloud stacks. Dashboards, metrics, logs, traces, alerting, and all other services continue to operate normally.
If you're in the middle of a migration, you may need to re-create your migration snapshot once maintenance is complete. No action is required for existing, completed migrations.
We'll update this notice as the maintenance progresses and confirm once it's complete. Thanks for your patience.
Jun 22, 13:15 UTC
Resolved -
This incident has been resolved. Error rates have remained near zero and operations are performing as expected.
Jun 24, 12:46 UTC
Update -
Error rates have remained near zero, and we continue to monitor.
Jun 23, 17:54 UTC
Update -
We are continuing to monitor for further issues.
Jun 23, 13:15 UTC
Update -
We have deployed additional mitigations that should help with remaining errors. We are continuing to monitor error rates.
Jun 23, 09:14 UTC
Update -
We’ve verified a fix that reduces loading errors and have begun to implement it. We are continuing to roll it out to all regions and are monitoring its effectiveness.
Jun 22, 17:16 UTC
Update -
We're actively monitoring this issue and working with our third-party provider. The next update will be sent on Monday unless there's new information to share.
Jun 19, 19:07 UTC
Update -
Due to the linked GCP outage below, users located in India may have trouble loading parts of Grafana.
We are continuing to work with our CSP on this investigation.
Impacted users may receive intermittent error messages such as "Error Loading" or "Failed to load Assets". To be clear, the impact depends on where the user is physically located, not on the region where the stack is hosted.
Jun 19, 02:33 UTC
Monitoring -
Due to the linked GCP outage below, users located in India may have trouble loading parts of Grafana.
Impacted users may receive intermittent error messages such as "Error Loading" or "Failed to load Assets". To be clear, the impact depends on where the user is physically located, not on the region where the stack is hosted. We continue to work with our CSP on this investigation.
Jun 18, 16:57 UTC
Investigating -
Due to the linked GCP outage below, users located in India may have trouble loading parts of Grafana.
Impacted users may receive error messages such as "Error Loading" or "Failed to load Assets". To be clear, the impact depends on where the user is physically located, not on the region where the stack is hosted. We are currently investigating this issue on our end and will provide updates as they become available.
Jun 18, 15:18 UTC
Resolved -
Starting June 20, 2026 at approximately 20:00 UTC, some Grafana Cloud customers experienced unexpectedly elevated logs query usage. The issue persisted until it was resolved on June 22, 2026 at approximately 16:00 UTC.
Our engineering team identified and mitigated the issue. Systems have since stabilized and are operating normally. Grafana Labs is reviewing affected accounts for appropriate remediation.
Jun 20, 20:00 UTC
Resolved -
This incident has been resolved. Thank you for your patience.
Jun 19, 19:19 UTC
Update -
We’re continuing to track progress post-mitigation. While we don’t have new information to share yet, our team remains actively engaged.
Jun 19, 17:46 UTC
Monitoring -
We had an outage affecting rule evaluations between 15:16 and 15:59 UTC in the prod-us-central-0 region.
Our team quickly identified the issue and has since mitigated it. The engineering team is monitoring.
Jun 19, 16:39 UTC
Resolved -
This incident has been resolved. Thank you for your patience.
Jun 18, 18:50 UTC
Monitoring -
We've verified a fix in our staging environment to restore functionality to the mobile app. The fix is currently being deployed to production. Thanks for your patience as we continue to roll this out and monitor the resolution.
Jun 18, 18:09 UTC
Identified -
We're noticing an uptick in users being unable to respond to actions on the mobile app (acknowledging and silencing alerts, for example). Users working in the web UI should not be affected. Ingestion and notification delivery are working as expected. We have a fix in place and are in the process of deploying.
Jun 18, 17:01 UTC
Resolved -
This incident has been resolved. Thank you for your patience.
Jun 18, 16:21 UTC
Update -
We are continuing to monitor for any further issues.
Jun 18, 14:04 UTC
Monitoring -
The root cause of the issue has been identified and a fix has been successfully deployed. We are observing widespread improvements across all systems. Our team is currently monitoring the environment to ensure performance remains stable.
Jun 18, 13:15 UTC
Update -
We are continuing to investigate this issue.
Jun 18, 11:45 UTC
Investigating -
We’re currently investigating an issue resulting in degraded k6 cloud UI performance and API response time. Our team is actively working to rectify this issue.
Jun 18, 11:25 UTC
Resolved -
This incident has been resolved.
Jun 18, 14:08 UTC
Monitoring -
A fix has been implemented and we are monitoring the results.
Jun 18, 08:04 UTC
Update -
We are continuing to deploy the fix and monitor recovery efforts. As part of the rollout, we identified an issue that required adjustments to our deployment plan, which has extended the timeline for mitigation. Work remains actively underway, and we will share additional updates as progress continues.
Jun 17, 23:22 UTC
Update -
Deployment of the fix is still in progress. We are continuing to monitor the rollout and validate recovery across affected systems. We will share further updates as they become available.
Jun 17, 21:43 UTC
Update -
Our Engineering Team has implemented a fix which is now being rolled out. We will continue to monitor the situation and update as soon as we have more information.
Jun 17, 20:55 UTC
Identified -
We have identified an issue where alert rules and alerts managed directly in a Loki data source (data source-managed alerting) are not displayed in the Grafana Cloud Alerting UI. Rules created via Prometheus/Mimir data sources and Grafana-managed alert rules are not affected.
Impact is limited to visibility and management in the UI. Affected alert rules continue to evaluate and send notifications normally — there is no impact to alert delivery.
Workaround: Loki alert rules can still be viewed and managed directly through the Loki ruler API (for example, using cortextool against /loki/api/v1/rules).
A fix has been identified and is in progress. We will provide a further update once it has been rolled out.
Jun 17, 20:17 UTC
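For readers without cortextool to hand, the workaround above can also be sketched as a plain HTTP call. This is a minimal illustration, not an official client: it assumes the Loki endpoint accepts HTTP Basic auth with a user ID and an access token, and the base URL, user ID, and token shown in the usage note are placeholders rather than real values.

```python
import base64
import urllib.request


def build_rules_request(base_url: str, user: str, token: str) -> urllib.request.Request:
    """Build a GET request for the Loki ruler endpoint that lists rule groups."""
    url = base_url.rstrip("/") + "/loki/api/v1/rules"
    # Assumption: the endpoint accepts HTTP Basic auth (user ID plus access token).
    credentials = base64.b64encode(f"{user}:{token}".encode()).decode()
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {credentials}")
    return request


def fetch_rules(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Usage would look like `fetch_rules(build_rules_request("https://logs.example.invalid", "12345", "placeholder-token"))`, with the placeholders replaced by the Loki URL and credentials of the affected stack.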
Resolved -
A fix has been deployed and, after monitoring, we have confirmed that the issue is resolved.
Jun 18, 10:13 UTC
Investigating -
We’re currently investigating an issue affecting the Frontend Observability product. The "Suspected commit" feature is not currently working as expected. Ingestion and querying are unaffected. Our team has identified the cause and is actively working on a fix. Thank you for your patience.
Jun 18, 08:48 UTC
Resolved -
Following our ongoing communications regarding the complete outage in prod-me-central-1, we are now closing this incident. As noted in the latest AWS update, the Middle East (UAE) region (ME-CENTRAL-1) has suffered significant damage and restoration is expected to take several months. We strongly recommend all affected customers migrate workloads to an alternate Grafana Cloud region as soon as possible.
For further details, please refer to the AWS incident communication directly: https://health.aws.amazon.com/health/status
Please reach out to our Support team if you need any assistance with the above: https://grafana.com/profile/org#support
We will continue to monitor the situation and update the incident once circumstances change.
Jun 17, 09:58 UTC
Update -
The TLS certificates serving prod-me-central-1 endpoints expire on May 30, 2026. Replacement certificates have been imported, but the ongoing AWS regional incident is preventing them from propagating to all load balancer nodes, so customers may see certificate errors after that date until AWS restores normal operation.
We do not have any additional updates to share at this time. Our team is actively monitoring the situation and will provide further information as it becomes available.
In the meantime, please continue to refer to the AWS Status Page for the most detailed and up-to-date information.
May 27, 17:10 UTC
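Customers who want to see how close an endpoint's certificate is to expiry could use something like the sketch below. It is an illustration only: the hostname in the usage note is a placeholder, and the helper names are hypothetical. Note that with default verification an already expired certificate makes the TLS handshake fail, which is itself the signal described above.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return days between `now` (epoch seconds) and a certificate notAfter string."""
    current = time.time() if now is None else now
    # ssl.cert_time_to_seconds parses strings such as "May 30 00:00:00 2026 GMT".
    return (ssl.cert_time_to_seconds(not_after) - current) / 86400.0


def peer_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Connect to host:port and return the notAfter field of the served certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Raises ssl.SSLCertVerificationError if the certificate has already expired.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `days_until_expiry(peer_not_after("endpoint.example.invalid"))` would report the remaining validity for a placeholder host; substitute the endpoint your clients actually use.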
Update -
AWS UAE - prod-me-central-1: Public probe checks may experience degraded performance. We recommend migrating checks from the UAE probe to the next nearest probe suitable for your use case.
May 21, 11:41 UTC
Update -
We do not have any additional updates to share at this time. Our team is actively monitoring the situation and will provide further information as it becomes available.
In the meantime, please continue to refer to the AWS Status Page for the most detailed and up-to-date information.
May 13, 21:59 UTC
Update -
We are continuing to investigate this issue.
Apr 20, 15:11 UTC
Update -
We have not received any further updates from AWS at this time. We are actively monitoring the outage and will provide additional information as it becomes available. Please continue to refer to the AWS status page for more detailed updates: https://health.aws.amazon.com/health/status
All the guidance previously included about stack migration is still relevant. Please reach out to our Support team if you have any questions.
Mar 19, 12:13 UTC
Update -
We are actively monitoring the situation, but at this time there are no new updates to share. The next update will be provided once we have more information to share. Please reach out to our Support team if you have any questions.
Mar 4, 22:22 UTC
Update -
We are continuing to investigate this issue.
Mar 4, 10:28 UTC
Update -
We will provide updates when we have them, but we do not have an expected resolution time at this point.
Mar 2, 10:31 UTC
Update -
We recommend that customers configure a new blank stack in an alternative Grafana Cloud region and reconfigure their clients (such as Grafana Alloy) to send telemetry to that region. Fleet Management can be used for this purpose: https://grafana.com/docs/grafana-cloud/send-data/fleet-management/introduction/
Mar 2, 10:04 UTC
Update -
We are updating this incident to reflect a complete outage in prod-me-central-1, due to an ongoing AWS UAE data center issue. We will provide further updates accordingly.
Mar 2, 08:36 UTC
Update -
We are observing write and read errors across all databases (metrics, logs, traces) in prod-me-central-1, due to an ongoing AWS UAE data center issue. We will provide further updates accordingly.
Mar 2, 08:21 UTC
Update -
We are observing write and read errors across all databases (metrics, logs, traces) in prod-me-central-1, due to an ongoing AWS UAE data center issue. We will provide further updates accordingly.
Mar 2, 08:14 UTC
Investigating -
We are seeing elevated write and read path errors in prod-me-central-1, due to an ongoing AWS UAE data center issue. We will provide further updates accordingly.
Mar 2, 06:43 UTC
Resolved -
This incident has been resolved. Thank you for your patience.
Jun 12, 19:52 UTC
Update -
The incident has been mitigated, and services are operating normally. We continue to monitor the service to ensure full stability.
Jun 8, 21:36 UTC
Monitoring -
The incident has been mitigated, and services are operating normally. We are currently monitoring the service to ensure full stability.
Jun 7, 11:00 UTC
Update -
We’re making ongoing progress on the investigation alongside our upstream provider.
Jun 7, 06:00 UTC
Update -
We are continuing to investigate this issue.
Jun 7, 03:50 UTC
Update -
Intermittent spikes in rule evaluation failures are continuing.
Jun 7, 02:46 UTC
Investigating -
From 00:20:00 to 00:27:00, and again from 00:32:00 to 00:38:00, there were brief spikes in rule evaluation failures. Engineers are investigating.
Jun 7, 01:39 UTC
Resolved -
Our team discovered a read issue between approximately 19:35 and 20:08 UTC. During that window, queries may have returned errors similar to "context deadline exceeded" (DatasourceError response). The issue has since been resolved and should not have caused any data loss, only a short query disruption.
Jun 12, 18:30 UTC
Resolved -
This incident has been resolved.
Jun 11, 05:36 UTC
Investigating -
We’re currently investigating an issue affecting the Grafana Dashboards page. When the page is set to view by folders, no dashboards are shown. Our team is working on a fix. In the meantime, switching to ‘View as list’ allows access to dashboards as usual.
Jun 10, 10:49 UTC
Resolved -
We have observed a sustained period of recovery. At this time, we are considering this issue resolved. No further updates.
Jun 10, 13:11 UTC
Monitoring -
Our team has implemented a fix and we are currently monitoring the results.
Jun 10, 10:57 UTC
Investigating -
We are currently investigating an issue affecting the management of data source-managed alerts in Grafana Cloud.
Customers may experience problems viewing, creating, updating, or managing alerts through Grafana when using data source-managed alerting. This issue is limited to alert management functionality within Grafana.
Alert evaluation and backend alerting services continue to operate normally. Direct alerting APIs for Mimir and Loki remain fully operational and are unaffected.
Grafana-managed alerting is not impacted.
We identified this issue at approximately 20:45 UTC and are actively working on a resolution. We will provide additional updates as more information becomes available.
Workaround: Customers can continue to use the direct Mimir and Loki alerting APIs while we work to restore normal functionality.
Jun 9, 23:11 UTC