Update - Since 11:55 UTC today, we have been seeing issues on the write path for Loki in the Azure Netherlands cluster (eu-west-3). The impact is degraded log ingestion on that cluster. We are also seeing impact to Faro performance in the same region. Our engineering team is working to restore the service.
Mar 03, 2026 - 14:15 UTC
Investigating - Since 11:55 UTC today, we have been seeing issues on the write path for Loki in the Azure Netherlands cluster (eu-west-3). The impact is degraded log ingestion on that cluster. Our engineering team is working to restore the service.
Mar 03, 2026 - 12:07 UTC
We will provide updates when we have them, but we do not have an expected resolution time at this point.
Mar 02, 2026 - 10:31 UTC
Update - We recommend that affected customers configure a new blank stack in an alternative Grafana Cloud region and reconfigure their clients (such as Grafana Alloy) to send telemetry to that region. Fleet Management can be used for this purpose: https://grafana.com/docs/grafana-cloud/send-data/fleet-management/introduction/
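As an illustration, redirecting Grafana Alloy's log writes to a stack in another region amounts to updating the `loki.write` endpoint in the Alloy configuration. This is a minimal sketch; the URL, username, and token below are placeholders, and the real values come from the new stack's connection details in the Grafana Cloud portal.

```alloy
// Hypothetical example: point Alloy's Loki writer at a stack in an
// alternative region. Replace the placeholder URL and credentials with
// the values shown for the newly created stack.
loki.write "alternate_region" {
  endpoint {
    url = "https://logs-prod-XXX.grafana.net/loki/api/v1/push"

    basic_auth {
      username = "<stack user ID>"
      password = "<access policy token>"
    }
  }
}
```

With Fleet Management, a change like this can be rolled out to a fleet of Alloy collectors as a remote configuration pipeline instead of editing each collector locally.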
Mar 02, 2026 - 10:04 UTC
Update - We are updating this incident to reflect a complete outage in prod-me-central-1, due to an on-going AWS UAE data center issue. We will provide further updates accordingly.
Mar 02, 2026 - 08:36 UTC
Update - We are observing write and read outage errors across all databases (metrics, logs, traces) in prod-me-central-1, due to an on-going AWS UAE data center issue. We will provide further updates accordingly.
Mar 02, 2026 - 08:21 UTC
Update - We are observing write and read outage errors across all databases (metrics, logs, traces) in prod-me-central-1, due to an on-going AWS UAE data center issue. We will provide further updates accordingly.
Mar 02, 2026 - 08:14 UTC
Investigating - We are seeing elevated write and read path errors in prod-me-central-1, due to an on-going AWS UAE data center issue. We will provide further updates accordingly.
Mar 02, 2026 - 06:43 UTC
Update - We are continuing to investigate this issue alongside the CSP. Any notable updates will continue to be shared here for tracking.
Feb 27, 2026 - 22:05 UTC
Monitoring - We have implemented a mitigation and are continuing to monitor and investigate this issue.
Feb 27, 2026 - 14:55 UTC
Update - We have begun rolling out mitigation steps to reduce write latency in the prod-us-central-0 and prod-us-central-5 regions. While these measures are expected to improve performance, we are continuing to investigate the underlying root cause of the issue. We will provide additional updates as more information becomes available.
Feb 26, 2026 - 16:23 UTC
Investigating - Since February 19, we have been investigating an intermittent issue causing increased write latency in the prod-us-central-0 and prod-us-central-5 regions. The issue does not affect all traffic but may result in delayed write operations for some customers. Our engineering team is actively working to identify the root cause and stabilize performance. We will share additional updates as progress is made.
Feb 25, 2026 - 19:54 UTC
Grafana Cloud: Hosted Grafana
Operational
AWS Australia - prod-ap-southeast-2
Operational
AWS Brazil - prod-sa-east-1
Operational
AWS Canada - prod-ca-east-0
Operational
AWS Germany - prod-eu-west-2
Operational
AWS Germany - prod-eu-west-4
Operational
AWS India - prod-ap-south-1
Operational
AWS Japan - prod-ap-northeast-0
Operational
AWS UAE - prod-me-central-1
Operational
AWS Singapore - prod-ap-southeast-1
Operational
AWS Sweden - prod-eu-north-0
Operational
AWS US East - prod-us-east-0
Operational
AWS US East - prod-us-east-2
Operational
AWS US West - prod-us-west-0
Operational
AWS Australia - prod-au-southeast-1
Operational
AWS UK - prod-gb-south-1
Operational
Azure Netherlands - prod-eu-west-3
Operational
Azure US Central - us-central2
Operational
GCP Australia - prod-au-southeast-0
Operational
GCP Belgium - prod-eu-west-0
Operational
GCP Brazil - prod-sa-east-0
Operational
GCP India - prod-ap-south-0
Operational
GCP Singapore - prod-ap-southeast-0
Operational
GCP UK - prod-gb-south-0
Operational
GCP US Central - prod-us-central-0
Operational
GCP US Central - prod-us-central-3
Operational
GCP US Central - prod-us-central-4
Operational
GCP US East - prod-us-east-1
Operational
play.grafana.org
Operational
AWS Ireland - prod-eu-west-6
Operational
Grafana Cloud: Graphite
Operational
AWS Australia - prod-ap-southeast-2: Querying
Operational
AWS Australia - prod-ap-southeast-2: Ingestion
Operational
AWS Brazil - prod-sa-east-1: Querying
Operational
AWS Brazil - prod-sa-east-1: Ingestion
Operational
We will be performing a restart of all Auth API databases in AWS as part of planned maintenance.
Because the Auth API is a dependency for all Grafana Cloud services, this maintenance has the potential to impact all Grafana Cloud environments within the regions being restarted. However, the only expected user-facing impact during each restart window is for customers attempting to manage Grafana Cloud access policies.
Each database restart is expected to take up to 15 minutes, and we will monitor the process closely to ensure stability before continuing with additional restarts. Posted on
Mar 03, 2026 - 20:32 UTC
Completed -
The scheduled maintenance has been completed.
Mar 3, 19:00 UTC
In progress -
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Mar 3, 17:00 UTC
Scheduled -
Upon successful completion of the first two regions restarted as part of our planned maintenance, we will proceed with restarting the remaining regions in AWS on March 3rd, 2026.
Because the Auth API is a dependency for all Grafana Cloud services, this maintenance has the potential to impact Grafana Cloud environments within the regions being restarted. However, the only expected user-facing impact during each restart window is for customers attempting to manage Grafana Cloud access policies.
Each database restart is expected to take up to 2 hours. As with the initial round, we will monitor each restart closely to ensure stability and service health before proceeding to the next region.
Feb 24, 16:01 UTC
Completed -
The scheduled maintenance has been completed.
Mar 2, 17:15 UTC
In progress -
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Mar 2, 17:00 UTC
Scheduled -
We will be performing a restart of all Auth API databases in AWS as part of planned maintenance. To minimize risk, we will begin by restarting two regions and once successful, proceed with the remaining databases on March 3rd, 2026.
Because the Auth API is a dependency for all Grafana Cloud services, this maintenance has the potential to impact all Grafana Cloud environments within the regions being restarted. However, the only expected user-facing impact during each restart window is for customers attempting to manage Grafana Cloud access policies.
Each database restart is expected to take up to 15 minutes, and we will monitor the process closely to ensure stability before continuing with additional restarts.
Feb 23, 18:49 UTC
Resolved -
This incident has been resolved.
Mar 2, 15:48 UTC
Update -
We are now experiencing write outage for logs in prod-eu-west-3. Our Engineering team is aware and currently investigating this. We will provide further updates accordingly.
Mar 2, 08:08 UTC
Investigating -
We are experiencing increased write latency for logs in prod-eu-west-3. Our Engineering team is aware and currently investigating this. We will provide further updates accordingly.
Mar 2, 07:37 UTC
Resolved -
This incident has been resolved.
Feb 27, 23:38 UTC
Identified -
Our team has identified the issue and is in the process of testing a fix.
Feb 27, 19:27 UTC
Investigating -
We're currently working on an issue where portions of data may be temporarily unretrievable, affecting a small percentage of tenants in all Tempo clusters.
Feb 27, 13:46 UTC
Resolved -
A recent rollout caused the AuthZ (RBAC) service to perform many redundant folder-tree fetches for each authorization check. For a small number of tenants in the prod-us-east-0 and prod-eu-west-2 regions with very large folder trees, this added a few milliseconds to every check, which increased request latency.
The approximate timeframe of the impact is:
2026-02-26 17:24:43 UTC to 2026-02-27 14:33:53 UTC.
Resolved -
This incident has been resolved.
Feb 27, 02:49 UTC
Update -
Uploads should now work without issue. However, listing may still result in occasional timeouts; we are actively addressing this problem.
Feb 26, 14:43 UTC
Identified -
We're experiencing an issue in all Grafana Cloud regions that manifests as slowness when uploading and listing sourcemaps. The issue most significantly affects users with large sourcemap files.
We've identified the issue and our team is currently working on a fix.
Feb 26, 13:00 UTC
Resolved -
This incident has been resolved.
Feb 25, 19:51 UTC
Monitoring -
A fix has been implemented, and we are observing recovery across all impacted regions. We will continue to monitor progress.
Feb 25, 18:46 UTC
Identified -
The issue has been identified, and we are in the process of rolling out a fix.
Feb 25, 18:31 UTC
Update -
While we work on narrowing down the scope, we can confirm that deployments in the prod-us-east-0 region are impacted.
Feb 25, 17:49 UTC
Investigating -
Some users may be experiencing issues loading dashboard and alert folders in Hosted Grafana. We will provide more information as it becomes available to us.
Feb 25, 17:44 UTC
Resolved -
This incident has been resolved.
Feb 25, 17:20 UTC
Monitoring -
A fix has been implemented and we are monitoring the results.
Feb 25, 15:55 UTC
Investigating -
We are currently investigating an issue causing a partial write and rule-evaluation outage in the specified region. We will continue to provide updates as they become available.
Feb 25, 15:05 UTC
Resolved -
This incident has been resolved.
Feb 25, 15:05 UTC
Monitoring -
The fix has been deployed to all affected existing tenants, and newly created tenants will not encounter the issue. We are continuing to monitor, but the incident should now be resolved.
Feb 25, 12:53 UTC
Identified -
We identified an issue where an incorrect URL endpoint was shown for traces ingestion in the prod-eu-west-6 region (AWS Ireland). Using the displayed URL will result in traces failing to ingest; ingestion via AWS PrivateLink is unaffected.
The issue affects all tenants in this region, and our team is in the process of deploying a fix.
Feb 25, 12:41 UTC
Resolved -
This incident has been resolved.
Feb 24, 17:09 UTC
Monitoring -
A fix has been implemented, and we are monitoring results.
Feb 24, 16:27 UTC
Investigating -
We are currently investigating an issue impacting a subset of users in the prod-us-east-0 region. Impacted customers will receive a "failed to execute query" error when evaluating alert rules.
Feb 24, 14:31 UTC
Completed -
The scheduled maintenance has been completed.
Feb 20, 14:17 UTC
In progress -
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Feb 20, 13:30 UTC
Scheduled -
Alert instances for the Synthetic Monitoring ProbeFailedExecutionsTooHigh provisioned alert rule that are firing during the maintenance might resolve and fire again in the next evaluation.
Feb 20, 10:34 UTC
Completed -
The scheduled maintenance has been completed.
Feb 19, 13:34 UTC
In progress -
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Feb 19, 13:00 UTC
Scheduled -
Possible user impact: Alert instances for the Synthetic Monitoring ProbeFailedExecutionsTooHigh provisioned alert rule that are firing during the maintenance might resolve and fire again in the next evaluation.
Feb 19, 09:46 UTC
Resolved -
We experienced an issue impacting a cell within the Azure prod-us-central-7 region between 14:26 and 14:36 UTC. Affected users may have noticed increased errors with rule evaluations, as well as some read/write errors. We have resolved this issue and will continue to monitor.
Feb 18, 14:00 UTC
Completed -
The scheduled maintenance has been completed.
Feb 18, 13:33 UTC
In progress -
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Feb 18, 13:00 UTC
Scheduled -
Alert instances for the Synthetic Monitoring ProbeFailedExecutionsTooHigh provisioned alert rule that are firing during the maintenance might resolve and fire again in the next evaluation. Only the API is affected.
Estimated time window is 13:00–14:00 UTC
Impacted clusters are:
prod-me-central-1, prod-us-east-1, prod-ap-northeast-0, prod-gb-south-0, prod-us-east-3, prod-eu-central-0, prod-ap-south-1, prod-sa-east-1
Feb 18, 10:04 UTC
Resolved -
This incident has been resolved.
Feb 17, 16:27 UTC
Monitoring -
Alert instances for the Synthetic Monitoring ProbeFailedExecutionsTooHigh provisioned alert rule that are firing during this maintenance might resolve and fire again in the next evaluation. Only the API is affected.
Estimated time window is 15:00–16:00 UTC
Impacted clusters are:
prod-eu-west-5, prod-us-east-4, prod-eu-west-6, prod-sa-east-0, prod-ap-south-0, prod-ap-southeast-0, prod-me-central-0, prod-au-southeast-0, prod-ap-southeast-2
Feb 17, 14:53 UTC
Resolved -
There was a service degradation today from ~12:09 UTC until ~12:35 UTC on the Calgary public probe for Synthetic Monitoring. Impact may include failed Synthetic Monitoring checks that used this probe.
Feb 17, 12:47 UTC