Incident History

April 2025

Graphite proxy ingestion failing - prod-ap-northeast-0
This incident has been resolved.
Apr 15, 08:58 - 09:27 UTC
AWS Firehose Logs Down in Prod-US-East-2
This incident has been resolved.
Apr 11, 18:45 - 19:55 UTC
CloudWatch datasource query issues on cluster GCP US central
We continue to observe a period of recovery. At this time, we are considering this issue resolved.
Apr 8, 14:52 - Apr 11, 08:03 UTC

March 2025

Hosts unreachable for instances in prod-sa-east-1
From 6:23 UTC to 6:39 UTC, there was an issue with instances in the prod-sa-east-1 cluster. Affected users may have seen requests fail for instances in this cluster. The cluster has stabilised, and customers should no longer experience this issue.
Mar 28, 07:59 - 07:59 UTC
Not possible to create stacks on prod-us-west-0
The stack creation process is now functioning properly in the prod-us-west-0 region.
Mar 26, 13:39 - 14:03 UTC
Some Grafana Instances Taking Longer to Initialize
This incident has been resolved.
Mar 17, 18:49 - Mar 20, 13:33 UTC

February 2025

Degraded performance due to overloaded internal queues
Our internal queues were affected by a scheduled data migration. For around three hours, asynchronous scheduled tasks were affected and their processing was delayed, but none were cancelled or lost. Scheduled tests in particular may not have run at their intended time during that period.
Feb 28, 13:09 - 13:09 UTC
Degraded UX with IRM Slack Integration
This incident has been resolved.
Feb 26, 18:13 - Feb 27, 12:55 UTC
New instance creation taking longer than expected
This incident has been resolved.
Feb 25, 22:27 - Feb 26, 06:52 UTC