Date of Incident: August 12, 2025
Overview
On August 12, customers in our CA-CENTRAL-1 region experienced degraded service. The root cause was a significant power outage in one Availability Zone of our cloud provider, Amazon Web Services (AWS). The outage affected the underlying infrastructure that supports our SaaS platform, including EC2 instances and EBS volumes. Our engineering team engaged immediately to monitor the situation and mitigate customer impact where possible. Service was fully restored once our provider resolved the power issue and all systems recovered.
Timeline of Events (MDT)
- 3:58 MDT: Automated monitoring detected service impact for customers in the CA-CENTRAL-1 region, and our on-call engineering team was engaged.
- 5:21 MDT: Our cloud provider reported that power had been restored to the affected data center. We observed initial signs of recovery across our platform.
- 5:55 MDT: All systems were confirmed to be fully operational, and normal service levels were restored for all impacted customers.
Resolution
The incident was fully resolved at 5:55 MDT. The total duration of the impact, from detection to full resolution, was 1 hour and 57 minutes.
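For readers who want to check the figure, the duration can be recomputed from the timeline entries above. This is a minimal sketch, not part of our tooling: the helper name `elapsed` is hypothetical, and because the report does not state AM or PM, the timestamps are assumed to fall on the same day and in the same half of the day, which does not change the differences.

```python
from datetime import datetime

# Timestamps copied from the timeline (MDT, August 12, 2025).
# Assumption: all three fall in the same half of the day; the report
# does not say AM or PM, and the differences are the same either way.
FMT = "%Y-%m-%d %H:%M"
detected = datetime.strptime("2025-08-12 03:58", FMT)
power_restored = datetime.strptime("2025-08-12 05:21", FMT)
resolved = datetime.strptime("2025-08-12 05:55", FMT)


def elapsed(start, end):
    """Return the time between two timestamps as (hours, minutes)."""
    total_minutes = int((end - start).total_seconds() // 60)
    return divmod(total_minutes, 60)


total_impact = elapsed(detected, resolved)              # detection to full resolution
until_power = elapsed(detected, power_restored)         # detection to power restored
recovery_tail = elapsed(power_restored, resolved)       # power restored to full resolution

print(f"Total impact: {total_impact[0]} h {total_impact[1]} min")
print(f"Detection to power restored: {until_power[0]} h {until_power[1]} min")
print(f"Power restored to full recovery: {recovery_tail[0]} h {recovery_tail[1]} min")
```

The split shows how much of the impact window came before the provider restored power and how much was our own recovery afterward.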
Next Steps
To improve resilience against future upstream provider issues, we are taking the following actions:
- Review Regional Redundancy: We will conduct a thorough review of our redundancy and failover procedures to reduce the potential impact of provider outages that affect a single Availability Zone or a single region.
- Enhance Provider Collaboration: We will continue to collaborate with our cloud provider to improve strategies for early detection, transparent communication, and faster mitigation of infrastructure-level incidents.