Incident Report: Impact Dashboard and Impact Messages are unavailable.
Timeline:
Summary:
Between 3:38 PM and 10:52 PM UTC on March 6, 2025, a significant portion of users experienced slowness and unresponsiveness across the Impact platform. The cause was a code change intended to enhance sub-account functionality, which generated a much greater database load than anticipated and overloaded the US-East database clusters. The resulting disruption was widespread and prevented affected users from accessing and using the platform.
Remediation:
Initial mitigation involved scaling down the deployment platform to limit requests to the RDS clusters. This proved insufficient, so traffic was ultimately diverted away from the affected cluster, which allowed the deployment platform to launch new instances running the rolled-back version.
We sincerely apologize for the inconvenience this incident caused. We appreciate your understanding and are committed to continuously improving our platform's reliability and performance.