What happened:
On June 12th at 8:26 AM EST, we experienced a service disruption in our EU region that resulted in 68 minutes of degraded performance and 20 minutes of full downtime for several features, including API access, automations, notifications, and communications.
What caused it:
A backend infrastructure issue caused unexpected pressure on our caching systems, leading to widespread delays and partial outages across key services.
How we responded:
Our engineering teams quickly identified the root cause, rolled back the triggering changes, and scaled our systems to restore stability. Full functionality was gradually restored within a few hours.
What we're doing to prevent recurrence:
We’re implementing additional safeguards to detect anomalies earlier, improving system scalability, and strengthening internal processes to prevent similar incidents in the future.
No data loss occurred during this incident.
We sincerely apologize for any disruption this caused to your workflow.
Thank you for your understanding as we work to continuously improve our platform's reliability.
Your team at monday.com