Report on a Resolved System Downtime on Jan 29, 2020
This past Wednesday, we experienced a severe system outage. The monday.com platform was down for approximately two hours and twenty minutes (Jan 29, 2020, from 17:42 UTC to 20:03 UTC). During the downtime, you couldn’t access your monday.com account while we worked relentlessly to resolve the issue.
We realize that you expect monday.com to always be available for your business, and we sincerely apologize for the inconvenience that this incident caused you and your team.
Your trust in us is our top priority, and we believe you deserve full transparency about our findings. We are therefore taking this opportunity to share the details of this incident as we currently understand them. Our investigation is ongoing, and we will provide updated information as it progresses.
What Happened?
One of our main databases encountered an unusual error that degraded its performance and prevented the platform from being consistently available. The automatic switch to a different database (located in a different geographical availability zone) didn’t function as expected. As a result, we had to switch to a different instance of the database and validate its stability, a process that took time.
What’s Next?
We’re investigating the root cause of this failure together with our third-party cloud computing service provider so that we can fully understand it and take preventative measures to avoid a recurrence. In addition, our team has already begun an extensive internal retrospective to learn from this incident and to ensure, to the best of our ability, that it will not happen again.
We are fully committed to continuously communicating our findings to you. Please follow this incident summary link to find all details about the incident and our handling process, which will be updated as our investigation progresses.
Thank you.