Connectivity Issues across accounts and devices

Incident Report for monday.com

Postmortem

Report on a Resolved System Downtime on Jan 29, 2020

This past Wednesday, we experienced a severe system outage. The monday.com platform was down for approximately two hours and twenty minutes (Jan 29, 2020, from 17:42 UTC to 20:03 UTC). During the downtime, you couldn’t access your monday.com account while we worked relentlessly to resolve the issue.

We realize that you expect monday.com to always be available for your business, and we sincerely apologize for the inconvenience this incident caused you and your team.

Your trust in us is our top priority and we believe you deserve full transparency as to our findings. Therefore, we take this opportunity to provide you with the details regarding this incident, as we currently understand them. Our investigation is ongoing, and we will provide updated information in the future.

What Happened?

One of our main databases encountered an unusual error that degraded its performance and, in turn, prevented consistent platform availability. The automatic switch to a different database (located in a different geographical availability zone) didn’t function as expected. As a result, we had to switch to a different instance of the database and validate its stability, a process which took time.
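
For readers who want a concrete picture of the switch described above, here is a minimal illustrative sketch. It is not our actual implementation: the names `Database` and `choose_database`, the zone labels, and the health probes are all hypothetical, and the health checks are simulated with fixed values.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Database:
    """Hypothetical database endpoint with a health probe (assumption)."""
    name: str
    zone: str
    is_healthy: Callable[[], bool]


def choose_database(primary: Database,
                    standbys: Sequence[Database]) -> Optional[Database]:
    """Return the primary if healthy, else the first healthy standby, else None."""
    if primary.is_healthy():
        return primary
    for candidate in standbys:
        if candidate.is_healthy():
            return candidate
    return None


# Simulated scenario: the primary is degraded, the automatic standby in
# another zone does not come up, and a further instance is healthy.
primary = Database("primary", "zone-a", lambda: False)
auto_standby = Database("auto-standby", "zone-b", lambda: False)
other_instance = Database("other-instance", "zone-c", lambda: True)

chosen = choose_database(primary, [auto_standby, other_instance])
```

In this simulated scenario `chosen` is the third instance. A real switch would also involve validating the new instance's stability before serving traffic, which the sketch leaves out.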

What’s Next?

We’re investigating the root cause of this failure together with our third-party cloud computing service provider so that we can fully understand it and take preventative measures to avoid a recurrence. In addition, our team has already begun an extensive internal retrospective to learn from this incident and to ensure, to the best of our ability, that it will not happen again.

We are fully committed to continuously communicating our findings to you. Please follow this incident summary link to find all details about the incident and our handling process, which will be updated as our investigation progresses.

Thank you.

Posted Feb 02, 2020 - 11:10 UTC

Resolved

monday.com is now fully operational. You should be able to return to your regular workflow after refreshing the page. Thank you again for your patience and cooperation. We appreciate it!

Posted Jan 29, 2020 - 20:32 UTC

Monitoring

The platform should be back up upon refreshing your browser! We're continuing to monitor performance and stability to ensure everything is working smoothly!

Posted Jan 29, 2020 - 20:05 UTC

Identified

We're making progress: our R&D team has identified the issue and is now working on restoring consistent access to your monday.com account. Your patience is highly appreciated. We're working hard to get you back to your regular workflow as soon as possible!

Posted Jan 29, 2020 - 19:57 UTC

Update

It's all hands on deck for us here as we keep investigating connectivity issues for everyone. We'll keep you posted as our efforts progress. We want to thank you for remaining patient so far and apologise for the inconvenience!

Posted Jan 29, 2020 - 19:35 UTC

Update

We're still hard at work trying to resolve connectivity issues for everyone. Further updates will be provided as soon as we know more!

Posted Jan 29, 2020 - 19:05 UTC

Update

We're still in the process of resolving connectivity issues for everyone. Further updates will be provided as soon as we know more, and we hope to have some good news for you soon!

Posted Jan 29, 2020 - 18:34 UTC

Investigating

We are currently investigating this issue.

Posted Jan 29, 2020 - 18:08 UTC

Monitoring

A fix has been implemented and we are monitoring the results.

Posted Jan 29, 2020 - 18:04 UTC

Update

We are continuing to investigate this issue.

Posted Jan 29, 2020 - 18:02 UTC

Investigating

At the moment, we're experiencing connectivity issues for some accounts across all devices. Our development team is on the case and diligently working to resolve this matter as quickly as possible.

Posted Jan 29, 2020 - 17:51 UTC

This incident affected: US (Platform, Dashboards, Login / SSO, Notifications, Search, Automations, Integrations, API).