On Wednesday, November 6th, many Squarespace websites were unavailable for 102 minutes, between 14:13 and 15:55 ET. Site visitors saw slow page loads or “Service Unavailable” (HTTP status 503) errors.
The incident was caused by an upgrade to our production database software the previous day. This change had been deployed to our internal testing environments and had been performing as expected for several days. Given this, we deployed the upgrade to our main application cluster on November 5th. The next day, just after 14:00 ET, we experienced a cascading failure of all hosts in our main application database cluster. This effectively disabled our main application: we were still able to serve some traffic from our caches, but no new requests could be completed.
We began rolling back the upgrade just after 15:00 ET. This allowed us to bring the main application database back online, at which point our service recovered.
We’re still investigating why the upgrade caused this behavior. We intend to reproduce the behavior in our lab and work with our vendor to ensure the bug is patched before we upgrade again.
We deeply apologize for this incident. It is of the utmost importance to us that Squarespace sites be up and available. Thank you for your patience.