YouTrack InCloud Unavailability Post-Mortem

Valerie Andrianova

Last Saturday, July 11th, some of our InCloud instances became unavailable. The problem was resolved for all instances by Monday morning, July 13th. Please accept our deepest apologies for the inconvenience you may have experienced during this downtime.

What happened?

On Saturday evening, Amazon restarted the server that hosts one of our Nginx instances. The Nginx restart took about half an hour. This unplanned restart, together with a general infrastructure problem, caused a massive backend restart. The auto-recovery mechanism in our InCloud management system couldn’t handle it. As a result, a number of InCloud instances failed to start automatically, and we had to restart them manually.

Solution

The problem in our InCloud management system has been identified and fixed. The fix will prevent such problems in the future. Please accept our apologies once again.
