Post-Mortem of YouTrack InCloud Availability Issue
As you might have noticed, on Wednesday, February 20, YouTrack InCloud had an availability problem. It caused one period of downtime at around 19:00 GMT and another at around 23:00 GMT.
What was the problem
We use the Nginx server, which is designed for high concurrency, high performance, and low memory usage. On February 20, Nginx was under unusually high load. This caused an out-of-memory error, and we decided to restart Nginx on another server. In the end, this resulted in 30 minutes of downtime, which was honestly longer than we expected.
What we’ve already done to solve this issue
We restarted Nginx on another, more powerful server. We also resolved the technical issues that made the downtime during the move longer than expected. All YouTrack InCloud instances should have been working normally since 23:30 GMT.
Please contact our InCloud support if you still experience a problem with your YouTrack instance.
What we plan to do to prevent further issues
The current solution should prevent similar problems for a long time. Nevertheless, we plan to configure a second Nginx server and use Elastic Load Balancing, which automatically distributes incoming application traffic across multiple Amazon EC2 instances.
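For readers curious about what distributing traffic across two servers means in practice, here is a minimal Python sketch of the general round-robin idea. It is only an illustration under our own assumptions: the class `RoundRobinBalancer` and the backend names `nginx-1` and `nginx-2` are hypothetical, and the sketch does not describe how Elastic Load Balancing is implemented internally or how our production setup is configured.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer over a fixed list of backends (illustrative only)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._healthy = set(self._backends)
        # Endless iterator that walks the backends in order, over and over.
        self._ring = cycle(self._backends)

    def mark_down(self, backend):
        """Stop sending requests to a backend, e.g. after a failed health check."""
        self._healthy.discard(backend)

    def mark_up(self, backend):
        """Resume sending requests to a known backend."""
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, skipping any that are marked down."""
        # Look at each backend at most once per call.
        for _ in range(len(self._backends)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backend available")


if __name__ == "__main__":
    # Hypothetical backend names, used only for this example.
    balancer = RoundRobinBalancer(["nginx-1", "nginx-2"])
    print([balancer.pick() for _ in range(4)])
    balancer.mark_down("nginx-1")
    print([balancer.pick() for _ in range(2)])
```

The point of the sketch is that with two servers behind a balancer, one server running out of memory no longer takes every instance offline: requests simply go to the server that is still healthy.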
We apologize for the inconvenience you might have experienced due to our system downtime. We did our best to solve the problem as quickly as possible. This issue taught us a lesson on how to achieve greater YouTrack InCloud fault tolerance under higher loads.
JetBrains YouTrack Team