Post-Mortem of YouTrack InCloud Availability Issue

As you might have noticed, on Wednesday, February 20, there was a problem with YouTrack InCloud availability. It caused one period of downtime at around 19:00 GMT and another at around 23:00 GMT.

What was the problem

We use Nginx, a web server focused on high concurrency, high performance, and low memory usage. On February 20, Nginx was under unusually high load. This caused an out-of-memory error, and we decided to restart Nginx on another server. In the end, we had 30 minutes of downtime, which was, honestly, longer than we expected.

What we’ve already done to solve this issue

Nginx was restarted on another, more powerful server. The technical issues that made the downtime during the move longer than expected have been resolved. All YouTrack InCloud instances should have been working fine since 23:30 GMT.

Please contact our InCloud support if there is still a problem with your YouTrack instance.

What we are going to do to prevent further issues

The current solution should prevent similar problems for a long time. Nevertheless, we plan to configure a second Nginx server and use Elastic Load Balancing, which automatically distributes incoming application traffic across multiple Amazon EC2 instances.
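
For readers curious about what spreading traffic across two servers means in practice, here is a minimal Python sketch of round-robin distribution with a simple health flag. It is only an illustration of the general idea, not how Elastic Load Balancing is implemented and not our actual configuration; the class name and the server names "nginx-1" and "nginx-2" are made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy balancer: hands out healthy backends in rotation (illustration only)."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._rotation = cycle(self.backends)

    def mark_down(self, backend):
        # Stop routing requests to a backend that failed a health check.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Resume routing to a known backend once it has recovered.
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Look at each backend at most once per call, skipping unhealthy ones.
        for _ in range(len(self.backends)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


# Hypothetical server names, used only for this example.
balancer = RoundRobinBalancer(["nginx-1", "nginx-2"])
first_four = [balancer.pick() for _ in range(4)]

# Simulate one server running out of memory and being taken out of rotation.
balancer.mark_down("nginx-1")
after_failure = [balancer.pick() for _ in range(3)]
```

With both servers healthy, requests alternate between them. Once one is marked down, all requests go to the remaining server instead of failing, which is the kind of fault tolerance a second server behind a load balancer is meant to provide.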

We apologize for the inconvenience you might have experienced due to our system downtime. We did our best to solve the problem as soon as possible. This issue taught us a lesson about how to achieve greater YouTrack InCloud fault tolerance under higher loads.

Best regards,
JetBrains YouTrack Team
