Jetbrains.com Availability Issue, Aug 13, 2020: Postmortem

Posted by Eugene Toporov

Last Thursday, the jetbrains.com website may have been unavailable to you for several minutes. We have analyzed the incident thoroughly so that we can prevent similar issues in the future. We apologize for any inconvenience this caused.

Incident Summary

On August 13, 2020, the United States mirror of www.jetbrains.com was partially unavailable between 17:43 and 17:52 UTC.

Affected: US mirror of www.jetbrains.com

Not affected:

  • IDE authorization
  • Resellers Store
  • JetBrains Account
  • EU mirror of all apps, including www.jetbrains.com

Incident Timeline

  • 17:43 UTC Applications lost their connection to the in-memory data store and failed the first health check.
  • 17:52 UTC The connection was restored, and applications returned to normal operation.

Root cause

The in-memory data structure store had no high-availability configuration, and some applications under www.jetbrains.com had no DNS-level availability health checks.

Action points

  • The health checks in AWS Route 53, the DNS provider for www.jetbrains.com, will be extended with additional checks for webpages and with the ability to switch traffic to healthy applications in another region.
  • The in-memory data structure store will be reconfigured to enable the multi-AZ and failover options.
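
To illustrate the idea behind the first action point, here is a minimal Python sketch of health-check-driven failover between two regions. It is a simplified model and not the Route 53 API or our actual configuration: the names RegionHealth and choose_region, the region labels "us" and "eu", and the failure threshold of 3 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RegionHealth:
    """Tracks consecutive failed health checks for one region (hypothetical model)."""
    name: str
    failure_threshold: int = 3  # assumed value, not taken from the postmortem
    consecutive_failures: int = 0

    def record(self, check_passed: bool) -> None:
        # A passing check resets the counter; a failing one increments it.
        if check_passed:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    @property
    def healthy(self) -> bool:
        # The region counts as healthy until it reaches the failure threshold.
        return self.consecutive_failures < self.failure_threshold


def choose_region(primary: RegionHealth, secondary: RegionHealth) -> str:
    """Return the name of the region that should receive traffic."""
    if primary.healthy:
        return primary.name
    if secondary.healthy:
        return secondary.name
    # Both regions look unhealthy: keep the primary rather than route nowhere.
    return primary.name


us = RegionHealth("us")
eu = RegionHealth("eu")

# Simulate three failed webpage checks in the US region while the EU region passes.
routing = []
for _ in range(3):
    us.record(False)
    eu.record(True)
    routing.append(choose_region(us, eu))

# Once a US check passes again, traffic returns to the US region.
us.record(True)
routing.append(choose_region(us, eu))

print(routing)
```

In this model, traffic stays on the primary region until the assumed threshold of consecutive failures is reached, then moves to the healthy secondary region, and moves back as soon as the primary passes a check again.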

Your JetBrains Team
The Drive to Develop