You know how it feels. After you release a new version, a service starts behaving unexpectedly, and it’s up to you to save the day. But where do you start? Criteo processes 150 billion requests per day, across more than 4,000 front-end servers. As members of the Criteo Performance team, we investigate critical issues in this kind of environment.
In this talk, you will follow our insights, mistakes, and false leads during a real-world case. We will cover every phase of the investigation, from early detection to the actual fix, and we will detail our tricks and tools along the way.
Resources shared by Christophe and Kevin:
- Blog series about performance troubleshooting
- Tools used in the webinar:
  - Grafana (dashboard) + blog series on how to capture ETW events
  - dotTrace Timeline (profiling)
  - ILSpy / dotPeek (.NET decompiler)
  - WinDBG (hardcore debugging tool)
  - ClrMD and DynaMD (NuGet packages for analyzing memory dumps)
  - ClrMD extension for WinDBG (scripting in WinDBG)
  - SysInternals tools (process dump)
Subscribe to our community newsletter to receive notifications about future webinars.