.NET Tools

Your AI Agent Keeps Missing The Real Bottleneck. JetBrains Rider Can Fix It Now.

Sasha Ivanova

Here’s a case worth pondering: your app freezes for ten seconds, and you ask an AI agent what’s wrong. What does it actually do? For a long time the honest answer was: it rummages through your code and takes a wild guess.

A snapshot taken by a profiler tool is runtime evidence. It knows exactly where the CPU went. But an agent with no access to profiling can’t read it. So it does the only thing it can: scans the project, finds some plausible-looking inefficiencies, and confidently presents them as the bottleneck. Sometimes it gets lucky. On a real freeze, it usually doesn’t.

We’ve been building something to fix that in Rider: a dotTrace-backed profiling skill for the agents inside AI Assistant, called dottrace-analyze. The idea is very straightforward. You hand the agent a .dtp snapshot you already captured with dotTrace – using the standalone profiler, the command-line tool, or dotTrace inside Rider – and instead of wandering through your source code, it reads the profile first. It finds where the time actually went, follows the hot path back into your code, and explains what’s slow and why, with recommendations for what to look at next.

We ran the evals. To keep the scoring from becoming a personal judgment call, each answer was evaluated against a reference root cause with a fixed LLM-as-judge rubric: did the agent identify the primary hotspot, explain the mechanism, avoid misleading detours, and propose a fix that followed from the evidence? The results were even better than we expected, so let’s start with the most dramatic one.

Case study: a UI freeze the agent couldn’t find without dotTrace

One of our test scenarios was an earlier version of Avalonia that used to hang when being shut down. This issue even crept into Rider itself two years ago, which illustrates how easily performance degradations in popular open-source projects can permeate your applications.

To clarify our methodology: we intentionally tested a version of Avalonia from before the fix in AvaloniaUI/Avalonia#16633. It was crucial for us to use a known, since-resolved bug because it gave us a clean reference answer for the eval.

We ran the same agent against the issue ten times with the skill and ten times without, and had an LLM judge score each diagnosis against the known root cause on a scale of 0 to 10.

Without the skill, the agent averaged 1.6 out of 10. It went looking through rendering code, listed some general suggestions, and never landed on the real problem.

With the skill, it scored 10 out of 10 on every single one of the ten runs.

Avalonia long-line freeze: the skill turns a miss into a perfect diagnosis

Ten runs with the same agent, judged against the known root cause.

Without profiling skill: 1.6 / 10 (broad rendering guesses)

With dotTrace skill: 10 / 10 (perfect in all 10 runs)

Hot path: UI thread layout into text formatting and Unicode BiDi processing.

CPU clue: About 88% of running CPU sat in slice-indexer access beneath the BiDi rule pass.

Fix shape: Materialize the run into a contiguous Span<T> before the rule loops.

This is the difference between “the code looks suspicious over here” and “the snapshot says the freeze is here.”

The results weren’t “somewhat better”: the agent went from reliably lost to reliably correct, with no variance.

And what it found is the kind of thing that’s genuinely hard to spot by reading code (even if you have a couple of hours to spare). The freeze wasn’t in any one obvious place. It came from a single character-by-character operation deep in the text layout path, cheap on its own but run so many times that it swallowed most of the CPU. That cost only shows up when you can see where the time actually went. The agent followed the snapshot straight to it, explained why it was so expensive, and pointed at the change that would fix it.
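To make that mechanism concrete, here is a minimal sketch in Python of why a cheap per-element indexer becomes expensive when several loops go through it, and why copying the run once helps. It is only an illustration of the idea behind the "materialize into a contiguous Span<T>" fix, not Avalonia's code: `SliceView`, `run_rule_passes`, and the pass count are hypothetical names and numbers invented for this example.

```python
class SliceView:
    """Hypothetical stand-in for a slice type whose indexer does extra work."""

    def __init__(self, backing, start, length):
        self._backing = backing
        self._start = start
        self._length = length
        self.indexer_calls = 0  # instrumentation for this sketch only

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        # Every access pays for a bounds check and an offset calculation.
        self.indexer_calls += 1
        if not 0 <= index < self._length:
            raise IndexError(index)
        return self._backing[self._start + index]

    def materialize(self):
        """Copy the run once into a plain contiguous list."""
        return [self[i] for i in range(self._length)]


def run_rule_passes(run, passes):
    """Stand-in for several rule loops that each touch every element."""
    checksum = 0
    for _ in range(passes):
        for i in range(len(run)):
            checksum += run[i]
    return checksum


backing = list(range(10_000))
PASSES = 8  # hypothetical number of rule loops

# Every pass goes through the slice indexer: PASSES * length indexer calls.
slow_view = SliceView(backing, 100, 5_000)
slow_result = run_rule_passes(slow_view, PASSES)

# Materialize once, then loop over the plain copy: length indexer calls in total.
fast_view = SliceView(backing, 100, 5_000)
fast_result = run_rule_passes(fast_view.materialize(), PASSES)

print(slow_view.indexer_calls, fast_view.indexer_calls)
```

Both variants compute the same result. The difference is how often the indexer runs, and that kind of cost is invisible in a code review but obvious in a profile.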

The full benchmark: eight scenarios, 80 runs

That Avalonia scenario was only one slice of our evaluation. After checking that case across repeated runs, we widened the batch to eight .NET performance-investigation scenarios from different projects and compared the same agent with and without profiler access. Here’s how the skill changed the average accuracy score on each scenario, out of 10:

Batch results: more useful diagnoses, more perfect root-cause matches

Across 80 runs, the skill improves both average quality and consistency.

Average accuracy score: 8.15 (baseline: 4.71)

Runs scoring 8+: 59 / 80 (baseline: 29 / 80)

Perfect root-cause matches: 48 / 80 (baseline: 20 / 80)

Scenario	Without skill	With skill	Change
avalonia-long-line	1.6	10.0	+8.4
avalonia-styles	1.6	9.9	+8.3
cyclops	2.4	8.8	+6.4
eShopOnWeb	1.0	5.2	+4.2
stock_nemo	3.0	4.0	+1.0
game-of-life	10.0	10.0	0.0
checkers-copy	9.3	9.1	-0.2
checkers-update	8.8	8.2	-0.6

The biggest wins appear where the answer genuinely lives in the runtime evidence. The flat or slightly negative cases are useful too: they show where the product should avoid invoking a heavier profiler workflow.

Across all 80 runs, the average accuracy score went from 4.71 to 8.15. The number of runs scoring 8 or higher roughly doubled, from 29 to 59. Runs that nailed the root cause exactly (a perfect 10) more than doubled, from 20 to 48.
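If you want to check how the headline averages relate to the table, the sketch below recomputes them from the per-scenario scores. It assumes every scenario contributed the same number of runs (80 runs over eight scenarios), so the batch average is the plain mean of the eight rows. The variable names are ours, for illustration only.

```python
# Per-scenario mean scores copied from the table: (without skill, with skill).
scores = {
    "avalonia-long-line": (1.6, 10.0),
    "avalonia-styles": (1.6, 9.9),
    "cyclops": (2.4, 8.8),
    "eShopOnWeb": (1.0, 5.2),
    "stock_nemo": (3.0, 4.0),
    "game-of-life": (10.0, 10.0),
    "checkers-copy": (9.3, 9.1),
    "checkers-update": (8.8, 8.2),
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Assumption: equal run counts per scenario, so an unweighted mean is valid.
baseline_avg = mean(without for without, _ in scores.values())
skill_avg = mean(with_skill for _, with_skill in scores.values())

# Per-scenario change, rounded to one decimal like the table.
changes = {name: round(w - wo, 1) for name, (wo, w) in scores.items()}

print(f"baseline {baseline_avg:.2f}, with skill {skill_avg:.2f}")
for name, delta in changes.items():
    print(f"{name:20s} {delta:+.1f}")
```

Under that assumption the means come out at 4.71 and 8.15, matching the batch figures, and the per-scenario differences match the Change column.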

Two things in that table we want to be honest about.

The first is where the skill earns its keep: the scenarios where the baseline was hopeless. Anywhere the answer genuinely lived in the runtime (the Avalonia freezes, the Cyclops workload), the baseline averaged between 1.6 and 2.4 and the skill pulled it up to between 8.8 and 10. The agent stopped spreading its attention across general optimization ideas and stayed anchored to the thing that was actually slow.

The second is the bottom of the table, and we’re leaving it in on purpose. game-of-life, checkers-copy, and checkers-update were already handled well without any profiler tooling, and on the two checkers cases the skill nudged the score down by a few tenths. The lesson isn’t that the skill hurts the results. It’s that some tasks don’t need it. Sometimes, invoking a full profiler workflow when a quick look at the code would do is just wasted tokens.

What this actually costs

We tracked cost as carefully as accuracy, because a skill that produces better answers at an unreasonable price isn’t one we’d ship. The straightforward part first: reading a snapshot is real work. The agent loads the profiler data, walks the call trees, and connects the evidence back to your source before it starts reasoning. In the 80-run batch above, that showed up in the bill. Cost went from about USD 1.91 per run without the skill to about USD 2.61 with it. Total batch cost was about USD 153 without the skill and USD 209 with it. Given how much the diagnoses improved, we think that’s a good trade, but it is a real increase, and we’d rather you hear it from us.

There’s a second effect, though, and it runs the other way. In an additional Avalonia test case, an app that was slow to shut down, the agent without the skill never found the real cause. Across ten runs it kept building the same plausible but wrong theory, searching broadly and reading file after file along the way, and scored 0 out of 10 every time. With the skill, it measured first, followed the profiler straight to the responsible code path, and scored 10 out of 10. Skipping all that wandering also made the runs cheaper and faster: USD 2.58 per run instead of USD 3.74, and 206 seconds instead of 373.

So the fair summary is that the skill changes where the money goes. It spends more on reading evidence and less on exploring dead ends. Sometimes that nets out more expensive, sometimes cheaper, but in both cases you’re paying for an answer grounded in what your application actually did, and that’s the part we think is worth it.
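The trade-off is easier to compare as relative numbers. The sketch below derives them from the figures quoted in this section. It adds no new measurements, and the dictionary layout is just one convenient way to hold the data.

```python
RUNS = 80  # size of the main eval batch

# USD per run in the 80-run batch.
batch = {"without": 1.91, "with": 2.61}

# Avalonia shutdown eval: (USD per run, average seconds per run).
shutdown = {"without": (3.74, 373), "with": (2.58, 206)}

# Batch totals, rounded to whole dollars as in the text.
batch_totals = {arm: round(cost * RUNS) for arm, cost in batch.items()}

# Extra cost of the skill in the batch, absolute and relative.
extra_per_run = round(batch["with"] - batch["without"], 2)
extra_pct = round(100 * extra_per_run / batch["without"])

# Savings in the shutdown eval, as a percentage of the no-skill arm.
cost_saving_pct = round(100 * (1 - shutdown["with"][0] / shutdown["without"][0]))
time_saving_pct = round(100 * (1 - shutdown["with"][1] / shutdown["without"][1]))

print(batch_totals, extra_per_run, extra_pct)
print(cost_saving_pct, time_saving_pct)
```

In the batch the skill adds USD 0.70 per run, roughly 37% more. In the shutdown eval it is about 31% cheaper and about 45% faster.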

Cost: more evidence, fewer wrong turns

Profiler analysis has a cost, but it can also reduce broad, unproductive code search.

Scope: 80-run eval batch (all scenarios above)

Without skill: USD 1.91 / run, USD 153 total

With skill: USD 2.61 / run, USD 209 total

About USD 0.70 more per run. Snapshot analysis increased run cost, while accuracy improved substantially.

Scope: Avalonia shutdown eval (10 runs per arm)

Without skill: USD 3.74 / run, 373 s average

With skill: USD 2.58 / run, 206 s average

Cheaper and faster here. Profiler evidence reduced wandering and sent the agent directly to the responsible code path.

The skill is not cheaper by default. It changes where the work happens: more evidence reading, less dead-end exploration.

What you’d see in Rider

The workflow is intentionally simple: capture a .dtp snapshot with dotTrace, whether from inside Rider, from dotTrace Standalone, or with the dotTrace command-line tool. Then ask your agent of choice in the AI Assistant tool window to investigate that snapshot by referencing its directory in the prompt.

Under the hood it loads the dottrace-analyze skill and uses the dotTrace SDK to read the profile; what comes back is a focused report:

a short summary of the dominant bottlenecks;
the methods, source locations, and call paths that own the runtime;
the root cause in plain developer language, not just a method name;
recommendations for what to look at next.
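To illustrate what "the call paths that own the runtime" means, here is a minimal sketch of own-time ranking over a call tree. It is not the dotTrace SDK and does not reflect how the skill is implemented. The `Node` type, the method names, and the timings are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical call-tree node: total time includes all callees."""
    name: str
    total_ms: float
    children: list = field(default_factory=list)

    @property
    def own_ms(self):
        # Own time is what remains after subtracting time spent in callees.
        return self.total_ms - sum(child.total_ms for child in self.children)


def hottest_path(root):
    """Return (own_ms, call path) for the node with the largest own time."""
    best = (root.own_ms, [root.name])
    stack = [(root, [root.name])]
    while stack:
        node, path = stack.pop()
        if node.own_ms > best[0]:
            best = (node.own_ms, path)
        for child in node.children:
            stack.append((child, path + [child.name]))
    return best


# Illustrative numbers only: most of the time sits in one deep leaf.
tree = Node("Main", 1000, [
    Node("Layout", 900, [
        Node("FormatText", 850, [
            Node("ApplyRules", 800, [
                Node("Slice.get_Item", 700),
            ]),
        ]),
    ]),
    Node("Other", 50),
])

own_ms, path = hottest_path(tree)
print(f"{own_ms:.0f} ms own time at {' > '.join(path)}")
```

The methods near the top have large total times but little own time. The report points at the frame where the time is actually spent, together with the path that leads there.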

For deeper investigations it can render that into a concise HTML report you can open in a tab and share with the team. Performance work is often collaborative, and a clean artifact is very handy when it’s more than one developer working on resolving the issue.

How the agent’s search trajectories changed

The strongest signal from the Avalonia shutdown eval is the trajectory: the agent’s actual path through tools and files. The no-skill runs did not fail because they were lazy or incoherent. They searched broadly, found a real-looking timer issue, and built a confident explanation around the wrong subsystem. The skill-backed runs started with measurement, so the search space collapsed around the hot path before source reading began.

Representative trajectory: broad search vs profiler-first path

Same shutdown task, same model, same codebase. The difference is the first useful piece of evidence.

Without skill: 19 calls

2 sub-agent searches, 13 reads, 2 globs, 2 greps. No measurement, only inference.

Explore shutdown paths: Fans out across dispose, shutdown, dispatcher, and timer code.
Read rendering and dispatcher files: Moves through MediaContext, render timers, application lifetime, and dispatcher loops.
Search for blocking patterns: Looks for waits, joins, sleeps, and render-thread patterns.
Anchors on Win32DispatcherImpl.cs: Spots Now - dueTime in UpdateTimer and treats it as the decisive clue.
Corroborates the timer theory: Reads dispatcher interfaces and queues to support a plausible but wrong diagnosis.

Wrong target: All 10 no-skill runs converged on the wrong area. The answer sounded grounded, but it never reached the teardown hotspot.

With dotTrace skill: 10 calls

1 skill call, 5 profiler calls, 1 grep, 1 glob, 2 reads. Measure first, then read.

Invoke dottrace-analyze: Enters the structured snapshot workflow.
Read snapshot and timeline: Finds a 7.3 s capture, one core pinned for about 5.5 s, and negligible GC.
Inspect running call tree: The largest own-time leaf is List<T>.Remove, reached via Classes.RemoveListener.
Jump to implicated source: Greps RemoveListener, reads Classes.cs, then locates SafeEnumerableList.cs.
Confirm the root cause: Drills down to recursive detach and identifies O(n^2) listener removal.

Right target: All 10 skill-backed runs named the exact root cause. Source reading confirmed measured evidence instead of inventing the suspect list.
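For readers who have not run into this pattern, the sketch below shows how removing listeners one at a time from a list-backed collection becomes quadratic. It is a simplified Python model, not Avalonia's code: the function names, the detach order, and the keyed alternative are assumptions made for illustration.

```python
def remove_listener(listeners, target):
    """Linear-scan removal from a list; returns how many elements were inspected."""
    for steps, listener in enumerate(listeners, start=1):
        if listener is target:
            del listeners[steps - 1]
            return steps
    raise ValueError("listener not registered")


def detach_one_by_one(n):
    """Detach n listeners individually and count the total scan work."""
    listeners = [object() for _ in range(n)]
    # Assumed order: the most recently added listener detaches first,
    # which is the worst case for a front-to-back scan.
    order = list(reversed(listeners))
    inspected = 0
    for target in order:
        inspected += remove_listener(listeners, target)
    return inspected


def detach_keyed(n):
    """Hypothetical alternative: keyed storage, one lookup per removal."""
    listeners = {key: object() for key in range(n)}
    lookups = 0
    for key in list(listeners):
        del listeners[key]
        lookups += 1
    return lookups


n = 1_000
print(detach_one_by_one(n), detach_keyed(n))
```

With 1,000 listeners the list-based teardown inspects n(n+1)/2 = 500,500 elements, while the keyed version performs 1,000 removals. Each individual removal looks harmless in the source, which is why the cost only shows up as a hot leaf in the call tree.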

In this shutdown eval, the skill-backed arm also finished faster and cheaper on average: 206 s and USD 2.58 per run, versus 373 s and USD 3.74 without the skill.

Evidence beats guessing

You can summarize the whole result in one contrast. Without the snapshot, the best an agent can offer is “here are some code smells that might be slow.” With it, the agent can say “this snapshot says 88% of your time went here, and here’s why.” That second sentence is the entire point.

Guessing vs knowing

A side-by-side view of how a code-only agent and a profiler-backed agent decide where to look first.

Code-only agent

1. Scans source broadly

2. Finds plausible slow-looking code

3. Ranks issues without runtime proof

4. May miss the actual bottleneck

Profiler-backed agent

1. Reads the snapshot first

2. Follows hot paths into source

3. Separates symptoms from cause

4. Recommends a fix grounded in evidence

Performance work has a harsh requirement that most coding tasks don’t: the answer has to match what the program actually did, not what it looks like it might do. The strong runs in our evals all had the same shape: they didn’t just name a method, they explained the call path, quantified the hot region, separated the root cause from its symptoms, and proposed a fix that followed directly from the profile. That’s the shape of an expert’s investigation, and it’s only possible when the agent starts from the same evidence an expert would.

dotTrace already knows what happened at runtime. AI Assistant can already turn evidence into an explanation and a next step. The skill is the bridge between the two, and on the cases that matter most, it’s the difference between an agent that guesses and one that knows.

Available now in Rider 2026.2 EAP 8

A note on licensing: During the Rider 2026.2 Early Access Program, you can try this workflow in the EAP build for free. For regular product licensing, the profiling part of this feature relies on dotTrace. The dotTrace and dotMemory plugin in Rider is available with dotUltimate or All Products Pack subscriptions; a Rider-only subscription does not include dotTrace profiling.

This is still an early, experimental implementation for the Early Access Program, and there’s still work for us to do, which is why your feedback is extremely important. Let us know how useful you find the reporting or where it might not be entirely reliable. Download the latest public build to try it out.

Download Rider EAP
