JetBrains Research

Research is crucial for progress and innovation, which is why at JetBrains we are passionate about both scientific and market research

The Impact and Achievements of AI4SE in 2025

Katie Fraser, Vladimir Kovalenko

AI is reshaping how software is built, tested, and taught. In AI for Software Engineering (AI4SE), that transformation is engineered on purpose. A research partnership with Delft University of Technology (TU Delft), AI4SE brings together leading labs, industry-grade tools, and five dedicated research tracks to turn advances in AI into practical gains for developers and learners worldwide.

In this post, we share last year’s achievements, broken down by track.

AI4SE overview

“This collaboration between the top IDE developer and a top European uni is super exciting.”

Ziyou Li

AI4SE was launched in late 2023. Five PhD students and their supervisors from TU Delft work together with JetBrains researchers, as well as BSc and MSc students, on topics exploring AI in software engineering. The topics fall within the following five research tracks:

Testing and Evaluating LLMs and SWE Agents

This track explores how autonomous AI agents, leveraging the reasoning capabilities of large language models (LLMs) and multi-agent orchestration, assist developers in coding, testing, and automating workflows. The research in this track seeks to maintain the safety, robustness, reliability, and long-term maintainability of LLM-powered agents.

Large Language Model Adaptation for Coding Tasks

This track’s goal is to adapt and personalize LLMs for different IDE users’ coding tasks, evaluating the LLMs’ emerging capabilities and ensuring that the outputs are timely, safe, and relevant. The research in this track aims to overcome issues such as those associated with massive training data from public domains and reliance on one model only.

Interactive and Aligned IDEs in the LLM Era

This track aims to build simple, non-disruptive tools that bring AI-powered code generation and explanation directly into developers’ existing IDE workflows, going beyond chat-based interfaces to make everyday coding easier and more productive.

Utilizing Runtime Information to Improve Development Processes

In mid-2025, this track’s team pivoted toward agentic and multi-agent systems, exploring how dynamic analysis techniques can assist in AI engineering and AI observability.

Intelligent Teaching Assistant in Programming Education

This track aims to build a smart AI teaching assistant that gives students personalized, context-aware help with programming—from better code generation and tailored hints to clear, metaphor-based explanations and custom learning materials—so they can understand concepts more easily and reach their learning goals more effectively.

Several of our teams participate (Software Testing Research, HAX, Dynamic Program Analysis, and Education Research), alongside the AISE and CISE research labs from the Software Engineering Research Group at TU Delft (more details on who is involved can be found on the AI4SE People page). The program is also part of the Innovation Center for Artificial Intelligence (ICAI).

AI4SE in 2025: Highlights

“Being here at one of the well-established and leading “for developers” companies in the industry during this rapid time of change comes with some chaos, but also with opportunity to do work which will influence the state of development worldwide.”

Sergey Datskiv

In this section, we showcase specific achievements of AI4SE in 2025. We begin with a student project that has had significant impact, then present the milestones of our graduate students, followed by highlights per track: tools and plugins, product support, and standout publications (our running list of publications can be found here).

Impact showcase

“It was both challenging and rewarding to train ML models on JetBrains’ production-scale data. With the help of some talented engineers, we rolled out trigger models which now save 20-30% of code completion inference.”

Aral de Moor

We would like to highlight a research project that has made a significant impact far beyond the research environment. Namely, Aral de Moor’s work with Maliheh Izadi, Arie van Deursen, and Sergey Titov on trigger models originated as Aral’s student project, and has been very successful in its application since then.

Aral and his team built a machine learning model that uses code context and telemetry data to predict the optimal moment to trigger code completion, boosting developer productivity. They studied real-world interactions to fine-tune the model so that it filters out completion requests that are unlikely to be accepted. The model avoids generating completions about a third of the time with no significant impact on other completion metrics, while substantially reducing inference cost. The feature was originally available for Kotlin only, but Aral and his team have since rolled it out for every programming language in all JetBrains IDEs.
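
To make the filtering idea concrete, here is a toy sketch of a trigger gate. This post does not describe the real model's features, weights, or architecture, so everything in it (the `CompletionContext` fields, the weights, the threshold) is a made-up assumption for illustration only. It scores a completion request and skips the expensive completion call when the predicted chance of acceptance is low.

```python
import math
from dataclasses import dataclass


@dataclass
class CompletionContext:
    """Hypothetical features; the real model's inputs are not described in this post."""
    typed_chars: int               # characters typed since the last suggestion
    after_member_access: bool      # caret directly follows a '.'
    recent_acceptance_rate: float  # share of recent suggestions accepted, 0..1


# Made-up weights for illustration; a real model would learn these from telemetry.
BIAS = -2.0
W_TYPED = 0.1
W_MEMBER = 1.5
W_ACCEPT = 2.0


def acceptance_probability(ctx: CompletionContext) -> float:
    """Logistic score estimating how likely a suggestion is to be accepted."""
    z = (
        BIAS
        + W_TYPED * min(ctx.typed_chars, 20)
        + W_MEMBER * float(ctx.after_member_access)
        + W_ACCEPT * ctx.recent_acceptance_rate
    )
    return 1.0 / (1.0 + math.exp(-z))


def should_trigger(ctx: CompletionContext, threshold: float = 0.5) -> bool:
    """Only call the completion model when acceptance looks likely enough."""
    return acceptance_probability(ctx) >= threshold


likely = CompletionContext(typed_chars=10, after_member_access=True, recent_acceptance_rate=0.5)
unlikely = CompletionContext(typed_chars=1, after_member_access=False, recent_acceptance_rate=0.0)
print(should_trigger(likely), should_trigger(unlikely))
```

The design point is that the gate must be far cheaper than the completion model it protects, which is why this sketch uses a handful of features and a single logistic score.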

The paper on this work won the Distinguished Paper Award at AIware 2024, and Aral’s newer paper about the models has been accepted to the IDE Workshop co-located with ICSE 2026.

2025 MSc graduates

“In this past year, it’s been exciting to have seen and ultimately become a part of the transition from JetBrains Fleet into the foundation for a new product that is centered around the flow of delegating coding tasks to async agents: Air.”

Nadine Kuo

Six MSc students who worked on their theses as JetBrains interns graduated in 2025. Here are their names and thesis titles, with links to the theses:

Arnav Chopra, Building Better Programmers: An AI System for Guided Program Decomposition
Sergey Datskiv, Prompt, seed, generate: Seeding for test case generator with LLMs
Casper Robert Dekeling, Comparing the hint quality of a Small Language Model and a Large Language Model in automatic hint generation
Milan de Koning, Metamorphic testing for LLM-based code repair
Nadine Kuo, Proactive AI in IDEs
Saga Rut Sunnevudóttir, The Status of JavaScript Test Generation: A Benchmark-Based Evaluation

Four of these students (Sergey, Milan, Nadine, and Saga) have since started working full-time at JetBrains. We featured their journeys in posts this summer (parts I and II).

PhD students who passed the Go/NoGo milestone

Three PhD students successfully passed their Go/NoGo in 2025. This is an important milestone in PhD programs in the Netherlands, although when and how it occurs differs across institutions and faculties. Although the student meets regularly with their supervisor in the lead-up, the Go/NoGo meeting is more formal and requires the student to submit a project plan with expected findings. At the meeting, a committee evaluates whether the PhD student is likely to complete their thesis in time and then shares recommendations for the student, as well as its decision: Go (‘continue the student’s project’) or NoGo (‘terminate the student’s project prematurely’).

The students who passed the Go/NoGo milestone, and their thesis projects, are:

Daniele Cipollone, Model Adaptation to Coding Tasks
Ziyou Li, Interactive and Aligned Agentic IDE
Yuri Noviello, Designing and Evaluating AI-Generated Visual Analogies for Computing Education

Publications and other achievements

“The cool thing about JetBrains in general and AI4SE is that it’s so multi-faceted that it’s really easy to come up with cross-disciplinary ideas and actually get to work on them.”

Roham Koohestani

Here are some of the most important achievements of AI4SE researchers in 2025:

Track 1: Testing and Evaluation of LLMs and SWE Agents

Milan de Koning has been working on data leakage, which occurs when a model sees parts of the test data during training. Specifically, he studies how metamorphic testing, which changes code without altering its meaning, can reveal when models rely on memorization rather than true understanding, and applies this to AI agents.
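
As a rough illustration of the metamorphic idea (not Milan's actual setup, which this post does not detail), the sketch below renames variables in a small Python function, a change that preserves meaning, and checks whether a model's output behaves the same on both versions. The two "models" are hypothetical stubs: one handles any input, and one only "knows" the exact original text, mimicking memorization.

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Semantics-preserving transformation: rename variables and parameters."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node


def rename(source, mapping):
    tree = RenameVariables(mapping).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


def run(source, func_name, *args):
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](*args)


ORIGINAL = """
def total(values):
    acc = 0
    for v in values:
        acc += v
    return acc
"""
MAPPING = {"values": "xs", "acc": "s", "v": "item"}


def behaves_the_same(model, source, mapping, func_name, args):
    """Metamorphic relation: model output must behave identically on both variants."""
    out_original = model(source)
    out_variant = model(rename(source, mapping))
    return run(out_original, func_name, *args) == run(out_variant, func_name, *args)


def robust_model(source):
    # Hypothetical stub standing in for a model that handles any input.
    return source


def memorizing_model(source):
    # Hypothetical stub that only "knows" the exact original text.
    if source == ORIGINAL:
        return source
    return "def total(values):\n    return 0\n"


print(behaves_the_same(robust_model, ORIGINAL, MAPPING, "total", ([1, 2, 3],)))
print(behaves_the_same(memorizing_model, ORIGINAL, MAPPING, "total", ([1, 2, 3],)))
```

A model that passes on the original but fails on the renamed variant is a hint that it may be matching memorized text rather than reasoning about the code.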

Track 2: Large Language Model Adaptation for Coding Tasks

Daniele Cipollone developed TreeRanker, a fast and architecture-agnostic approach using a token-aware ranking system for code completion. In addition to the ranking method, this project introduces a new dataset for evaluating completion ranking based on the Long Code Arena benchmark. Daniele presented TreeRanker in the industry track of ASE 2025 and gave the Doctoral Keynote at the Code Completion Challenge (part of ASE 2025). He also gave a talk on integrating LLMs in IDEs at the doctoral symposium of the FSE 2025 conference, and another on code vulnerabilities and automating their detection at the LLM4Code workshop at ICSE 2025.

Track 3: Interactive and Aligned IDEs in the LLM Era

Nadine Kuo collaborated with Agnia Sergeyuk and Valerie Chen (ML PhD Student, Carnegie Mellon University) on ProAIDE, a prototype built in JetBrains Fleet to investigate how developers interact with in-IDE proactive AI interventions. Through a five-day in-the-wild study with professional developers, they investigated when and how such suggestions are most effective at enhancing code quality across the development workflow. This work was accepted to ACM IUI 2026 as a full paper.
Nadine has since continued to contribute to the evolution of Fleet into the foundation for a new Agentic Development Environment, JetBrains Air, which brings together OpenAI Codex, Claude Agent, Gemini CLI, and Junie with unified orchestration, letting developers delegate parallel development and review tasks (whether in isolated Docker containers, separate Git worktrees, or soon in the cloud).

Ziyou Li collaborated with Agnia on Prompt-with-Me, a prompt library that turns scattered prompts from IDE users into a clean, reusable, and context-aware in-IDE prompt library. He developed the prototype and tested Prompt-with-Me with a dozen developers in different industries with successful results. Their paper detailing this was presented in the industry track of ASE 2025.

Further work by Ziyou concerned a high-level design of an agent-enabled IDE and the roadmap to implement different aspects of it; the paper was presented at the FSE 2025 doctoral symposium. Ziyou also presented a short paper with Maliheh Izadi at the International Workshop on Envisioning the AI-Augmented Software Development Life Cycle of the FSE 2025 conference. This paper proposes a mediator agent to interface between the developer, the IDE and its tools, agentic tools, and external systems.

Agnia’s work with Maliheh on the human-AI experience in IDEs was accepted to the Empirical Software Engineering journal at the end of last year and published in early January 2026. On top of that, their work on developers’ needs with respect to AI assistants in IDEs was accepted for the Software Engineering in Practice (SEIP) track at ICSE 2026.

Roham Koohestani has been working with Maliheh on projects that resulted in two papers, presented at ICSE 2025 and at FSE 2025. The first proposes hyper-dimensional vector spaces to model human-computer interaction, focusing on user actions, stylistic preferences, and project context. The second introduces HyperSeq, or Hyper-Adaptive Representation for Predictive Sequencing of States, a novel, resource-efficient approach designed to model developers’ cognitive states.

Track Crossover: 3 and 4

Roham has been collaborating with Ateş Görpelioğlu on AgentGuard, a framework for runtime verification of agentic AI systems, which he presented at the AgenticSE workshop held at ASE 2025. His work on AgentGuard has sparked a cross-track collaboration between Tracks 3 and 4. The project also initiated discussions with the Koog team to explore potential integration of AgentGuard into the Koog framework.

Track 4: Utilizing Runtime Information to Improve Development Processes

Zahra Seyedghorban, in collaboration with Yelizaveta Brus (MSc student, UWaterloo, REBELs research group), worked on the Test Error Grouping for Asgard project to investigate how crash deduplication techniques can help cluster similar test failures. They adapted two state-of-the-art approaches: (1) FaST, a term-based method that aligns stack traces to measure lexical similarity, and (2) BERTopic, an embedding-based topic-modeling approach that captures semantic similarity in failure descriptions.
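
For readers unfamiliar with crash deduplication, here is a deliberately simplified sketch of grouping test failures by lexical similarity of their stack traces. It is not FaST (which aligns frames in order) and not the project's code: it compares unordered sets of frames with Jaccard similarity and clusters greedily. The trace format, the threshold, and all names are assumptions made for illustration.

```python
def frames(trace: str) -> set[str]:
    """Extract stack frames, assuming Java-style lines that start with 'at '."""
    return {
        line.strip()
        for line in trace.splitlines()
        if line.strip().startswith("at ")
    }


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of frames the two traces have in common."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_failures(traces: list[str], threshold: float = 0.5) -> list[list[int]]:
    """Greedy clustering: join the first group whose representative is similar enough."""
    groups: list[list[int]] = []
    for index, trace in enumerate(traces):
        current = frames(trace)
        for group in groups:
            representative = frames(traces[group[0]])
            if jaccard(current, representative) >= threshold:
                group.append(index)
                break
        else:
            groups.append([index])
    return groups


TRACES = [
    "java.lang.NullPointerException: user is null\n"
    "    at app.UserService.load(UserService.java:42)\n"
    "    at app.Api.handle(Api.java:10)",
    "java.lang.NullPointerException: account is null\n"
    "    at app.UserService.load(UserService.java:42)\n"
    "    at app.Api.handle(Api.java:10)",
    "java.io.IOException: disk full\n"
    "    at app.Storage.write(Storage.java:7)",
]

print(group_failures(TRACES))
```

The first two failures have different messages but identical frames, so they land in one group, while the unrelated I/O failure forms its own.
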
Ateş identified issues related to tool-calling functionality when using OpenRouter models within the Koog framework. He also discovered missing capabilities in Koog’s OpenTelemetry support and contributed to resolving these issues by collaborating closely with engineers from JetBrains.
On automated testing of microservice architectures, Delano Flipse, Hakan Simsek, Jeremie Decouchant, and Burcu Kulahcioglu Ozkan’s paper “Automated Network-Level Fault Injection Testing of Microservice Architectures” was accepted to the research track of ICSE’26. The method models the system’s resilience behaviors dynamically through the observed test executions and uses this information to generate the set of fault combinations to explore.

Burcu gave the conference keynote talk, From Formal Methods to Testing of Distributed Systems, at FORTE’25, the 45th International Conference on Formal Techniques for Distributed Objects, Components, and Systems.

Track 5: Intelligent Teaching Assistant in Programming Education

Gosia Migut, Anastasia Birillo, and Yuri Noviello have worked on AI-generated metaphors, a tool that extracts coding concepts from task descriptions and generates visual and text-based metaphors to explain these concepts to students.

Looking Ahead in 2026

“Happy to be part of the AI revolution in software engineering with JetBrains!”

Daniele Cipollone

AI4SE turned ambitious ideas into real 2025 impact. From smarter testing and code completion to agentic IDEs and teaching aids, our researchers delivered real tools evaluated and backed by peer-reviewed research. With new graduates and a growing network of collaborators, the program is entering 2026 as a proven foundation for fostering the growth of emerging researchers and as a solid engine for reshaping how software is built and learned.

If you are a TU Delft student interested in joining AI4SE, contact Mitchell Olsthoorn for general questions about the thesis procedures, or reach out to the university track leads to learn about project opportunities.
