The Launch of Developer Productivity AI Arena: An Open Platform for Benchmarking AI Coding Agents
For 25 years, JetBrains has shaped the software development landscape across multiple programming languages, advancing how developers and organizations build software. Our focus has always been on improving productivity and the overall developer experience.
With the rise of AI, a key challenge is understanding how to measure the real-world productivity gains that AI-assisted tools provide. To address this challenge, JetBrains has created the Developer Productivity AI Arena (DPAI Arena), which it will ultimately contribute to the Linux Foundation.
“As AI coding agents become integral to modern software development, the industry urgently needs a transparent, trusted way to measure their real impact on developer productivity. DPAI Arena establishes an open, vendor-neutral framework for evaluating AI-assisted development across languages, frameworks, and environments.
“We invite coding agent and framework providers to join this effort and help shape the benchmarks that define the next era of software creation. And we encourage end users to participate by validating AI tools on their real-world workloads, ensuring the ecosystem advances on a foundation of openness, trust, and measurable impact.”
DPAI Arena is the industry’s first open, multi-language, multi-framework, and multi-workflow benchmarking platform designed to measure how effectively AI coding agents handle real-world software engineering tasks. Built around a flexible, track-based architecture, it enables fair, reproducible comparisons across workflows such as patching, bug fixing, PR review, test generation, and static analysis.
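To make the track-based architecture concrete, here is a minimal sketch in Java of what a multi-track evaluation harness could look like. This is an assumption-laden illustration, not DPAI Arena’s actual API: every type and method name below (Task, Track, Agent, Harness, and so on) is invented for this example. The idea it demonstrates is that each track bundles its own dataset and scoring rule, so new workflows can be added without changing the harness.

```java
// Hypothetical sketch only: none of these names come from DPAI Arena's codebase.

import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** One unit of work handed to an agent, e.g. a repo snapshot plus instructions. */
record Task(String id, Path repoSnapshot, String instructions) {}

/** Whatever the agent produced: a patch, a review comment, a test file, etc. */
record AgentOutput(String taskId, String artifact) {}

/** An AI coding agent under evaluation. */
interface Agent {
    AgentOutput solve(Task task);
}

/** A workflow-specific track: it owns its dataset and its scoring rule. */
interface Track {
    String name();                               // e.g. "bug-fixing", "pr-review"
    Iterable<Task> tasks();                      // the track's dataset
    double score(Task task, AgentOutput output); // 0.0..1.0, defined per track
}

/** Runs one agent across every registered track and averages the scores. */
class Harness {
    Map<String, Double> evaluate(Agent agent, Iterable<Track> tracks) {
        Map<String, Double> results = new HashMap<>();
        for (Track track : tracks) {
            double total = 0.0;
            int count = 0;
            for (Task task : track.tasks()) {
                total += track.score(task, agent.solve(task));
                count++;
            }
            results.put(track.name(), count == 0 ? 0.0 : total / count);
        }
        return results;
    }
}
```

Under this framing, adding a PR review or test generation track means implementing one new Track, not rewriting the evaluation loop.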
Benchmarking redefined
Current benchmarks rely on outdated datasets, cover a narrow range of technologies, and focus almost exclusively on issue-to-patch workflows. Even as AI coding tools advance rapidly, the industry still lacks a neutral, standards-based framework to measure their real impact on developer productivity.
“JetBrains has spent more than two decades building tools that help tens of millions of developers think critically, write code confidently, and innovate at pace. This gives us a unique understanding of the potential of AI and the pressure it is placing on the software development world. We see firsthand how teams are trying to reconcile productivity gains with code quality, transparency, and trust – challenges that take more than performance benchmarks to address.
“The Developer Productivity AI Arena is designed to bring clarity and accountability, to evaluate and improve AI coding agents consistently and collaboratively, and to help the industry see and even measure the difference between AI that merely accelerates work and AI that truly understands and facilitates it. By defining a shared framework for benchmarking AI agents, we aspire to promote transparency and trust across the AI ecosystem.”
DPAI Arena fills this gap through transparent evaluation pipelines, reproducible infrastructure, and extensible, community-driven multi-track datasets.
Measuring what matters
DPAI Arena makes productivity measurable in AI-assisted software development. AI tool providers can benchmark and refine their tools on real-world tasks, technology vendors can keep their ecosystems first-class by contributing domain-specific benchmarks, enterprises gain a trusted way to evaluate tools before adoption, and developers get transparent insight into what truly boosts productivity.
DPAI Arena is built to empower everyone to contribute to the future of AI coding. The platform’s first benchmark, Spring Benchmark, sets the technical standard for future contributions. First, it implements the guidelines for dataset creation and details the supported evaluation formats and general rules. Second, it provides a solid base for decoupled infrastructure, enabling anyone to bring their own dataset (the BYOD approach) and reuse the platform’s infrastructure for their own evaluations.
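To illustrate the BYOD idea, here is a hypothetical sketch, again in Java, of what a dataset entry could contain. The schema, field names, repository, commit hash, and commands are all assumptions made for this example; the actual format is defined by the Spring Benchmark’s contribution guidelines.

```java
// Hypothetical only: the real Spring Benchmark schema is defined by the
// project's contribution guidelines; this just illustrates the BYOD idea.

import java.util.List;

/** One task entry: where the code lives, what to do, and how to verify it. */
record DatasetEntry(
        String taskId,         // stable identifier within the dataset
        String repoUrl,        // repository the agent will work in
        String baseCommit,     // commit the task starts from (placeholder below)
        String track,          // e.g. "bug-fixing" or "test-generation"
        String instructions,   // the issue text or task description
        String verifyCommand   // command the harness runs to judge success
) {}

class MyBenchmark {
    /** A tiny example dataset someone might bring to the platform. */
    static List<DatasetEntry> entries() {
        return List.of(
            new DatasetEntry(
                "petclinic-0001",
                "https://github.com/spring-projects/spring-petclinic",
                "abc1234",                        // placeholder commit hash
                "bug-fixing",
                "Owner search fails when the last name field is empty.",
                "./mvnw -q test"),
            new DatasetEntry(
                "petclinic-0002",
                "https://github.com/spring-projects/spring-petclinic",
                "abc1234",                        // placeholder commit hash
                "test-generation",
                "Add unit tests covering visit date validation.",
                "./mvnw -q test"));
    }
}
```

Because entries like these are decoupled from the evaluation harness, the same shared infrastructure can score any dataset that follows the agreed format.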
We are also exploring Spring AI Bench as a way to extend the Java benchmarking track in DPAI Arena, working closely with the project’s core team to drive more variability and multi-track benchmarks across the Java ecosystem.
Join DPAI Arena
We intend to contribute this project to the Linux Foundation, which will then establish a diverse and inclusive Technical Steering Committee to determine the future direction of the platform.
Follow the platform’s progress at https://dpaia.dev/. For more information, please refer to our Project Overview or GitHub org.