Powerful CI/CD for DevOps-centric teams
In this tutorial, we show how to use TeamCity and SWE-bench to build an evaluation pipeline for systematically testing our coding agent Junie on real-world development tasks.
How do you measure progress when building an AI coding agent? The Junie team at JetBrains uses TeamCity to evaluate agent performance at scale. Learn more from this case study.