Building AI Agents in Kotlin – Part 1: A Minimal Coding Agent
Building agents is weird. You’re not writing code that does things. You’re writing code that gives an LLM the ability to do things, and the LLM decides what to do.
That shift takes some getting used to. You give the agent the ability to read files, and it decides which files to read and when. You might expect it to start with the main file. Instead, it reads three test files first to understand the patterns. You didn’t tell it to do that. It just did.
So what abilities do you give it? Too many and it picks poorly. Too few and it can’t do the job. Finding that balance means trying things, observing what fails, and adjusting.
In this blog series, we’ll explore AI agents by building a real coding agent together, starting with three basic tools and transforming the agent into a complete one over the course of the series. You’ll learn about how agents actually work, how to observe and debug their behavior, and the architectural decisions that matter. By the end, you’ll understand not just this agent, but how to build your own. All in Kotlin.
We’ll start simple by building an agent that can navigate a codebase and make targeted changes. We’ll use Koog, an open-source framework from JetBrains that handles the execution loop: sending prompts, parsing tool calls, and executing them, repeating until done. The execution loop can get complex, but we’ll start with a basic one and focus on the more interesting question: what capabilities should your agent actually have?
Building the agent piece by piece
Before we write anything, let’s think about what we actually want. An agent that can… do what exactly?
Fix bugs? Write features? Refactor old code?
We want it to do all of those things. But what does that actually mean? It means we need an agent that can navigate a codebase and make targeted changes.
It needs to be able to see what files are in a project, read those files, and edit them or create new ones – three capabilities that we’ll give it through three tools.
These three tools are already built and bundled into Koog. We’ll look at how each one works, and then connect them all to GPT-5-Codex. That will give us a complete agent that can explore a codebase, read files, and make changes.
Teaching it to see
First, we add this tool to give our agent a way to explore:
tool(ListDirectoryTool(JVMFileSystemProvider.ReadOnly))
Now it can look around to see what files exist. But here’s the thing: If you just dump a list of file paths at an agent, it doesn’t really help. The agent needs to understand the structure.
To help with this, the tool formats its output in two ways. First, it collapses long directory chains. JVM projects have deep package structures. A typical path might be src/main/kotlin/com/example/service/impl/UserServiceImpl.kt. If the tool showed every intermediate directory, the output would be cluttered with single-child directories that add no information. So when it finds a long chain of single directories, it collapses them:
/project/src/main/kotlin/Main.kt (<0.1 KiB, 20 lines)
Second, when there are multiple files at the same level, it shows them as a tree:
/project/src/main/kotlin/
  Main.kt (<0.1 KiB, 20 lines)
  Utils.kt (<0.1 KiB, 10 lines)
Notice the file sizes and line counts are right there. Why show them?
Without them, you’ll run into a problem we hit. When we first built this, we didn’t include file sizes. The agent would explore the directory and see filenames, but have no idea about their size. So when it needed to understand a file, it would just read the entire thing.
Then we hit a real case: the agent tried to read a Python test file that was over 7,700 lines long. The LLM’s context window filled up with just that one file. When the agent tried to continue working, it had lost track of other files it had read, what it was supposed to be doing, and what changes it needed to make. The single massive file had pushed everything else out of context.
That’s why line counts and file sizes are in the directory listing. Now, when the agent explores the codebase, it sees that test_dataset.py is 7,700 lines before trying to read it. It can decide to read just specific sections instead of the whole thing, or skip it entirely if it’s not relevant to the task.
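The chain collapsing described in this section can be sketched in a few lines. This is a minimal illustration, not Koog’s implementation: the Node, FileNode, and DirNode types and the render function are hypothetical, the tree lives in memory rather than on disk, and the output shows only line counts, not sizes.

```kotlin
// Hypothetical in-memory tree; Koog's ListDirectoryTool works on a real file system.
sealed interface Node { val name: String }
data class FileNode(override val name: String, val lines: Int) : Node
data class DirNode(override val name: String, val children: List<Node>) : Node

// Renders a tree, merging any directory that has exactly one child into that child's path.
fun render(node: Node, prefix: String = "", indent: String = ""): List<String> {
    val path = prefix + node.name
    return when (node) {
        is FileNode -> listOf("$indent$path (${node.lines} lines)")
        is DirNode ->
            if (node.children.size == 1) {
                // Single child: extend the path instead of printing a line for this directory.
                render(node.children.single(), "$path/", indent)
            } else {
                // Several children (or none): print the directory, then each child one level deeper.
                listOf("$indent$path/") + node.children.flatMap { render(it, "", "$indent  ") }
            }
    }
}
```

Calling render on a tree where /project, src, main, and kotlin each contain a single entry produces one collapsed line, while a kotlin directory holding two files is printed once with both files indented beneath it.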
Reading code
Next, the agent needs to read files:
tool(ReadFileTool(JVMFileSystemProvider.ReadOnly))
See JVMFileSystemProvider.ReadOnly? That’s because this tool doesn’t call java.nio.file APIs directly. It’s built against Koog’s FileSystemProvider interface.
Why? So you can write a tool once and use it anywhere. ReadFileTool just calls methods on the provider. It doesn’t care if that provider reads from your local disk, from files over SSH, or from cloud storage. If you pass it a different provider, the tool works the same way. If you’re building your own file tools, you can implement FileSystemProvider for your storage backend and pass it to Koog’s file tools, and they’ll work.
We’re using JVMFileSystemProvider.ReadOnly here. Why is it ReadOnly? By separating read and write permissions at the provider level, you can compose agents with different permission levels. Need an agent that only explores and analyzes code? Give it tools that use ReadOnly providers, and you know for certain it can’t modify anything. For this agent, we’ll add modification capabilities next. EditFileTool will use JVMFileSystemProvider.ReadWrite instead. Clear separation means you control exactly what each agent can and can’t do.
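To make the provider idea concrete, here is a small sketch of the pattern. Every name in it (ReadOnlyFs, ReadWriteFs, InMemoryFs, ReadTool, WriteTool) is hypothetical and chosen for illustration. Koog’s FileSystemProvider has its own API, which this does not reproduce.

```kotlin
// Hypothetical interfaces for illustration; Koog's FileSystemProvider has its own API.
interface ReadOnlyFs {
    fun read(path: String): String?
}

interface ReadWriteFs : ReadOnlyFs {
    fun write(path: String, content: String)
}

// One possible backend: files held in a map. A disk-, SSH-, or cloud-backed class
// could implement the same interfaces.
class InMemoryFs(private val files: MutableMap<String, String> = mutableMapOf()) : ReadWriteFs {
    override fun read(path: String): String? = files[path]
    override fun write(path: String, content: String) { files[path] = content }
}

// A reading tool only asks for ReadOnlyFs, so it has no write method to call.
class ReadTool(private val fs: ReadOnlyFs) {
    fun run(path: String): String = fs.read(path) ?: "File not found: $path"
}

// An editing tool must be handed a ReadWriteFs explicitly.
class WriteTool(private val fs: ReadWriteFs) {
    fun run(path: String, content: String): String {
        fs.write(path, content)
        return "Wrote ${content.length} characters to $path"
    }
}
```

In this sketch the tools never learn which backend they talk to, and the permission level is visible in each tool’s constructor signature.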
The tool also supports line ranges:
readFile("Main.kt", startLine = 45, endLine = 72)
Remember those line counts in ListDirectoryTool? This is why. When the agent sees a file with thousands of lines in the directory listing, it knows to be more strategic about reading it. It might read the file in chunks, avoid rereading it multiple times, or extract specific sections. This helps prevent context exhaustion from large files like that 7,700-line Python test file example.
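Reading a slice of a file is simple to sketch. The function below is hypothetical and assumes 1-based, inclusive line numbers. That convention is an assumption of this sketch, so check the tool’s own documentation for how ReadFileTool counts lines.

```kotlin
// Returns lines startLine..endLine of the text, both 1-based and inclusive.
// This indexing convention is an assumption of the sketch, not necessarily Koog's.
fun readLineRange(text: String, startLine: Int, endLine: Int): String {
    require(startLine >= 1 && endLine >= startLine) { "Invalid range $startLine..$endLine" }
    return text.lineSequence()
        .drop(startLine - 1)              // skip everything before the first requested line
        .take(endLine - startLine + 1)    // keep at most the requested number of lines
        .joinToString("\n")
}
```

A range that runs past the end of the file simply returns the lines that exist, so the caller gets a short result instead of an exception.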
Modifying code
Now we give the agent the power to actually change things:
tool(EditFileTool(JVMFileSystemProvider.ReadWrite))
EditFileTool does find-and-replace. Give it a file path, the text to find, and what to replace it with. Simple enough.
But here’s what’s interesting: Those three parameters can do different things, depending on what you pass.
Want to replace text? Specify the original text and the replacement:
edit_file(
    path = "Main.kt",
    original = "Hello World!",
    replacement = "Hello Koog!"
)
Want to delete something? Just make the replacement empty:
edit_file(
    path = "Main.kt",
    original = "Hello World!",
    replacement = ""
)
Want to create a whole new file? Make the original empty:
edit_file(
    path = "Main.kt",
    original = "",
    replacement = "fun main() { println(\"Hello World!\") }"
)
Same tool. Three different operations.
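One way to picture how a single function covers all three operations is the sketch below. It is not EditFileTool’s code: editFile and its status messages are hypothetical, a map stands in for the file system, and two behaviors are assumptions (an empty original writes the replacement as the whole file, and only the first match is replaced).

```kotlin
// Hypothetical stand-in for a file system: path -> content.
// Applies one edit and returns a short status message.
fun editFile(
    files: MutableMap<String, String>,
    path: String,
    original: String,
    replacement: String
): String {
    if (original.isEmpty()) {
        // Empty original: write the replacement as the full content of the file.
        files[path] = replacement
        return "Created $path"
    }
    val content = files[path] ?: return "File not found: $path"
    if (original !in content) return "Original text not found in $path"
    // Replace only the first match; an empty replacement deletes the text.
    files[path] = content.replaceFirst(original, replacement)
    return "Edited $path"
}
```

Replace, delete, and create differ only in which of the two strings is empty, which is why the agent needs to learn just one tool signature.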
Learning from failures
The tool works, but we discovered something when testing with real agents. After several edits, the file was different. The agent tried to make another edit, but the text it was looking for didn’t exist anymore. Earlier edits had changed it.
The tool was returning “Edit failed”. We figured the agent would re-read the file and try again. Want to guess what happened instead?
The agent saw “Edit failed” and thought, “Something’s wrong with this file. Maybe I don’t have permission? I’ll just create a new file with my changes!” Suddenly, there were new files everywhere. Code that should have been edited ended up in new files instead.
The fix was simple. We just needed to tell the agent what actually went wrong. We changed the error message to “The original text to replace was not found in the file content. Consider re-reading the file to check if the original has changed since last read”. And that worked. The agents learned: “Oh, text not found, I should re-read and try again.”
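The lesson generalizes: a tool result is a message to the model, so it should name the cause and suggest a next step. Here is a minimal sketch of that idea. The EditResult types and applyEdit are hypothetical, and only the "not found" wording comes from the message quoted above.

```kotlin
// Hypothetical result type: every outcome carries the text the model will see.
sealed interface EditResult { val messageForModel: String }

data class EditApplied(val newContent: String) : EditResult {
    override val messageForModel = "Edit applied."
}

object OriginalNotFound : EditResult {
    // Names the cause and suggests what to do next, instead of a bare "Edit failed".
    override val messageForModel =
        "The original text to replace was not found in the file content. " +
            "Consider re-reading the file to check if the original has changed since last read"
}

// Handles only the replace case; original must be non-empty in this sketch.
fun applyEdit(content: String, original: String, replacement: String): EditResult {
    require(original.isNotEmpty()) { "original must not be empty" }
    return if (original in content) {
        EditApplied(content.replaceFirst(original, replacement))
    } else {
        OriginalNotFound
    }
}
```

With this shape, a stale edit returns guidance the model can act on, rather than a failure it has to guess about.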
Putting the pieces together
We have three tools. Now we need to connect them to GPT-5-Codex.
If you were building this from scratch without a framework, you’d write an execution loop: Send a prompt to GPT-5-Codex, parse the response, check if it wants to call a tool, execute that tool, send the results back, repeat. You’d need to handle API requests, error cases, token limits, and the tool calling protocol. That’s a lot of setup and effort.
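Stripped of API requests and error handling, the loop described above might look like the sketch below. All of it is hypothetical: Llm, LlmReply, ToolCall, FinalAnswer, and runAgent are stand-ins for illustration, and a real loop would call an LLM API and parse its tool-calling format.

```kotlin
// Hypothetical stand-ins: a real loop would call an LLM API instead of using these types.
sealed interface LlmReply
data class ToolCall(val tool: String, val argument: String) : LlmReply
data class FinalAnswer(val text: String) : LlmReply

fun interface Llm {
    // Receives the conversation so far and returns either a tool call or a final answer.
    fun next(history: List<String>): LlmReply
}

fun runAgent(
    llm: Llm,
    tools: Map<String, (String) -> String>,
    task: String,
    maxToolCalls: Int = 50
): String {
    val history = mutableListOf("user: $task")
    for (step in 1..maxToolCalls) {
        when (val reply = llm.next(history)) {
            is FinalAnswer -> return reply.text
            is ToolCall -> {
                // Run the requested tool and feed its output back to the model.
                val output = tools[reply.tool]?.invoke(reply.argument)
                    ?: "Unknown tool: ${reply.tool}"
                history.add("call: ${reply.tool}(${reply.argument})")
                history.add("result: $output")
            }
        }
    }
    return "Stopped after $maxToolCalls tool calls without a final answer"
}
```

Even this toy version needs a cap on tool calls, because nothing else guarantees that the model ever returns a final answer.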
Here, Koog’s AIAgent handles that loop. You give it an LLM, tools, and instructions:
val agent = AIAgent(
    promptExecutor = simpleOpenAIExecutor(System.getenv("OPENAI_API_KEY")),
    llmModel = OpenAIModels.Chat.GPT5Codex,
This connects the agent to GPT-5-Codex. promptExecutor is Koog’s way of handling LLM communication. You give it an API key and model, and it takes care of the rest.
Now tell it about your tools:
    toolRegistry = ToolRegistry {
        tool(ListDirectoryTool(JVMFileSystemProvider.ReadOnly))
        tool(ReadFileTool(JVMFileSystemProvider.ReadOnly))
        tool(EditFileTool(JVMFileSystemProvider.ReadWrite))
    },
ToolRegistry is where you list your tools. Koog takes each tool, converts it into the JSON format GPT-5-Codex expects, and handles calling them when GPT-5-Codex requests it.
Give it instructions:
    systemPrompt = """
        You are a highly skilled programmer tasked with updating the provided codebase according to the given task.
        Your goal is to deliver production-ready code changes that integrate seamlessly with the existing codebase
        and solve given task.
    """.trimIndent(),
This is the system prompt that tells the agent what its job is. Notice “updating the provided codebase”? That’s what makes the agent actually modify files instead of just explaining what should change.
Tell it how to run:
    strategy = singleRunStrategy(),
    maxIterations = 100
The strategy is the agent loop. singleRunStrategy() is Koog’s built-in basic version. The agent calls tools until it decides the task is complete. We’ll explore how to write complex custom strategies in future articles, but this one works for most tasks.
But what if it never stops? We once had an agent read a 7,000-line file, hit the token limit, lose context, and then read the same file again – over and over. Luckily, maxIterations = 100 stopped it before it burned through our API budget. After 100 steps, the agent quits. Each tool call costs 2 steps (one for the call, one for the response), so that’s 50 calls max.
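The step accounting is easy to check with a toy simulation. This sketch takes the two-steps-per-tool-call rule from the paragraph above as given. The names stepsPerToolCall, maxToolCalls, and simulateStuckAgent are hypothetical and do not call Koog.

```kotlin
// Assumes, as described above, that each tool call consumes two iterations:
// one for the call and one for the response.
val stepsPerToolCall = 2

fun maxToolCalls(maxIterations: Int): Int = maxIterations / stepsPerToolCall

// Simulates an agent stuck re-reading one file; returns how many reads fit in the budget.
fun simulateStuckAgent(maxIterations: Int): Int {
    var stepsUsed = 0
    var reads = 0
    while (stepsUsed + stepsPerToolCall <= maxIterations) {
        stepsUsed += stepsPerToolCall
        reads++
    }
    return reads
}
```

For a budget of 100 iterations, both functions give 50 tool calls, which bounds what a looping agent can cost.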
Sometimes the agent will run for a bit, leaving you sitting there wondering if it’s working or stuck. We want to add some visibility to the agent, so we can see what it’s doing. The handleEvents block is how you watch what’s going on in Koog:
) {
    handleEvents {
        onToolCallStarting { ctx ->
            println("Tool called: ${ctx.tool.name}")
        }
    }
}
Koog gives you different events to hook into, like onToolCallStarting, onToolCallFinished, onAgentFinished, and others. We’re using onToolCallStarting here, which runs right before each tool executes, so we can log what’s about to happen.
You’ll want to tune what you log, though. The first time we did this, we printed everything. EditFileTool scrolled hundreds of lines past us. Then we tried just the tool name. EditFileTool ran three times, and we had no idea what it was editing. We print 100 characters now. You can follow what’s happening without staring at too much text.
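A truncation helper for log lines could look like the sketch below. It assumes the 100 characters refer to the tool’s arguments rendered as a string. The preview and logLine functions are hypothetical, and how you obtain the arguments inside an event handler depends on Koog’s event API, which is not shown here.

```kotlin
// Shortens text for logging; the 100-character default follows the limit mentioned above.
fun preview(text: String, limit: Int = 100): String {
    // Flatten newlines so one tool call stays on one log line.
    val singleLine = text.replace("\n", " ")
    return if (singleLine.length <= limit) singleLine else singleLine.take(limit) + "..."
}

// Hypothetical log line combining the tool name with a shortened view of its arguments.
fun logLine(toolName: String, args: String): String =
    "Tool called: $toolName, args: ${preview(args)}"
```

The trailing dots mark where output was cut, so a truncated argument is not mistaken for the full one.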
Running our agent
The agent is built. Now we need a way to actually use it:
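import kotlinx.coroutines.runBlocking

fun main(args: Array<String>) = runBlocking {
    // Both arguments are required: the project path and the task description.
    if (args.size < 2) {
        println("Error: Please provide the project absolute path and a task as arguments")
        println("Usage: <absolute_path> <task>")
        return@runBlocking
    }
    val (path, task) = args
    // Combine the path and the task into a single prompt for the agent.
    val input = "Project path: $path\n\n$task"
    val result = agent.run(input)
    println(result)
}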
The agent needs to know two things: where the project lives and what to do. With this block of code, we grab both from command line arguments. If they’re missing, we show an error. Otherwise, we combine them into one input string and call agent.run().
The agent takes over from there. It explores the project, reads files, and makes changes, looping until done. When it finishes, we print the result.
Does it work?
Once we had the agent running, we wanted to see what these three tools could actually accomplish. So we tested it on SWE-bench Verified, a benchmark of 500 coding tasks from GitHub issues.
It completed around 50% of them – not bad for 50 lines of code.
But the agent never verifies anything. Can’t compile. Can’t run tests. It reads code and makes changes based on what it sees, but has no way to know if those changes actually work.
And we’ve used three tools without showing you how to build one. You know what they do, but you don’t know how they work or how to make your own.
That’s what my colleague Bruno shows you in the next article of this series. He builds a shell execution tool from scratch so the agent can run commands, see the output, and learn from what breaks. And he walks through how Koog tools work, because 50 lines of code only matter if you can extend the agent with tools for your specific problems.
The code’s on GitHub. Try it and share your experience with us.