
Small Models, Big Impact: Why JetBrains is Betting on Focal LLMs

At AI Summit London 2025, Kris Kang, Head of Product for AI at JetBrains, gave a talk that questioned a common belief in AI development: that bigger means better.

The industry has focused heavily on massive, general-purpose language models. These models offer impressive capabilities, but the cost of building and running them at scale is top of mind for many enterprise decision-makers.

In his talk “Small Models, Big Impact”, Kris introduced an alternative: focal models. These compact, domain-specific LLMs aim to deliver strong performance while reducing energy use and total cost of ownership, and they can serve as a workhorse complement to frontier models.

Here’s why this matters and how JetBrains is acting on it.

The energy cost of chasing scale

AI models now operate at a scale that was hard to imagine a few years ago. GPT-3 has 175 billion parameters, and although no official numbers have been published, observers estimate GPT-4 at 1.8 trillion parameters and Grok-3 at 2.7 trillion. To accommodate the growing size of models, data centers with as many as 2 million GPUs are being built.

The energy required for this is significant. Processing 1 billion AI chats per day with a frontier model, at roughly 0.34 Wh per chat, can use more than 124 GWh per year – equivalent to powering around 31,000 UK households. The environmental impact alone raises important concerns, and the financial cost can be just as significant.
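The annual figure follows from simple back-of-the-envelope arithmetic. Note that the 0.34 Wh per chat and the household comparison are the talk's estimates, not measured values:

```python
# Back-of-the-envelope energy estimate for frontier-model chat traffic.
# Inputs are rough figures from the talk, not measured values.
WH_PER_CHAT = 0.34              # estimated energy per chat, in watt-hours
CHATS_PER_DAY = 1_000_000_000
DAYS_PER_YEAR = 365

daily_wh = WH_PER_CHAT * CHATS_PER_DAY        # Wh consumed per day
annual_gwh = daily_wh * DAYS_PER_YEAR / 1e9   # convert Wh to GWh

print(f"{annual_gwh:.1f} GWh per year")       # ~124.1 GWh
```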

For a mid-sized company, the costs add up quickly. Assuming USD 15 per 10 requests, an enterprise of 1,000 employees making 10,000 requests per day would spend USD 15,000 per day – and that is a conservative estimate. Per employee, the talk puts the annual cost at around USD 45,000, six times the average IT expenditure per employee in 2023.
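The daily figure can be reproduced directly from the stated assumptions (the per-employee annual figure depends on additional usage assumptions the talk did not spell out):

```python
# Cost sketch under the talk's stated assumptions (a conservative estimate).
COST_PER_REQUEST = 15 / 10    # USD 15 per 10 requests
REQUESTS_PER_DAY = 10_000     # across a 1,000-employee enterprise

daily_cost = COST_PER_REQUEST * REQUESTS_PER_DAY
print(f"USD {daily_cost:,.0f} per day")   # USD 15,000 per day
```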

What are focal models?

Focal models aim to solve a single problem with scalability in mind. They are:

  • Small, usually with fewer than 10 billion parameters
  • Domain-specific, not general-purpose
  • Post-trained to maximise cost efficiency and inference speed
  • Built using techniques such as quantisation, distillation, and Mixture of Experts (MoE)
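To see why techniques like quantisation matter, consider the memory footprint alone. A rough sketch, using Mellum's 4-billion-parameter size and standard numeric formats (and ignoring activation memory, KV cache, and framework overhead):

```python
# Rough memory footprint of a 4-billion-parameter model at two precisions.
# Ignores activation memory, KV cache, and framework overhead.
PARAMS = 4_000_000_000

bytes_fp16 = PARAMS * 2      # 16-bit floats: 2 bytes per parameter
bytes_int4 = PARAMS * 0.5    # 4-bit quantisation: half a byte per parameter

print(f"fp16: {bytes_fp16 / 1e9:.0f} GB")   # ~8 GB
print(f"int4: {bytes_int4 / 1e9:.0f} GB")   # ~2 GB
```

Cutting the weight footprint from ~8 GB to ~2 GB is what makes running such a model locally on a laptop practical.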

Rather than trying to be good at everything (generating images, text-to-speech, coding, and more), they focus on high performance in a narrow field, such as code, legal, or medical tasks.

This specialisation allows enterprises to apply focal models to the specific use cases with the highest return relative to cost, where returns might also include energy efficiency.

Mellum: A focal model purpose-built for code

JetBrains has taken this concept and is applying it directly to software development. Mellum, our 4-billion-parameter model, is the first in a family we intend to open source. It was built from the ground up to support code completion in real-world settings.

Mellum works across multiple programming languages and has been optimised to run quickly on modest hardware (for example, it runs easily on a MacBook with an M4 Max chip). Developers can use it in isolated environments without relying on third-party providers and without giving up control over their data. You can already access Mellum through JetBrains IDEs via our AI Assistant, or on Hugging Face if you prefer to run it independently.

Rather than aiming for general use, Mellum stays focused on one job: helping developers write better code more efficiently.

Why focal models matter for the future of AI

Focal models offer a practical answer to AI’s rising technical, financial, and environmental costs.

By narrowing the model’s purpose, teams can work with systems that cost less to operate and offer greater flexibility in how and where they’re deployed.

This also opens the door to deeper enterprise integration. Companies can fine-tune these models to fit their workflows or adapt them to meet compliance requirements without needing access to large-scale computing clusters.

Focal models do not replace frontier models but offer a complement that better suits many real-world applications.

The JetBrains perspective: Smarter, not just bigger

JetBrains believes the next meaningful shift in AI will not come from chasing larger models but from refining how and where we apply them in order to help companies grow their businesses sustainably. We have invested in focal model development because our customers are asking for it, and we believe this approach delivers more balanced results.

Mellum is our first step in this direction. It shows what is possible when you focus on purpose instead of size.

Try Mellum today

You can try Mellum in several ways:

  • On Hugging Face, for experimentation or offline use
  • Inside JetBrains IDEs, through AI Assistant
  • Soon, as a containerised deployment in NVIDIA AI Enterprise

Whether you prioritise sustainability, performance, or security, focal models like Mellum offer a practical way to adopt LLMs where they bring the most value.

The future of AI is not just about adopting frontier models as they hit the market – it’s about sustainably embedding AI into your business in ways that lead to positive outcomes.
