Why and How JetBrains Built Mellum – the LLM Designed for Code Completion
Hey, I’m Anton, a product manager at JetBrains AI, and I managed the release of JetBrains’ first proprietary LLM. 👋
Last year, we launched Mellum, a project I’m incredibly proud of. Mellum isn’t just another LLM – it’s a reflection of JetBrains’ commitment to helping developers work smarter and more efficiently. Today I’d like to share the behind-the-scenes story of building Mellum and the hurdles I wish we had known how to handle properly.
Why did we build a proprietary LLM? And why just for code completion?
Let’s start with the big question – why Mellum? We wanted to create something that truly understands and supports developers.
General-purpose AI models, while impressive and powerful, often miss the mark when it comes to specific tasks like code completion. They’re big, at times slow, and designed for everything, so they may lack precision and relevance. That’s where Mellum comes in.
Then why just for code completion?
At JetBrains, we had already developed expertise in training smaller models for code completion. So, launching Mellum was a natural progression for us. While the scope of tasks remained limited, Mellum focused on delivering the most critical features that every modern, AI-powered assistant should offer. It was not easy though! The model’s larger size brought new technical challenges, pushing us to expand our capabilities.
We built Mellum to:
- Make our AI-driven code completion feature excel. Mellum is laser-focused on providing fast, accurate code completions. It’s not trying to solve every problem under the sun – it’s built to be great at one specific task that is really fundamental to development.
- Expand JetBrains’ expertise in AI. Building Mellum wasn’t just about the immediate benefits; it was also about gaining the skills and infrastructure to tackle future challenges and make JetBrains’ products even better.
The challenges we faced while training the LLM
Developing Mellum was no walk in the park. One of the biggest hurdles was training Mellum on massive datasets using GPU clusters – an entirely new frontier for JetBrains. Navigating this territory required overcoming a steep learning curve, adopting new technologies, and refining our processes to meet the demands of large-scale model training.
Here are some of the key challenges we tackled:
Training models at scale
We moved from training small models on single servers to distributed training across huge GPU clusters. This shift involved setting up the infrastructure, adopting new technologies in the company, navigating the complex process of vendor selection, and building a trusting business relationship with the cluster provider.
Working with data
Good data is everything when training an AI model. For Mellum, we processed vast amounts of code to ensure consistency and relevance while remaining attentive to legal compliance.
In addition to using well-known evaluation metrics, we enhanced them to better reflect real-world developer scenarios. We also applied our evaluation process at scale, enabling us to measure quality across a diverse range of code examples.
Balancing quality and speed
Developing an LLM for real-world product use inevitably involves navigating the trade-off between quality and performance. On the one hand, larger models tend to produce better outputs. On the other, they come with increased costs and slower performance. For code completion tasks, this trade-off has a huge impact – if a code completion tool is slow, it’s unusable. At JetBrains, we aim to build sustainable products, so finding the sweet spot was a crucial goal for us.
What could we have done differently?
This ride wasn’t all unicorns and rainbows, and we definitely made some mistakes along the way. If I were able to travel back in time and tackle this project again, I’d do the following things differently:
- Make key technical decisions earlier. Developing a demanding feature like AI-driven code completion involves extensive R&D, requiring significant trial and error to solve specific challenges and enhance quality. Having a clearer understanding of the most effective approaches from the start would have allowed us to move much faster and achieve even better results.
- Embrace a larger cluster size sooner. When we first started training Mellum, our cluster was four times smaller than the one we were using when Mellum was released. Had we possessed deeper expertise in LLM training from the outset, we could have experimented at a faster pace. Moreover, by anticipating our larger hardware needs earlier, we could have negotiated a more favorable pricing model with our providers.
- Design the client-server architecture better. While it’s crucial to invest time in designing a complex project, the fast-paced nature of developing AI-driven features means that users are constantly coming up with new input, making it nearly impossible to anticipate everything in advance. With the knowledge we have now, we could have designed the client-server interaction to be more effective, ensuring greater quality.
What makes Mellum special
Mellum’s edge lies in its meticulous handling of data. The way we process and format data, ensuring consistency during both the training and application phases, sets it apart. This approach guarantees that the model receives data in the exact structure it was trained on, overcoming a significant technical challenge. This precision is a key reason why Mellum performs so effectively and reliably.
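To make the idea of train-time and run-time consistency concrete, here is a minimal sketch. It is not Mellum's actual format: the sentinel strings, the `CompletionContext` type, and the function names are all hypothetical, chosen only for illustration. The point is that one shared function builds the prompt, so the training pipeline and the IDE-side client cannot drift apart.

```python
from dataclasses import dataclass

# Hypothetical sentinel strings; the post does not describe Mellum's real layout.
PREFIX_TAG = "<prefix>"
SUFFIX_TAG = "<suffix>"
MIDDLE_TAG = "<middle>"


@dataclass(frozen=True)
class CompletionContext:
    file_path: str
    prefix: str  # code before the caret
    suffix: str  # code after the caret


def format_prompt(ctx: CompletionContext) -> str:
    """Single source of truth for the prompt layout."""
    return (
        f"# file: {ctx.file_path}\n"
        f"{PREFIX_TAG}{ctx.prefix}{SUFFIX_TAG}{ctx.suffix}{MIDDLE_TAG}"
    )


def make_training_example(ctx: CompletionContext, target: str) -> str:
    """Training text is the shared prompt plus the code the model should produce."""
    return format_prompt(ctx) + target


def make_inference_prompt(ctx: CompletionContext) -> str:
    """At completion time the same shared prompt is sent, with nothing appended."""
    return format_prompt(ctx)
```

Because both paths call `format_prompt`, any change to the layout (a new header line, a different ordering of prefix and suffix) applies to training data and live requests at the same time.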
AI Assistant users have been benefiting from our custom LLM since the 2024.2 release, and there’s a mind-blowing difference between the code completion results from our model versus third-party models.
While Mellum excels in code-specific tasks, it’s not a replacement for general-purpose LLMs. Broader models may be better suited for tasks that go beyond coding, such as drafting documentation or addressing open-ended queries. We believe that Mellum works best when used alongside other tools, giving developers the flexibility to choose what fits their needs.
Where we’re headed
JetBrains has always been about empowering developers, and Mellum is a big part of that. Moving forward, we’re going to expand Mellum’s scope and add other features that might go beyond code-to-code tasks.
A call for your feedback in the comments
If you’ve used code completion in AI Assistant since August, you’ve experienced Mellum in action. I’d love to hear your thoughts or any feedback on the quality. Also, what code completion qualities do you value most? Feel free to share your thoughts!