Databao
Agentic platform with modular AI tools and a governed semantic layer for any data stack
Introducing Databao: The JetBrains Tool That Lets You Talk to Your Data
At JetBrains, we build tools that help teams work with complex systems in a more productive and enjoyable way. As AI becomes part of everyday workflows, a new challenge is emerging for data teams: How can you enable AI-assisted analytics without sacrificing accuracy, transparency, or control over data?
Today, we’re introducing Databao, a new data product from JetBrains designed to bring reliable semantic context to data teams, with the ability to build your own AI agents on top of it. We’re aiming to build an AI-native analytics tool that business users can rely on, alongside the dashboards that teams use every day.
As part of this work, we invite data teams to get in touch with us and launch a proof of concept (POC) that enables self-service analytics for their business users. Along the way, you can discuss your team's needs with us and share feedback.
Why Databao?
Databao’s mission
Modern data workflows are evolving quickly. Teams need flexibility and scalability as AI becomes a core part of how insights are generated. Sharing and reusing domain context, trusting AI-generated results, and scaling analytics without increasing complexity are some of the main challenges for companies.
Databao was built to solve the practical problems that data teams face today:
- Enabling business users to ask their own data questions in plain language.
- Relying on consistent, governed business definitions across analyses.
- Getting more accurate, repeatable results from AI-assisted workflows.
- Reducing manual back-and-forth and ad-hoc requests.
In practice, this means enabling personalized, self-service analytics that is controllable and scalable, and that improves continuously over time.
Providing a self-maintaining semantic layer for companies’ data
Databao’s CLI tool, the context engine, is designed to extract schema and metadata from data sources and give teams a governed layer that captures business logic and definitions from databases, BI tools, and documentation. This keeps context consistent and reusable.
As one of our Alpha users puts it:
“Before the context engine, I had to copy and paste my database schema into the LLM. Now I just point it to the data source and ask it to generate a query – and it works. No more incorrect column types, format mismatches, or hallucinations.”
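To make the idea of extracting schema context more concrete, here is a minimal sketch that uses only Python's standard `sqlite3` module. It is not Databao's context engine and does not use its API. The function names `extract_schema_context` and `build_demo_connection`, and the sample `orders` table, are hypothetical and exist only for illustration. The sketch shows the general technique: reading table and column metadata directly from the database so that nobody has to paste the schema by hand.

```python
import sqlite3


def extract_schema_context(conn: sqlite3.Connection) -> str:
    """Return a plain-text summary of every user table and its columns.

    Hypothetical helper for illustration; not part of any Databao library.
    """
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        lines.append(f"table {table}")
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")')
        for _cid, name, col_type, notnull, _default, pk in columns:
            flags = []
            if pk:
                flags.append("primary key")
            if notnull:
                flags.append("not null")
            suffix = f" ({', '.join(flags)})" if flags else ""
            lines.append(f"  {name}: {col_type}{suffix}")
    return "\n".join(lines)


def build_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with one made-up table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer TEXT NOT NULL, "
        "total REAL)"
    )
    return conn


if __name__ == "__main__":
    print(extract_schema_context(build_demo_connection()))
```

The resulting text could be passed to an LLM alongside a question, so that generated SQL refers to real column names and types. A governed semantic layer goes further than this sketch by also capturing business definitions from BI tools and documentation.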
Enabling agentic analytics
Alongside the context engine, Databao provides a data agent, available as an open-source Python SDK that runs locally. The agent uses this context to let users query, clean, and visualize enterprise data, generating production-quality SQL and outputs that business users can trust.
Another of our early users shared: “The Databao agent joined three to four tables perfectly, which no other data agent can do. I’m literally happy with this.”
The platform that brings it all together
Databao is designed for teams and the people who implement and own data tooling: analytics leads, data engineers, and platform teams.
It starts with a simple local setup and grows naturally as usage and complexity increase. Starting from the first open-source building blocks, Databao is now evolving into a team-ready SaaS layer that adds shared context, collaboration, and production-grade reliability.
By avoiding vendor lock-in, working across tools, and adapting to different organizational setups, we also aim to make our platform suitable for any production environment, not just for experimentation.

Databao’s trust milestones
Over the past year, we’ve focused on understanding how structured context and a semantic layer can improve the accuracy of agentic answers. This research has informed the foundations of Databao and how we started building our product.
As a result, we recently reached two important milestones. First, Databao achieved a first-place ranking in the DBT track of the Spider 2.0 Text-to-SQL benchmark, one of the most widely recognized evaluations for SQL generation. Second, we joined the Open Semantic Interchange (OSI), an open-source initiative led by Snowflake and other industry leaders to define a shared, vendor-neutral standard for semantic models.
Let’s turn AI analytics into a working POC, together
We are excited to invite teams to build a proof of concept together with the Databao team. We'll work with you to understand your use case, define a context-building process, and give a selected group of business users access to the agent. Together, we'll then evaluate the quality of the responses and overall satisfaction with the results.
And if you’d like to explore Databao, you can already try both our context engine and data agent through our open-source libraries.