Databao

Agentic platform with modular AI tools and a governed semantic layer for any data stack

AI | Data | Data Science

The New Role of Data Teams in the Agentic Analytics Era

Last week, in the first part of this series, we explained why two analysts can produce two different answers from the same data, and why that problem gets worse with AI agents. Without a shared semantic layer defining metrics and business logic, AI will generate answers faster, but not more reliably. Today we’ll focus on the practical consequences of this shift.

The data analyst role is changing fast. Data analytics as a discipline won’t disappear, but the center of gravity is shifting.

In the dashboards era, being a good data analyst meant writing queries, building charts, and pulling numbers quickly. In the agentic era, excelling means defining metrics clearly, building semantic contracts, setting governance and versioning standards, designing guardrails, and operating a system that delivers reliable, repeatable results.

You won’t be writing the story anymore – you’ll be defining the rules of the universe in which the story takes place. And by doing so, you’ll stop paying the trust tax.

This isn’t about buying a smarter model. It’s about building a foundation strong enough that even a weak model can’t produce a weak outcome.

Because here’s the truth: An AI system is only as trustworthy as the meaning you give it. 

The 2026 must-have agentic analytics stack (if you want to keep your sanity)

To ensure AI-driven analytics are reliable, you need three foundational elements in place:

Metrics as code

Your metric definitions can’t live in someone’s head, a screenshot, or a generic dashboard that everyone uses.

They need to be standardized in code, in a system designed to define and enforce metrics consistently. Examples include dbt’s or Cube’s semantic layer approaches, LookML-style modeling, and similar patterns. The point is the methodology, not the vendor. The business definition must be executable.
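As a minimal sketch of what "the business definition must be executable" can mean, here is a hypothetical Python metric registry. The `Metric` class, the `REGISTRY` dict, and the `net_revenue` definition are all illustrative, not part of any vendor's API; real systems like dbt or Cube express this in their own modeling languages.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    """An executable metric contract: one name, one definition, one owner."""
    name: str
    sql: str                 # the canonical aggregation expression
    description: str
    owners: tuple[str, ...]  # who must review changes
    version: str = "1.0.0"

# The registry is the single source of truth that every tool and agent queries.
REGISTRY: dict[str, Metric] = {}

def register(metric: Metric) -> None:
    # Redefining an existing metric is forbidden; changes go through versioning.
    if metric.name in REGISTRY:
        raise ValueError(f"{metric.name} already defined; bump the version instead")
    REGISTRY[metric.name] = metric

register(Metric(
    name="net_revenue",
    sql="SUM(amount) - SUM(refunds)",
    description="Recognized revenue net of refunds, in USD.",
    owners=("data-team",),
))
```

The point of the `frozen=True` dataclass and the duplicate check is that a definition, once registered, cannot drift silently; it can only be replaced through an explicit, reviewable change.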

Git-based everything

If you can’t answer questions like “When did we see a change in revenue?”, “Who’s responsible for this change?”, and “What else did it affect?”, then this isn’t a system you can trust – it’s guesswork.

Put metric definitions in Git, and require every change to go through pull requests and reviews. Yes, it may feel bureaucratic and tedious – until the day it saves you from presenting the wrong numbers to the board.

Hard guardrails

Agents need boundaries. They require real guardrails, not vague instructions like “Please follow the rules.”

Only use approved metrics and joins. If a metric doesn’t exist, don’t invent it! Escalate it to the data team for review. That’s what guardrails are for. They force LLM systems to operate within defined constraints, not improvise.
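A hard guardrail is code that raises, not a prompt that asks politely. The sketch below (all names hypothetical) shows the shape: the agent's proposed metrics and joins are validated against allowlists before any query runs, and anything unapproved is escalated rather than invented.

```python
# Allowlists derived from the governed semantic layer (illustrative values).
APPROVED_METRICS = {"net_revenue", "active_users"}
APPROVED_JOINS = {("orders", "customers")}

class GuardrailViolation(Exception):
    """Raised when an agent steps outside the approved semantic layer."""

def enforce(metrics: set[str], joins: set[tuple[str, str]]) -> None:
    unknown = metrics - APPROVED_METRICS
    if unknown:
        raise GuardrailViolation(
            f"Unapproved metrics {sorted(unknown)}: escalate to the data team for review"
        )
    bad_joins = joins - APPROVED_JOINS
    if bad_joins:
        raise GuardrailViolation(f"Unapproved joins {sorted(bad_joins)}")
```

Because the check sits outside the LLM, a confidently wrong model still cannot ship a metric that does not exist.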

The next model is small teams of agents, not one big chatbot

A pattern is emerging in the systems that actually work: instead of a single agent that does everything, they use a small set of agents, each with a clear role, checking one another. Examples of such agents include:

The discovery agent
(Or: “Wait – what do you mean by revenue?”)

This agent speaks with the business user and clarifies intent before touching the data. It asks the questions humans often forget to ask:

“Booked or recognized?”
“Gross or net?”
“Include refunds?”
“Which currency?”
“Which region?”
“Which date field?”
“Month-to-date or full month?”
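One simple way to make those questions systematic is a checklist keyed on ambiguous terms. This is a toy sketch (the `CLARIFICATIONS` table and function are assumptions, not Databao's implementation): the discovery agent scans the request and surfaces every clarification humans tend to skip.

```python
# Hypothetical ambiguity checklist: trigger word -> questions to ask first.
CLARIFICATIONS = {
    "revenue": ["Booked or recognized?", "Gross or net?",
                "Include refunds?", "Which currency?"],
    "month": ["Which date field?", "Month-to-date or full month?"],
}

def clarifying_questions(request: str) -> list[str]:
    """Collect the questions to resolve before any data is touched."""
    questions: list[str] = []
    for trigger, qs in CLARIFICATIONS.items():
        if trigger in request.lower():
            questions.extend(qs)
    return questions
```

The value is less in the lookup than in the contract: the pipeline refuses to proceed while the list is non-empty.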

The semantic layer authoring agent 
(Or: “Find the official meaning.”)

This agent consults the semantic layer and maps the request to approved metrics.

If the metric exists, it selects it. If it doesn’t, it proposes a new metric based on the available data and metadata – it never silently invents one. The proposal is produced as a diff for a human to review.
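The "select or propose, never invent" rule can be sketched as a single resolution function. Everything here is illustrative: the registry maps names to SQL, and an unknown request comes back as a review-gated proposal rather than a fabricated answer.

```python
def resolve(requested: str, registry: dict[str, str]) -> dict:
    """Map a request to an approved metric, or emit a proposal diff."""
    if requested in registry:
        return {"status": "approved", "metric": requested, "sql": registry[requested]}
    # Unknown metric: draft a proposal, flag it for human review, answer nothing yet.
    return {
        "status": "proposal",
        "metric": requested,
        "diff": f"+ {requested}: <definition drafted from metadata, pending review>",
        "requires_human_review": True,
    }
```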

The auditor agent 
(Or: “Try to break it.”)

This agent acts as an independent reviewer. It inspects the generated query or metric usage and looks for missing filters, incorrect joins, double counting, time zone errors, or mismatches between the requested and delivered meaning.

In other words, it reads adversarially. This alone can prevent a surprising number of “looks right” failures.
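An auditor can be surprisingly mechanical. The toy checks below (deliberately naive string matching, purely illustrative) show the adversarial posture: assume the query is wrong and look for the classic failure modes – joins without conditions, missing filters, and a mismatch between the grain that was asked for and the grain that was delivered.

```python
def audit(query: str, requested_grain: str, delivered_grain: str) -> list[str]:
    """Return a list of findings; an empty list means no red flags found."""
    findings: list[str] = []
    q = query.upper()
    if "JOIN" in q and " ON " not in q:
        findings.append("join without ON clause: possible fan-out / double counting")
    if "WHERE" not in q:
        findings.append("no filters: check for a missing date or status filter")
    if requested_grain != delivered_grain:
        findings.append(
            f"grain mismatch: requested {requested_grain}, delivered {delivered_grain}"
        )
    return findings
```

A production auditor would parse the SQL properly, but even this level of suspicion catches many "looks right" failures before they reach a stakeholder.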

The human-in-the-loop
(Human is still the boss)

And then a human signs off – not on every exploratory question, but on anything that becomes a metric or a shared report.

The workflow becomes simple: AI proposes the semantic layer change, and a data team member approves or rejects it. The system moves faster without losing control.
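The approval gate itself is tiny – the sketch below is a hypothetical shape for it, not any product's API. What matters is that nothing transitions out of the pending state without a named human attached to the decision.

```python
from enum import Enum

class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"

def review(proposal: dict, approver: str, accept: bool) -> dict:
    """Record an explicit human sign-off; nothing ships while pending."""
    proposal["status"] = Status.APPROVED if accept else Status.REJECTED
    proposal["reviewed_by"] = approver
    return proposal
```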

That’s how you reduce the trust tax without turning your analytics team into full-time babysitters.

Market reality: Text-to-insight is splitting into camps

Today, you can already see the industry dividing into three broad directions:

End-to-end conversational analytics platforms

They aim to do everything: connect data, define meaning, answer questions, and generate insights.

While fast, these systems are risky if you need deep customization or strict governance. And before long, you may find yourself vendor-locked – a risk not to underestimate.

Enterprise BI + AI add-ons

You already have a BI ecosystem, so you bolt AI onto it. This approach works well if your semantic layer is mature. However, it’s painful if your definitions are fragmented.

Headless semantic infrastructure

You build a semantic layer as an engine, then plug in different UIs and different agents. This requires more upfront work, but it gives you control, portability, and the ability to evolve your truth layer without being locked into a single frontend.

If you care about trust at scale, this third path becomes more compelling over time, because it treats meaning as infrastructure, not a feature.

Databao is built on this paradigm, providing the building blocks for modern semantic infrastructure and enabling company-wide self-service analytics.

The big shift to watch: The Open Semantic Interchange (OSI)

A clear signal that the industry has finally acknowledged the problem in semantics is the launch, in September 2025, of the Open Semantic Interchange (OSI) – an initiative led by Snowflake and other industry partners to define a vendor-neutral standard for describing and exchanging semantic models (metrics, dimensions, relationships) across tools.

If OSI succeeds, we’ll move toward a world where your definition of revenue is portable, flowing from BI to agents to warehouses without being rewritten or trapped in a single tool’s private model format.

You can debate standards all day (and people will), but the direction is what’s significant. The meaning layer is becoming a first-class citizen.

This matters because agents aren’t going away, and dashboards are no longer the only consumers of analytics. Agents are consumers too, and they require a shared language even more than humans do.

About Databao

Databao is a new data product from JetBrains that helps data teams create and maintain a shared semantic context and build their own data agents on top of it. Our goal is to provide an AI-native analytics experience that business users can trust, enabling them to query and analyze data in plain language.

Databao’s modular components, the context engine and data agent, can run independently, either locally or within your existing infrastructure, using your own API keys.

We are also inviting data teams to build a proof of concept with us: we’ll explore your use case, define a context-building process, and grant agent access to a selected group of business users. Together, we will then evaluate the quality of responses and the overall value.

TALK TO THE TEAM