Databao

Agentic platform with modular AI tools and a governed semantic layer for any data stack

Speeding up analytics with Databao

Guja is currently an analytics engineer at Carnival Maritime, one of the world’s largest leisure travel and cruise companies. As one of our first alpha users, Guja tried Databao’s context engine, a CLI that extracts schema and metadata from data sources so AI agents can reason over them reliably.  

We spoke with him about what drew him to Databao and how it helped speed up his ad-hoc analytics work in a complex data environment.

What problem were you trying to solve when you found Databao?

I was looking for help with data discovery – essentially, a way to wrap our data marts so I could “chat” with our data. 

How would you describe the data you were working with at Carnival Maritime?

Everything that exists on a cruise ship ends up as data somewhere, from engine temperatures to weather forecasts. Because of that, the data landscape is very complex. It’s spread across many databases, domains, schemas, and teams, and understanding the full context behind the data is difficult.

Before Databao, how did you try to “chat” with your data?

Unless you’re working with a single table in a single database, context is required. About 95% of the work was explaining context to the agent and only 5% was the actual question.

We looked at existing solutions, but none really fit. Most solved only one part of the workflow or came with vendor lock-in, which I wanted to avoid.  

So, I tried building a data chatbot myself by stitching together a schema extraction engine, a context generator, and a text-to-SQL model. In the end, they didn’t mesh well together.

What exactly was hard about providing context to agents?

When you do ad-hoc work on databases and start using LLMs or agents, you have to explain what your tables mean, how they relate to each other, how joins should work, and what the business or technical context is. 

If the LLM or agent doesn’t really understand that context, you quickly get into a mental state where you start thinking that your schemas or tables are bad just because the LLM can’t produce the correct SQL or answer. 

In reality, the problem is often missing context, not bad data modeling. 

That’s why tools that extract schema and context from databases and provide it to LLMs or agents are useful – they help bridge that gap and reduce this mental and technical friction.
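The kind of schema-and-context extraction Guja describes can be sketched in plain Python. This is a generic illustration using the standard `sqlite3` module, not Databao's actual API; the tables and join hints are made up for the demo:

```python
import sqlite3

def extract_schema_context(conn: sqlite3.Connection) -> str:
    """Collect table DDL and foreign keys into one context block for an LLM prompt."""
    cur = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    parts = []
    for name, ddl in cur.fetchall():
        parts.append(ddl.strip())
        # Foreign keys spell out how joins should work -- exactly the
        # context an agent otherwise has to be told by hand.
        for fk in conn.execute(f"PRAGMA foreign_key_list({name})"):
            _, _, ref_table, from_col, to_col, *_ = fk
            parts.append(f"-- JOIN hint: {name}.{from_col} -> {ref_table}.{to_col}")
    return "\n".join(parts)

# Tiny in-memory database standing in for a data mart.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ships (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE engine_readings (
        id INTEGER PRIMARY KEY,
        ship_id INTEGER REFERENCES ships(id),
        temperature REAL
    );
""")

context = extract_schema_context(conn)
prompt = f"Given this schema:\n{context}\n\nWrite SQL to answer: <question>"
```

Prepending a block like `context` to every request is what shifts the ratio Guja mentions: the agent gets tables, relationships, and join paths up front instead of having them explained in each conversation.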

How did Databao change how you work?

I can spend more time on analysis instead of data plumbing.

These days, it’s almost normal for analytics engineers to spend most of their time cleaning and managing data instead of doing analysis. This is especially true for ad-hoc work. 

Let’s say you have one day to answer a question. You might not even get around to building a dashboard until the last 20 minutes because you spent all day just getting the data together.

A data engineer moves data from A to B, while an analytics engineer moves KPIs from A to B. You are trying to balance engineering work with an analytical outcome. The problem is that today this balance is off. There is too much engineering, and analysis only happens if everything else is manageable.

Why we built Databao

Guja’s challenge with providing context to AI is why we built Databao’s context engine. It’s a Python library that automatically generates a governed semantic context from data sources like databases and dbt projects. It runs locally in your environment and integrates with any LLM to deliver accurate, context-aware answers.

The context engine is part of the Databao platform, which enables self-serve analytics. If you’re on a data team looking to make data more accessible to business users, we’d love to talk. Get in touch with us to launch a proof of concept, discuss your needs, and share feedback.

TALK TO THE TEAM