Databricks Community

juliechen · yesterday

Hi everyone,

I’m curious if anyone has successfully implemented Databricks Genie (chat/agent) for production use.

Currently, we’ve enabled a few Genie instances for power users who are comfortable working with data outside of the data team. However, we’re evaluating whether AI analytics tools like Genie are mature enough to be rolled out more broadly across the organization.

From my experience, even as a data team leader, I still find it necessary to carefully validate both prompts and outputs, especially for more complex questions that require connecting multiple business domains. But what Genie recommended confused me (screenshot below).

Has anyone deployed Genie analytics at scale and opened access to a wider audience?

How did you manage accuracy, validation, and user expectations?
Did you limit use cases or enforce governance frameworks before scaling?

Would love to hear how others are approaching this. Thanks!

emma_s · yesterday

Hi, I think you forgot to attach a screenshot. I'd be keen to see what you're seeing.

juliechen · yesterday

Hi @Emma

The recommendation was from Genie code as attached below.

To me, this is not a curation issue. It comes down to (1) the maturity of the LLM and (2) the quality of user prompts. Several times I have observed users asking questions from a confusing perspective; Genie can then interpret them the wrong way and return a wrong result.

emma_s · 8 hours ago

Hi Julie, thanks for sharing the screenshot. There are a few things we recommend to customers when they are starting out on a Genie journey. We definitely see enterprises creating Genie spaces at scale, but it needs some thought and guardrails. Some recommendations for you, from one of our internal resources:

The goal of curating a space is to ensure your users can answer their questions accurately and consistently. Genie spaces are equipped with best-in-class models that are capable of generating sophisticated queries and have general knowledge about the world at large to interpret user questions. However, most business questions are domain-specific. Consequently, the role of a space curator is to fill these gaps through a combination of metadata and instructions. This requires iteration and practice, but this document aims to capture best practices and principles to create an effective space.

Have a domain expert define the space: An effective space creator needs to understand the data and insights that can be gleaned from it. Data analysts (who are proficient in SQL) typically have the desired knowledge and skill set to curate the space effectively.
Define the purpose of your space: Who should be using it? What kind of questions should it be able to answer? Having a clear purpose and audience makes it easier to design your space appropriately.

A space is intended to answer questions from a focused topic, rather than answering general questions across a wide range of domains.
Use a small set of focused, specific tables that speak to the space’s defined purpose, rather than a large set of tables that include data from many parts of your business. By providing a specific set of necessary data, you can reduce the time it takes the space to scan for relevant data, and eliminate potential errors caused by using incorrect data.

Start small: It is impossible to define a fully complete space when you’re first creating it. Start with a minimal set of instructions, a small target set of questions that you aim to satisfy, and iterate from there. Starting with too wide a scope of data and questions makes it difficult to start and difficult to iterate. Use only data that is closely related to the insights that you want to provide in the space.

Stay focused; don’t just dump every table you can think of in there. Instead, only add what’s necessary to answer the questions you’re targeting.
Don’t worry about getting the space right off the bat. Instead, provide a best-effort set of tables, then start testing the space with some questions you’d anticipate business users to ask. Don’t worry about adding instructions initially; you can do so as part of the iterative loop while testing with your questions.
Good spaces start with good input tables. Have clear column names and column descriptions in all tables used for a space.
Column descriptions should provide clear, precise context that you wouldn’t expect the space to infer on its own (e.g. isn’t “common public knowledge”). Avoid spurious and ambiguous descriptions. Specifically, don’t blindly use AI-generated comments. Only include them if they match what you would add manually to provide context that would otherwise be missing and useful for answering anticipated questions.
Don’t add too many tables. Aim for five or fewer. The more focused the space is, the better.
Ideally, keep each table to a small number of columns, aiming for 25 or fewer. However, if you have a table with a large number of columns, it’s better to keep it as-is rather than split it into multiple tables, as that would lead to diminished accuracy and performance.
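The table guidelines above can be turned into a quick pre-flight check. This is a minimal sketch, not a Databricks API: the `Table` and `Column` classes and the `lint_space` function are hypothetical names, and you would populate them yourself from whatever metadata you have for the space.

```python
from dataclasses import dataclass, field

MAX_TABLES = 5    # guideline from the post: five or fewer tables per space
MAX_COLUMNS = 25  # guideline from the post: 25 or fewer columns per table


@dataclass
class Column:
    name: str
    description: str = ""


@dataclass
class Table:
    name: str
    columns: list[Column] = field(default_factory=list)


def lint_space(tables: list[Table]) -> list[str]:
    """Return advisory warnings for a proposed set of space tables."""
    warnings = []
    if len(tables) > MAX_TABLES:
        warnings.append(f"{len(tables)} tables in space; aim for {MAX_TABLES} or fewer")
    for table in tables:
        if len(table.columns) > MAX_COLUMNS:
            warnings.append(
                f"{table.name}: {len(table.columns)} columns; aim for {MAX_COLUMNS} or fewer"
            )
        for col in table.columns:
            # An empty description gives the space nothing beyond the column name.
            if not col.description.strip():
                warnings.append(f"{table.name}.{col.name}: missing column description")
    return warnings


if __name__ == "__main__":
    demo = [Table("sales", [Column("amt", "Net amount in USD"), Column("rgn")])]
    for warning in lint_space(demo):
        print(warning)
```

The warnings are advisory only. In particular, the column-count warning is a prompt to review the table, not a reason to split it, since the guidance above says a wide table is better kept as-is.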

Start asking questions: Carefully examine the SQL generated in response to your questions. Sometimes the space will be very good with your tables out of the box, but often it will require additional help in understanding your data.

Whenever you see an incorrect query, refer to our Troubleshooting section for ideas on how to fix it.
Once you’ve fixed it, try another question, and continue iterating. Queries can be edited in place and saved as example queries.
After updating instructions, re-ask questions in a new chat to make sure the latest instructions are being used.
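One way to make this test-and-iterate loop repeatable is to keep the target questions together with the results you expect and re-run them after every instruction change. The sketch below assumes a hypothetical `ask` callable that sends a question to the space in a fresh chat and returns the result rows; how you implement it (manually, or through whatever interface you have) is up to you and is not shown here.

```python
from typing import Callable

Row = tuple
AskFn = Callable[[str], list[Row]]  # hypothetical: question in, result rows out


def failing_questions(ask: AskFn, expected: dict[str, list[Row]]) -> list[str]:
    """Return the benchmark questions whose rows differ from the expected rows.

    Row order is ignored, because the same correct answer can come back sorted
    differently from one generated query to the next.
    """
    failures = []
    for question, want in expected.items():
        got = ask(question)
        if sorted(map(tuple, got)) != sorted(map(tuple, want)):
            failures.append(question)
    return failures


if __name__ == "__main__":
    # Stand-in for a real space: canned answers keyed by question.
    canned = {
        "Total sales by region last month": [("EMEA", 120), ("APAC", 80)],
        "Top customer by revenue": [("Acme", 999)],
    }
    expected = {
        "Total sales by region last month": [("APAC", 80), ("EMEA", 120)],
        "Top customer by revenue": [("Globex", 1200)],
    }
    print(failing_questions(lambda q: canned[q], expected))
```

Comparing result rows rather than SQL text is a deliberate choice here: two differently written queries can both be correct, so the check only flags questions whose answers changed.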

Have business users test your space: Once your space is able to answer all the questions you tried, involve a business user and let them try the space. Make sure you set expectations with them (e.g. via training sessions or materials) on what kind of questions are expected to be answerable. They can vote the response up and down as they test the space, and you’ll be able to see the questions they’ve asked from the Monitoring tab. Continue adding instructions to steer the space into providing the right answers.

Set the expectation with the end users that if the initial answer is not right and they can discern this from either the description of the answer or the actual answer itself, they can continue to provide refinements through an iterative chat process to try to get to the right answer. This could include clarifying what is wrong with the answer, rephrasing key elements, etc. After they are satisfied, they can upvote and save the final query to be part of the instructions to minimize repetition of the error.
If they’re not sure, they should contact the author of the space (which could be you) so that the author can investigate and potentially add more instructions to improve how the space handles that particular question.
In order for business users to access your space, they need to:

Be part of the workspace that the space is in (we have a feature to remove this requirement coming later in Q2)
Have “Can Run” permission on the space.
Have USE CATALOG permission on the catalogs for the tables in the space, and SELECT permission on the tables in the space.
Have CAN USE permission on the warehouse used by the space.
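For the data permissions in that list, the grants can be generated from the list of tables rather than typed by hand. This is a minimal sketch: `grant_statements` is a hypothetical helper, the group name and table names are made up, and the statements still need to be run by someone with the right to grant them. The USE SCHEMA grant is not in the list above; it is included on the assumption that Unity Catalog also requires it on the parent schema before SELECT on a table takes effect.

```python
def grant_statements(principal: str, tables: list[str]) -> list[str]:
    """Build Unity Catalog GRANT statements for three-level table names.

    Each entry in `tables` is expected to look like catalog.schema.table.
    """
    catalogs = sorted({t.split(".")[0] for t in tables})
    schemas = sorted({".".join(t.split(".")[:2]) for t in tables})

    stmts = [f"GRANT USE CATALOG ON CATALOG {c} TO `{principal}`" for c in catalogs]
    # Assumption: not mentioned in the post, added for Unity Catalog's usual hierarchy.
    stmts += [f"GRANT USE SCHEMA ON SCHEMA {s} TO `{principal}`" for s in schemas]
    stmts += [f"GRANT SELECT ON TABLE {t} TO `{principal}`" for t in sorted(tables)]
    return stmts


if __name__ == "__main__":
    for stmt in grant_statements("analysts", ["main.sales.orders", "main.sales.customers"]):
        print(stmt + ";")
```

The “Can Run” permission on the space and the CAN USE permission on the warehouse are workspace-level permissions rather than SQL grants, so they are not covered by this sketch.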

The key is to test and iterate, observing the history of what has been asked and what Genie delivered.

I hope this helps.

Thanks,

Emma

juliechen · 2 hours ago

Hi @emma_s

Thanks for the feedback, really appreciate the perspectives shared.

To provide a bit more context, we’ve already built several Genie instances, granted access to a group of internal power users, and have been using the Genie agent for production analysis for a while.

My question is more about the longer-term impact of AI analytics. From our experience, Genie doesn’t seem to add significant value if it’s primarily used to answer straightforward questions like “last month’s sales by X dimension YoY growth”. These types of standardized metrics are already well served by our BI tool.

When it comes to analytics, I tend to see two distinct layers:

Standardized metrics → live in BI tools, where users can easily access consistent
Exploration / deep research → where the real value lies, especially when connecting multiple domains to uncover insights

In our case, we’ve been leveraging Genie chat more for the second layer, enabling analysis that correlates multiple business signals such as marketing spend, traffic by channel, MTA attribution, onsite engagement, conversion, and sales.

This kind of cross-domain exploration allows us to answer more complex “what if” questions and significantly reduce the time required (often replacing work that would otherwise take 10–20x more manual effort).

That’s why I’m particularly interested in how others are using Genie — especially the Genie agent capabilities for deeper research and exploratory analysis.

At the end of the day, I don’t think the true impact of analytics comes from making standardized metrics available. It comes from answering the hardest questions: Where should we invest next? Where are the hidden opportunities? What risks are we not seeing yet?

Given that, I’m curious — based on what you've seen, does it still make sense to limit Genie primarily to power users, or have you seen success scaling this type of deep exploratory analytics more broadly across the organization?

Thanks again, and looking forward to hearing your thoughts.

Databricks Genie: Power Users vs. Broad Org Access