Databricks Community

Zach_Jacobson23 · ‎06-02-2025

Introduction

A Databricks AI/BI Genie Space is a natural-language (text-to-SQL) interface that bridges the gap between business users and complex data analysis. Designed to translate everyday business questions into SQL queries, a Genie Space (commonly referred to simply as “Genie”) lets users interact with their data conversationally, with no coding required. By leveraging generative AI fine-tuned to an organization’s data, terminology, and context, Genie makes data exploration more intuitive and accessible. Domain experts can configure Genie Spaces with curated datasets, sample queries, and tailored instructions so that responses stay reliable and relevant. This leads to a more agile decision-making process, reduced reliance on technical teams for reporting, and scalable self-serve analytics across the business.

Fig. 1: Example Genie Space

An Agent-Based Approach

At a technical level, Genie’s domain specificity is one of its greatest strengths. Each Genie Space is linked to a curated set of tables in Unity Catalog, making it finely tuned to the structure, semantics, and context of a specific data domain. This setup works particularly well in scenarios tied to specific teams or business units, where a single Genie Space—containing 5–10 gold-level tables—can provide rich insights. These tables are well-defined, enriched with detailed metadata, and enhanced with example SQL queries and clear instructions.

However, business questions often span multiple domains. Simply adding more tables to one Space isn’t always effective and is not recommended. Instead, an agent-based approach offers a more scalable solution. By connecting multiple Genie Spaces—each focused on a specific domain or business function—an agent can act as an orchestrator, treating each Space as a tool to query when relevant. This allows users to tap into multiple specialized Genie Spaces at once, combining their insights to answer broader or more complex questions. Let’s take a closer look at how this works in practice, and walk through an example setup.

Example Use Case: An Agent Orchestrating Multiple Genie Spaces

Consider a large manufacturing company that operates several facilities across different regions. The operations team might use a Genie Space dedicated to production line efficiency, which includes metrics like machine uptime, throughput rates, and maintenance logs. Meanwhile, the supply chain team may rely on a separate Genie Space focused on inventory and logistics, containing data about raw material levels, supplier performance, and shipping schedules. Finance teams might manage their own Genie Space covering cost analysis and budget tracking, using tables that capture operational spend, procurement costs, and forecast data.

Individually, each of these Genie Spaces serves its audience well, allowing respective team members to ask questions about the underlying data in natural language. But what if a business user wants to understand how supply chain delays are impacting production throughput and overall cost per unit? No single Genie Space has the full picture. This is where an agent-based approach becomes powerful—coordinating across the production, supply chain, and finance Genie Spaces to answer cross-functional questions with precision and relevance.
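To make the orchestration idea concrete, here is a minimal Python sketch of an agent treating each Genie Space as a tool. Everything in it is illustrative: the `GenieTool` class, the tool names, and the keyword lists are hypothetical, and keyword matching is only a deterministic stand-in for the LLM's tool selection, which in a real agent is driven by the tool descriptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GenieTool:
    """One Genie Space exposed to the agent as a callable tool (hypothetical type)."""
    name: str
    description: str
    keywords: List[str]        # placeholder for LLM-based tool selection
    ask: Callable[[str], str]  # sends a question to the Space and returns its answer


def select_tools(question: str, tools: List[GenieTool]) -> List[GenieTool]:
    """Return every tool whose keywords appear in the question."""
    q = question.lower()
    return [t for t in tools if any(k in q for k in t.keywords)]


def answer(question: str, tools: List[GenieTool]) -> Dict[str, str]:
    """Query each relevant Space and collect the answers by tool name."""
    return {t.name: t.ask(question) for t in select_tools(question, tools)}


# Stub tools: in practice each `ask` would call a Genie Space.
tools = [
    GenieTool("production", "Machine uptime, throughput, maintenance logs",
              ["throughput", "uptime", "maintenance"],
              lambda q: "production answer"),
    GenieTool("supply_chain", "Inventory, suppliers, shipping schedules",
              ["supply chain", "supplier", "inventory", "shipping"],
              lambda q: "supply chain answer"),
    GenieTool("finance", "Operational spend, procurement costs, forecasts",
              ["cost", "budget", "spend"],
              lambda q: "finance answer"),
]

result = answer(
    "How are supply chain delays impacting production throughput and cost per unit?",
    tools,
)
```

With the cross-functional question above, all three stub tools are selected, and the agent would then combine their answers into a single response.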

Fig. 2: Agent Architecture Using Multiple Genie Spaces

Genie Spaces as Agent Tools Using Genie API

With the recent release of the Genie API in preview, Databricks Genie can now be embedded in AI applications. You can see a practical example of how a Genie conversation works with the API here. For our case, we can combine this API with the Databricks Agent Framework and Unity Catalog functions to give an agent access to one or more Genie Spaces. The example below shows how we might do this with modular functions that carry the necessary parameters and metadata. First we create our function:

Fig. 3: Unity Catalog Function - Genie API Call

%sql
CREATE OR REPLACE FUNCTION _genie_query(databricks_host STRING,
                 databricks_token STRING,
                 space_id STRING,
                 question STRING,
                 contextual_history STRING)
RETURNS STRING
LANGUAGE PYTHON
COMMENT 'This is an agent that you can converse with to get answers to questions. Try to provide simple questions and provide history if you had prior conversations.'
AS
$$
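The listing above stops at the opening `$$`, so the Python body is not shown. As a rough, unofficial sketch of what such a body could do (not the author's code), the standalone Python below starts a Genie conversation and polls the message until it finishes. The endpoint paths, response fields, and status values reflect my understanding of the Genie Conversation API preview and should be checked against the current API reference. The `send` parameter, the prompt format, and the polling defaults are assumptions made for illustration.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Statuses after which polling stops (assumed from the Genie API preview).
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED"}


def http_json(method: str, url: str, token: str, body: Optional[dict] = None) -> dict:
    """Send an authenticated JSON request and decode the JSON response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method, headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    })
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())


def genie_query(host: str, token: str, space_id: str, question: str,
                contextual_history: str, send: Callable[..., dict] = http_json,
                poll_seconds: float = 2.0, max_polls: int = 60) -> str:
    """Ask one Genie Space a question and return its answer as text."""
    base = f"{host.rstrip('/')}/api/2.0/genie/spaces/{space_id}"
    # Assumed prompt format: history is prepended because each call
    # starts a fresh conversation.
    prompt = f"Relevant history: {contextual_history}\nQuestion: {question}"
    started = send("POST", f"{base}/start-conversation", token, {"content": prompt})
    msg_url = (f"{base}/conversations/{started['conversation_id']}"
               f"/messages/{started['message_id']}")
    for _ in range(max_polls):
        message = send("GET", msg_url, token)
        if message.get("status") in TERMINAL:
            break
        time.sleep(poll_seconds)
    else:
        return "Genie did not finish in time."
    if message["status"] != "COMPLETED":
        return f"Genie ended with status {message['status']}."
    parts = []
    for att in message.get("attachments") or []:
        if att.get("text"):
            parts.append(att["text"].get("content", ""))
        if att.get("query"):
            parts.append(att["query"].get("description", ""))
            parts.append("SQL: " + att["query"].get("query", ""))
    return "\n".join(p for p in parts if p) or "Genie returned no content."
```

Inside a Unity Catalog Python function the same logic would sit between the `$$` delimiters and use the declared parameters directly. The polling budget is kept configurable because tool calls may be subject to execution time limits.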

Then we can use our new function in an agent:

Fig. 4: Unity Catalog Function - Wrapper to Query Genie Space

CREATE OR REPLACE FUNCTION chat_with_production_line_effiency(question STRING COMMENT "the question to ask about amazon music reviews",
                 contextual_history STRING COMMENT "Provide relevant history to be able to answer this question & assume genie doesn't keep track of history. Use 'no relevant history' if there is nothing relevant to answer the question.")
RETURNS STRING
LANGUAGE SQL
COMMENT 'This is an agent that you can converse with to get answers to questions about amazon music reviews. Try to provide simple questions and provide history if you had prior conversations.'
RETURN SELECT _genie_query(
  "https://e2-dogfood.staging.cloud.databricks.com/",
  secret("genie_scope", "databricks_host_token"), -- access token read from a secret scope
  secret("genie_scope", "databricks_space_id"),   -- Genie Space ID read from a secret scope
  question, -- retrieved from function
  contextual_history -- retrieved from function
);

As an added bonus, when we embed these Genie API calls in Unity Catalog functions, we automatically get the benefit of Unity Catalog’s built-in governance!

Fig. 5: Unity Catalog Governs UC Functions

With our two functions set up, we can now have our agent utilize them as needed. Let’s look at how easy it is to prototype agents with Databricks Playground.

Building a Prototype with Playground

With Databricks Playground, users can try out various LLMs with example prompts, get an out-of-the-box LLM as a judge, and attach custom tools (such as our Unity Catalog functions from above), all within the same UI!

Fig. 6: Databricks Playground

Let’s add our Genie-calling tool from above to an agent in Playground.

Fig. 7: Adding a Tool to an Agent in Playground

Once a conversation with an AI agent has been prototyped in Playground, users can export it to a notebook to test it further, begin a deployment process, or continue developing the underlying Python code. These notebooks make up the boilerplate or “starter” code for building a fully functional agent.

Agent in Action with Databricks Apps

From here, we could make finer code adjustments as needed or simply run the exported notebooks to deploy our new agent. For our example, we’ll execute the notebooks as-is, which gives us a deployed, ready-to-use agent served from a model serving endpoint. The deployed agent also includes the registered tools (in this case, UC functions) that it needs to call the appropriate Genie Space. Let’s walk through an example in which the agent selects the function for the Genie Space best suited to answer the question.

When the user asks a question about baby products, the agent routes it to the Genie Space that holds the baby product data.

Fig. 8: User Question - Genie Embedded in a Databricks App

When the appropriate tool is invoked—our Genie Space in this case—it responds by displaying the original question along with any relevant contextual history. Because this is our first query, there’s no prior context to show just yet.
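In the setup described here, the agent's LLM fills in the `contextual_history` argument itself. If a calling application assembled that string instead, a minimal sketch might look like the following. The helper name, the `Q: ... | A: ...` line format, and the size limits are all hypothetical. The `'no relevant history'` sentinel comes from the wrapper function's parameter comment.

```python
from typing import List, Tuple

# Sentinel named in the wrapper function's parameter comment.
NO_HISTORY = "no relevant history"


def build_contextual_history(turns: List[Tuple[str, str]],
                             max_turns: int = 3,
                             max_chars: int = 1000) -> str:
    """Summarize the most recent (question, answer) turns as one string.

    Returns the sentinel when there is nothing to pass along, since the
    wrapper assumes Genie does not keep track of history between calls.
    """
    recent = turns[-max_turns:] if max_turns > 0 else []
    if not recent:
        return NO_HISTORY
    text = "\n".join(f"Q: {q} | A: {a}" for q, a in recent)
    # Keep the tail so the newest turns survive truncation.
    return text[-max_chars:]


first_call = build_contextual_history([])
follow_up = build_contextual_history(
    [("Who is the top reviewer of baby products?", "The top customer ID and its review count")]
)
```

On the first query `first_call` is just the sentinel, which matches the empty context shown in the figure below.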

Fig. 9: Agent Logic

Next, Genie’s text-to-SQL engine responds with tabular results. The response includes the customer ID and how many times the top customer has reviewed baby products.

Fig. 10: Agent Provides Answer Based on Genie Results

It also includes the underlying SQL used to retrieve the data, which is helpful for debugging:

Fig. 11: Example SQL Generated by Genie

WITH ranked_reviews AS (
 SELECT
   `amazon_review_books`.`customer_id`,
   COUNT(`amazon_review_books`.`review_id`) AS review_count,
   ROW_NUMBER() OVER (ORDER BY COUNT(`amazon_review_books`.`review_id`) DESC) AS rank
 FROM
   `mfg_mid_central_sa`.`zach_jacobson`.`amazon_review_books`
 WHERE
   MONTH(`amazon_review_books`.`review_date`) = 7
 GROUP BY
   `amazon_review_books`.`customer_id`
)
SELECT
 `customer_id`,
 `review_count`
FROM
 ranked_reviews
WHERE
 rank = 1

We can now follow the same steps to add additional Genie Spaces as tools. The result is a fully operational agent that routes user questions to the appropriate Genie Space based on context. Each Genie Space is exposed as a Unity Catalog function, so access to it is governed consistently. Using Databricks Playground, we moved quickly from a prototype to a deployed agent embedded in a Databricks App for easy access and interaction.
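Because every additional Genie Space repeats the same wrapper pattern, the DDL can be generated from a template. The Python sketch below is one way to do that, not part of the original walkthrough. The `render_wrapper` helper, the example host, and the per-space secret key name are hypothetical. The `_genie_query` call and the `genie_scope` and `databricks_host_token` names follow the wrapper shown earlier, and `secret(scope, key)` is the Databricks SQL built-in for reading a secret.

```python
WRAPPER_TEMPLATE = """CREATE OR REPLACE FUNCTION chat_with_{domain}(
  question STRING COMMENT "the question to ask about {topic}",
  contextual_history STRING COMMENT "Provide relevant history, or 'no relevant history' if nothing is relevant."
)
RETURNS STRING
LANGUAGE SQL
COMMENT "This is an agent that you can converse with to get answers to questions about {topic}."
RETURN SELECT _genie_query(
  "{host}",
  secret("{scope}", "databricks_host_token"),
  secret("{scope}", "{space_key}"),
  question,
  contextual_history
);"""


def render_wrapper(domain: str, topic: str, host: str, scope: str, space_key: str) -> str:
    """Render the CREATE FUNCTION statement for one Genie Space wrapper."""
    if not domain.isidentifier():
        raise ValueError(f"domain must be a valid identifier: {domain!r}")
    # Reject characters that would break out of the quoted SQL strings.
    for value in (topic, host, scope, space_key):
        if any(ch in value for ch in "\"'\\"):
            raise ValueError(f"unsupported character in value: {value!r}")
    return WRAPPER_TEMPLATE.format(domain=domain, topic=topic, host=host,
                                   scope=scope, space_key=space_key)


ddl = render_wrapper(
    domain="supply_chain",
    topic="inventory and logistics",
    host="https://example.cloud.databricks.com",  # placeholder workspace URL
    scope="genie_scope",
    space_key="supply_chain_space_id",            # hypothetical secret key
)
```

Each rendered statement can then be run in a `%sql` cell or through a SQL warehouse, giving the agent one governed function per Genie Space.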

Conclusion

Databricks AI/BI Genie Spaces bring a new level of accessibility and agility to data-driven decision-making by combining natural language interfaces with curated datasets and domain-specific expertise. The agent-based approach takes this even further—enabling seamless collaboration across multiple specialized Genie Spaces to tackle complex, enterprise-wide questions. With the integration of the new Genie API, Unity Catalog functions, and Databricks Playground, building and deploying intelligent, context-aware agents is now faster and more streamlined than ever.

aviadfe · ‎06-02-2025

Hi, thanks for the great article!
In Fig. 3: Unity Catalog Function - Genie API Call, it looks like the code snippet showing the actual call to the Genie Space is cut off, so the code itself isn’t visible.

I’ve created a similar Unity Catalog function myself. I was able to run it successfully as a SQL function both inside a notebook and on SQL compute. However, when I tried to use it within the Playground, it failed to execute and just returned a message saying the operation was stopped after 30 seconds with no success.

Zach_Jacobson23 · ‎06-05-2025

Make sure you are creating all of the functions in the same catalog and schema. You need to set those in the notebook. If it still doesn't work, you might have to send some screenshots.

akr · ‎06-20-2025

I feel there should be a simpler way to make Genie available as a tool in agents, something like a wrapper system.ai function. This could make it much easier to integrate.

In a few cases, Genie returns results that exceed the token limits of LLMs. Is there any way to avoid such cases?
