Databricks Community

michael365 · Wednesday

Hi everybody,

I have been using Genie for some time, and it is really great. Before recommending it to more users within our company, I need more insight into how Genie works, including but not limited to:

Is it correct that the “thinking” part is executed not on the associated Databricks clusters but on a VM (or similar runtime) on cloud provider resources managed by Databricks?
If so, is there only one LLM ‘execution engine’ per (Azure) subscription? Or per (Azure) tenant? Or multiple?
Is the LLM ‘execution engine’ the same for Genie Spaces and Genie Code?
If not, how many are available in the background?
Performance within our environment is decreasing month by month. Currently it takes 20 seconds to get a response to "What can you do for me?" 😞
What can I do to improve it? Should I request a change to our Databricks account?
Which (customized) LLM is used by Genie? GPT or Claude?

It would be great to get some insight into this, as otherwise expanding our use of Genie and Databricks will be difficult.

Ashwin_DSA · Thursday

Hi @michael365,

Glad you like Genie Code. I answered similar questions recently. Here is a link to that post. It also has some documentation links.

Now, to your questions. At a high level, yes. For Genie Spaces, the "thinking" (the natural-language interpretation) does not happen on the attached SQL warehouse itself. Databricks describes Genie as a compound AI system, and the public architecture docs show the LLM-assisted interpretation happening in Databricks-managed AI services, after which Genie generates read-only SQL that is executed on the SQL warehouse.

On the question of whether there is one single execution engine per Azure subscription or tenant, I have not seen public documentation that defines it that way. What the docs do say is that these AI features are designated services, routed according to Geo/region, and the actual model/provider can vary by feature and mode rather than mapping cleanly to a single tenant- or subscription-level engine.

It is also not quite correct to assume that Genie Spaces and Genie Code always use the same backend. Public docs show that Genie Spaces chat uses Azure AI Services / Azure OpenAI, while Genie Code chat and cell actions also use Azure AI Services, and Agent mode can use Azure AI Services or Anthropic on Databricks, depending on the feature and region. In some configurations, Genie Code can also use Databricks-hosted models when partner-powered AI features are disabled.

For the performance issue, I would first separate SQL execution time from overall Genie response time. If the generated SQL is slow, then the warehouse/query path is the main thing to optimise. If the SQL is relatively fast but the end-to-end response is still slow, the bottleneck is more likely in the Genie space itself: for example, too much context, too many tables/columns, ambiguous semantics, or too much free-form instruction text. Databricks’ own guidance is to keep the space narrow, simplify the model with curated views or metric views, use stronger metadata/knowledge-store/example SQL, and benchmark with realistic questions.

So my practical recommendation would be to measure a few representative prompts end-to-end, compare them with the SQL execution times in Query History, and then tune the right layer. If the issue is repeatable and worsening over time, it would also be worth raising with Databricks support, along with concrete examples, timestamps, workspace/space details, and a latency breakdown. That will make it much easier to tell whether the issue is space design, warehouse performance, regional routing, or service-side latency.
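As a rough sketch of that latency breakdown: the snippet below assumes you time each prompt by hand (or with a stopwatch) and copy the matching SQL duration from Query History. It does not call any Databricks API. The `Sample` class, the helper names, and the 50% threshold are all made up for illustration, and apart from the 20-second "What can you do for me?" case from the question (where I am assuming no SQL ran), the numbers are placeholders.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Sample:
    """One manually recorded Genie prompt (hypothetical structure)."""
    prompt: str
    end_to_end_s: float  # wall-clock time from sending the prompt to the full answer
    sql_s: float         # duration of the generated SQL from Query History (0 if none ran)

    @property
    def non_sql_s(self) -> float:
        # Everything that is not SQL execution: interpretation, SQL generation, rendering.
        return max(self.end_to_end_s - self.sql_s, 0.0)


def dominant_layer(sample: Sample, sql_share_threshold: float = 0.5) -> str:
    """Label which layer accounts for most of the latency.

    The threshold is an arbitrary assumption, not Databricks guidance.
    """
    if sample.end_to_end_s <= 0:
        raise ValueError("end_to_end_s must be positive")
    sql_share = sample.sql_s / sample.end_to_end_s
    return "warehouse/query" if sql_share >= sql_share_threshold else "genie/service"


def summarize(samples: list[Sample]) -> dict[str, float]:
    """Median latencies across all samples, useful for a support ticket."""
    return {
        "median_end_to_end_s": median(s.end_to_end_s for s in samples),
        "median_sql_s": median(s.sql_s for s in samples),
        "median_non_sql_s": median(s.non_sql_s for s in samples),
    }


# Placeholder measurements; replace with your own observations.
samples = [
    Sample("What can you do for me?", end_to_end_s=20.0, sql_s=0.0),
    Sample("Total revenue by month", end_to_end_s=14.0, sql_s=9.5),
    Sample("Top 10 customers", end_to_end_s=8.0, sql_s=1.5),
]

for s in samples:
    print(f"{s.prompt!r}: total={s.end_to_end_s}s sql={s.sql_s}s -> {dominant_layer(s)}")
print(summarize(samples))
```

Prompts labelled "warehouse/query" point at the SQL path, and the rest point at the space design or the service side. A table like this, recorded over a few days with timestamps, is the kind of evidence worth attaching to a support ticket.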

On the model question specifically, the safest public answer today is "it depends on the feature and configuration" rather than "it is always GPT" or "it is always Claude." The documented providers across Databricks AI assistive features are Azure OpenAI / Azure AI Services, Anthropic on Databricks for some agentic modes, and Databricks-hosted models in some non-partner-powered configurations.

Hope this helps.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***