Databricks Community

michael365 · ‎05-20-2026

Hi everybody,

I've been using Genie for some time, and it is really great. To improve our setup, and before recommending it to more users within our company, I need more insight into Genie, including but not limited to:

Is it correct that the “thinking” part is not executed on the associated Databricks clusters, but rather on a VM (or similar runtime) on cloud provider resources managed by Databricks?
If so, is there only one LLM ‘execution engine’ per (Azure) subscription? Or per (Azure) tenant? Or multiple?
Is the LLM ‘execution engine’ the same for Genie Spaces and Genie Code?
Or how many are available in the background?
Performance within our environment is decreasing month by month. Currently it takes 20 seconds to get a response to "What can you do for me?" 😞
What can I do to change it? Request a change to our Databricks account?
Which (customized) LLM is used by Genie? GPT or Claude?

It would be great to get insights on this, as otherwise using Genie and Databricks more will be difficult.

Ashwin_DSA · ‎05-21-2026

Hi @michael365,

Glad you like Genie Code. I answered similar questions recently. Here is a link to that post. It also has some documentation links.

Now, to your questions. At a high level, yes. For Genie Spaces, the "thinking" or natural-language interpretation is not happening on the attached SQL warehouse itself. Databricks describes Genie as a compound AI system, and the public architecture/docs show the LLM-assisted interpretation happening in Databricks-managed AI services, after which Genie generates read-only SQL that is executed on the SQL warehouse.

On the question of whether there is one single execution engine per Azure subscription or tenant, I have not seen public documentation that defines it that way. What the docs do say is that these AI features are designated services, routed according to Geo/region, and the actual model/provider can vary by feature and mode rather than mapping cleanly to a single tenant- or subscription-level engine.

It is also not quite correct to assume that Genie Spaces and Genie Code always use the same backend. Public docs show that Genie Spaces chat uses Azure AI Services / Azure OpenAI, while Genie Code chat and cell actions also use Azure AI Services, and Agent mode can use Azure AI Services or Anthropic on Databricks, depending on the feature and region. In some configurations, Genie Code can also use Databricks-hosted models when partner-powered AI features are disabled.

For the performance issue, I would first separate SQL execution time from overall Genie response time. If the generated SQL is slow, then the warehouse/query path is the main thing to optimise. If the SQL is relatively fast but the end-to-end response is still slow, the bottleneck is more likely in Genie itself: for example, too much context, too many tables/columns, ambiguous semantics, or too much free-form instruction text. Databricks’ own guidance is to keep the space narrow, simplify the model with curated views or metric views, use stronger metadata/knowledge-store/example SQL, and benchmark with realistic questions.

So my practical recommendation would be to measure a few representative prompts end-to-end, compare them with the SQL execution times in Query History, and then tune the right layer. If the issue is repeatable and worsening over time, it would also be worth raising with Databricks support, along with concrete examples, timestamps, workspace/space details, and a latency breakdown. That will make it much easier to tell whether the issue is space design, warehouse performance, regional routing, or service-side latency.
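As a rough illustration of that latency breakdown, here is a minimal Python sketch. It assumes you have already noted, by hand, the end-to-end time of each prompt and the matching SQL execution time from Query History. The names `PromptTiming`, `breakdown`, `likely_layer`, and the 0.5 threshold are hypothetical choices for this example, not part of any Databricks API, and all sample numbers other than the 20 seconds reported above are made up.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class PromptTiming:
    prompt: str
    end_to_end_s: float  # wall-clock time from sending the prompt to the full answer
    sql_s: float         # SQL execution time noted from Query History (0.0 if no SQL ran)


def breakdown(t: PromptTiming) -> Dict[str, float]:
    """Split one measurement into SQL time and everything else."""
    non_sql_s = max(t.end_to_end_s - t.sql_s, 0.0)
    sql_share = t.sql_s / t.end_to_end_s if t.end_to_end_s > 0 else 0.0
    return {"sql_s": t.sql_s, "non_sql_s": non_sql_s, "sql_share": sql_share}


def likely_layer(timings: List[PromptTiming], sql_share_threshold: float = 0.5) -> str:
    """Heuristic only: a high median SQL share points at the warehouse/query path."""
    shares = [breakdown(t)["sql_share"] for t in timings]
    if median(shares) >= sql_share_threshold:
        return "warehouse/query path"
    return "Genie/service side"


if __name__ == "__main__":
    samples = [
        # 20 s is the figure from this thread; sql_s=0.0 assumes no SQL ran for it.
        PromptTiming("What can you do for me?", end_to_end_s=20.0, sql_s=0.0),
        # Illustrative, invented measurements:
        PromptTiming("Revenue by region last quarter", end_to_end_s=35.0, sql_s=4.0),
        PromptTiming("Top 10 customers by orders", end_to_end_s=28.0, sql_s=3.0),
    ]
    for s in samples:
        print(s.prompt, breakdown(s))
    print("Tune first:", likely_layer(samples))
```

A table like this, with timestamps added, is also the kind of concrete evidence that is useful to attach to a support ticket.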

On the model question specifically, the safest public answer today is "it depends on the feature and configuration" rather than "it is always GPT" or "it is always Claude." The documented providers across Databricks AI assistive features are Azure OpenAI / Azure AI Services, Anthropic on Databricks for some agentic modes, and Databricks-hosted models in some non-partner-powered configurations.

Hope this helps.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

michael365 · ‎05-26-2026

Hi @Ashwin_DSA

Thanks for your response.

Regarding the performance issue, the SQL is fast but the end-to-end response is still slow, so I assume that the bottleneck is Genie itself.
I also reduced the size of the space in terms of the number of data sets added, and improved and curated it based on the documentation provided by Databricks, but it is still slow.

The performance was much better in, for example, February and March this year and has slowly degraded over the last weeks. So the same Genie Spaces and the same prompts need more time than they did weeks ago.

My assumption therefore is that more users are using the same execution engine, or that it does not scale.

Finally, my current perspective is that Genie Spaces can (unfortunately) not be offered to or used by business teams. They understand that complex prompts will need some minutes, but the answer to "What can you do for me?" takes 20 seconds, even in Chat mode. This is far too long.

I will now run further tests and measure performance again. If that does not help, I will raise a ticket with the Databricks support team.
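A minimal sketch of how such repeated measurements could be scripted, so that results from different weeks are comparable. The `ask` callable is a placeholder for however a prompt is actually submitted (it is an assumption for this example, not a real Genie client), and `time_prompt`, `summarize`, and `slowdown_factor` are hypothetical helper names.

```python
import time
from statistics import median
from typing import Callable, Dict, List


def time_prompt(ask: Callable[[str], object], prompt: str, runs: int = 5) -> List[float]:
    """Call ask(prompt) several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        ask(prompt)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Median is less sensitive to a single slow outlier than the mean."""
    return {
        "runs": len(durations),
        "median_s": median(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }


def slowdown_factor(baseline_median_s: float, current_median_s: float) -> float:
    """How many times slower the current median is compared with an earlier baseline."""
    return current_median_s / baseline_median_s


if __name__ == "__main__":
    # Stand-in for the real call; replace with whatever submits the prompt to the space.
    def fake_ask(prompt: str) -> str:
        time.sleep(0.01)
        return "stub answer"

    stats = summarize(time_prompt(fake_ask, "What can you do for me?", runs=3))
    print(stats)
```

Running the same prompts with the same number of runs each week, and recording the medians with timestamps, gives a trend that can be attached to a support ticket.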

Thanks again for your help.


How does Genie look under the hood
