Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

ai_query parallel calls

joaoaugustofb
New Contributor

I’m trying to optimize ai_query calls on a table and wanted to get some ideas.

So far, I’ve tried repartitioning the DataFrame before running spark.sql(ai_query), but I didn’t see any meaningful performance gains. I also experimented with running multiple instances of the same notebook in parallel, but the improvements were marginal.
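A minimal sketch of the pattern described above, with placeholder table, column, and endpoint names (`my_table`, `prompt`, `my-endpoint`):

```sql
-- ai_query fans one request per row out to the serving endpoint;
-- repartitioning changes how rows are distributed across tasks,
-- but total throughput is still bounded by the endpoint.
SELECT
  prompt,
  ai_query('my-endpoint', prompt) AS response
FROM my_table;
```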

Has anyone tried a different approach that worked better? Any suggestions on how to improve performance or scale this more efficiently?

1 REPLY

pavannaidu
Databricks Employee

When you use ai_query(), performance depends on two main components:

  1. Model serving endpoint
  2. SQL warehouse / Compute cluster  

Very likely, the performance is throttled by the model-serving endpoint's concurrency limit. Reference: https://docs.databricks.com/aws/en/machine-learning/model-serving/model-serving-limits
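To see why the endpoint's concurrency limit dominates, here is a small stand-alone Python sketch. The endpoint call is mocked with a fixed-latency function (a placeholder, not the real serving API): with per-request latency held constant, wall-clock time for a batch scales with `ceil(rows / concurrency)`, so adding Spark partitions or notebooks beyond the endpoint's concurrency cap buys nothing.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for one model-serving request; a real call
# would be an HTTP request to the endpoint. The sleep models a fixed
# per-request latency.
def call_endpoint(prompt, latency=0.05):
    time.sleep(latency)
    return f"response:{prompt}"

def run_batch(prompts, concurrency):
    """Send all prompts with at most `concurrency` requests in flight."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(call_endpoint, prompts))
    return results, time.perf_counter() - start

# Same 40 prompts, different concurrency caps: roughly 10 waves vs 3.
prompts = [f"p{i}" for i in range(40)]
_, t_low = run_batch(prompts, concurrency=4)
_, t_high = run_batch(prompts, concurrency=16)
print(f"4 concurrent: {t_low:.2f}s, 16 concurrent: {t_high:.2f}s")
```

The practical takeaway is the same as above: check the endpoint's provisioned concurrency (see the model-serving limits page) before tuning the Spark side.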

Can you share more about your model serving endpoint?