ai_query parallel calls

joaoaugustofb · 02-04-2026

I’m trying to optimize ai_query calls on a table and wanted to get some ideas.

So far, I’ve tried repartitioning the DataFrame before running spark.sql(ai_query), but I didn’t see any meaningful performance gains. I also experimented with running multiple instances of the same notebook in parallel, but the improvements were marginal.
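For context, here is a minimal sketch of the kind of setup described above. The table name, column name, and endpoint name are hypothetical placeholders, not the real ones from my workspace. The helper only builds the SQL string, so it runs anywhere; the Spark calls are shown as comments because they need a Databricks notebook where `spark` is predefined.

```python
def build_ai_query_sql(table: str, endpoint: str, text_col: str) -> str:
    """Build a SQL statement that applies ai_query to every row of `table`.

    All three arguments are hypothetical placeholders for illustration.
    """
    return (
        f"SELECT *, ai_query('{endpoint}', {text_col}) AS response "
        f"FROM {table}"
    )


sql = build_ai_query_sql("reviews", "my-endpoint", "review_text")
print(sql)

# Roughly what was tried in the notebook (assumes `spark` exists):
# df = spark.table("reviews").repartition(64)
# df.createOrReplaceTempView("reviews_repartitioned")
# result = spark.sql(
#     build_ai_query_sql("reviews_repartitioned", "my-endpoint", "review_text")
# )
```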

Has anyone tried a different approach that worked better? Any suggestions on how to improve performance or scale this more efficiently?

pavannaidu · 02-06-2026

When you are using ai_query(), there are two main aspects to performance:

1. Model serving endpoint
2. SQL warehouse / compute cluster

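Most likely, throughput is capped by the model serving endpoint's concurrency limit. That would also explain why repartitioning and running extra notebook instances made little difference: adding client-side parallelism does not help once the endpoint is saturated. Reference: https://docs.databricks.com/aws/en/machine-learning/model-serving/model-serving-limits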
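To make the bottleneck concrete, here is a rough back-of-envelope sketch in plain Python. It assumes the endpoint handles at most `concurrency` requests at a time and that each request takes about `latency_s` seconds. Both numbers below are made up for illustration and are not actual Databricks limits, so substitute the values for your own endpoint.

```python
import math


def estimated_runtime_seconds(rows: int, concurrency: int, latency_s: float) -> float:
    """Rough lower bound on wall-clock time when the endpoint is the bottleneck.

    Requests are modeled as running in "waves" of at most `concurrency`
    simultaneous calls, each wave taking `latency_s` seconds.
    """
    if rows < 0 or concurrency <= 0 or latency_s <= 0:
        raise ValueError("rows must be >= 0; concurrency and latency_s must be > 0")
    waves = math.ceil(rows / concurrency)
    return waves * latency_s


# Illustrative numbers only (assumptions, not real limits).
baseline = estimated_runtime_seconds(rows=100_000, concurrency=16, latency_s=2.0)
scaled = estimated_runtime_seconds(rows=100_000, concurrency=64, latency_s=2.0)
print(baseline, scaled)
```

The number of Spark partitions or parallel notebooks does not appear in this estimate at all. Under these assumptions, only raising the endpoint's concurrency or lowering per-request latency shortens the run.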

Can you share more about your model serving endpoint?