I’m trying to optimize `ai_query` calls on a table and wanted to get some ideas.
So far, I’ve tried repartitioning the DataFrame before running the `ai_query` call through `spark.sql()`, but I didn’t see any meaningful performance gain. I also experimented with running multiple instances of the same notebook in parallel, but the improvement was marginal.
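For context, this is roughly the shape of the query I’m running (the endpoint name, table, and column are placeholders, not my actual names):

```sql
-- Placeholder names; the real query calls ai_query() against a model serving endpoint
SELECT
  input_text,
  ai_query('my-serving-endpoint', input_text) AS response
FROM my_catalog.my_schema.my_table
```

The repartitioning was done on the source DataFrame before registering it and issuing the query above.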
Has anyone tried a different approach that worked better? Any suggestions on how to improve performance or scale this more efficiently?