<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Ai query parallel calls in Generative AI</title>
    <link>https://community.databricks.com/t5/generative-ai/ai-query-parallel-calls/m-p/146809#M1602</link>
    <description>&lt;P&gt;I’m trying to optimize ai_query calls on a table and wanted to get some ideas.&lt;/P&gt;&lt;P&gt;So far, I’ve tried repartitioning the DataFrame before running spark.sql(ai_query), but I didn’t see any meaningful performance gains. I also experimented with running multiple instances of the same notebook in parallel, but the improvements were marginal.&lt;/P&gt;&lt;P&gt;Has anyone tried a different approach that worked better? Any suggestions on how to improve performance or scale this more efficiently?&lt;/P&gt;</description>
    <pubDate>Wed, 04 Feb 2026 13:12:38 GMT</pubDate>
    <dc:creator>joaoaugustofb</dc:creator>
    <dc:date>2026-02-04T13:12:38Z</dc:date>
    <item>
      <title>Ai query parallel calls</title>
      <link>https://community.databricks.com/t5/generative-ai/ai-query-parallel-calls/m-p/146809#M1602</link>
      <description>&lt;P&gt;I’m trying to optimize ai_query calls on a table and wanted to get some ideas.&lt;/P&gt;&lt;P&gt;So far, I’ve tried repartitioning the DataFrame before running spark.sql(ai_query), but I didn’t see any meaningful performance gains. I also experimented with running multiple instances of the same notebook in parallel, but the improvements were marginal.&lt;/P&gt;&lt;P&gt;Has anyone tried a different approach that worked better? Any suggestions on how to improve performance or scale this more efficiently?&lt;/P&gt;</description>
      <pubDate>Wed, 04 Feb 2026 13:12:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/ai-query-parallel-calls/m-p/146809#M1602</guid>
      <dc:creator>joaoaugustofb</dc:creator>
      <dc:date>2026-02-04T13:12:38Z</dc:date>
    </item>
    <item>
      <title>Re: Ai query parallel calls</title>
      <link>https://community.databricks.com/t5/generative-ai/ai-query-parallel-calls/m-p/147323#M1614</link>
      <description>&lt;P&gt;When you use ai_query(), two components determine performance:&lt;/P&gt;
&lt;DIV&gt;
&lt;OL&gt;
&lt;LI&gt;Model serving endpoint&lt;/LI&gt;
&lt;LI&gt;SQL warehouse / compute cluster&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Most likely, throughput is capped by the model serving endpoint's concurrency limit rather than by Spark parallelism, which would explain why repartitioning and running parallel notebooks made little difference. Reference: &lt;A href="https://docs.databricks.com/aws/en/machine-learning/model-serving/model-serving-limits" target="_self"&gt;https://docs.databricks.com/aws/en/machine-learning/model-serving/model-serving-limits&lt;/A&gt;&lt;/P&gt;
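&lt;P&gt;As a back-of-the-envelope sketch of why an endpoint concurrency limit can dominate, the Python snippet below estimates a lower bound on batch time. The function name and all numbers (row count, limits, 2 seconds per request) are hypothetical assumptions for illustration, not Databricks defaults or measured values:&lt;/P&gt;

```python
import math


def estimate_batch_seconds(num_rows: int, concurrency_limit: int, seconds_per_request: float) -> float:
    """Rough lower bound on wall-clock time when an endpoint caps concurrent requests.

    Assumes one request per row and a constant per-request latency (both simplifications).
    """
    if not concurrency_limit > 0:
        raise ValueError("concurrency_limit must be positive")
    # Only `concurrency_limit` requests run at once, so rows are processed in waves.
    waves = math.ceil(num_rows / concurrency_limit)
    return waves * seconds_per_request


# Hypothetical workload: 100,000 rows at 2 seconds per request.
for limit in (16, 64, 256):
    print(limit, estimate_batch_seconds(100_000, limit, 2.0))
```

&lt;P&gt;Under these assumptions the estimate depends only on the limit and the per-request latency, so adding Spark partitions or notebooks beyond the limit would not shorten the run.&lt;/P&gt;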
&lt;P&gt;Can you share more about your model serving endpoint?&lt;/P&gt;
&lt;/DIV&gt;</description>
      <pubDate>Fri, 06 Feb 2026 22:17:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/ai-query-parallel-calls/m-p/147323#M1614</guid>
      <dc:creator>pavannaidu</dc:creator>
      <dc:date>2026-02-06T22:17:47Z</dc:date>
    </item>
  </channel>
</rss>

