<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How can I execute a Spark SQL query inside a Unity Catalog Python UDF so I can run downstream ML? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-can-i-execute-a-spark-sql-query-inside-a-unity-catalog/m-p/121784#M46548</link>
    <description>&lt;P&gt;I want to build an LLM-driven chatbot using Agentic AI framework within Databricks. The idea is for the LLM to generate a SQL text string which then passed to a Unity Catalog-registered Python UDF tool. Within this tool,&amp;nbsp; I need the SQL to be executed (based on the SQL text string it receives from the LLM) so I can immediately run a machine learning model on the returned data. I am avoiding passing data directly to the Python UDF tool to avoid blowing token limits. This is the main reason for me trying to just pass a SQL text string to the Python UDF and have the SQL run within that.&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, any attempt to call spark.sql() or instantiate a SparkSession in my SQL-defined Python UDF fails under the SafeSpark sandbox (there is no global spark available, and SparkContext creation is blocked).&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Is there a supported way&lt;/STRONG&gt; for a SQL-defined Python UDF to invoke Spark SQL directly inside Unity Catalog?&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;If not&lt;/STRONG&gt;, what production-quality patterns let me register a “query-driven” Python function in UC—one that takes only a SQL string and under the hood fetches the DataFrame and applies ML logic?&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Similar questions have been asked before but without satisfactory resolution&lt;/P&gt;&lt;P&gt;Any pointers or examples would be greatly appreciated!&lt;/P&gt;</description>
    <pubDate>Sat, 14 Jun 2025 17:10:56 GMT</pubDate>
    <dc:creator>ravimaranganti</dc:creator>
    <dc:date>2025-06-14T17:10:56Z</dc:date>
    <item>
      <title>How can I execute a Spark SQL query inside a Unity Catalog Python UDF so I can run downstream ML?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-execute-a-spark-sql-query-inside-a-unity-catalog/m-p/121784#M46548</link>
      <description>&lt;P&gt;I want to build an LLM-driven chatbot using Agentic AI framework within Databricks. The idea is for the LLM to generate a SQL text string which then passed to a Unity Catalog-registered Python UDF tool. Within this tool,&amp;nbsp; I need the SQL to be executed (based on the SQL text string it receives from the LLM) so I can immediately run a machine learning model on the returned data. I am avoiding passing data directly to the Python UDF tool to avoid blowing token limits. This is the main reason for me trying to just pass a SQL text string to the Python UDF and have the SQL run within that.&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, any attempt to call spark.sql() or instantiate a SparkSession in my SQL-defined Python UDF fails under the SafeSpark sandbox (there is no global spark available, and SparkContext creation is blocked).&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Is there a supported way&lt;/STRONG&gt; for a SQL-defined Python UDF to invoke Spark SQL directly inside Unity Catalog?&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;If not&lt;/STRONG&gt;, what production-quality patterns let me register a “query-driven” Python function in UC—one that takes only a SQL string and under the hood fetches the DataFrame and applies ML logic?&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Similar questions have been asked before but without satisfactory resolution&lt;/P&gt;&lt;P&gt;Any pointers or examples would be greatly appreciated!&lt;/P&gt;</description>
      <pubDate>Sat, 14 Jun 2025 17:10:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-execute-a-spark-sql-query-inside-a-unity-catalog/m-p/121784#M46548</guid>
      <dc:creator>ravimaranganti</dc:creator>
      <dc:date>2025-06-14T17:10:56Z</dc:date>
    </item>
    <item>
      <title>Re: How can I execute a Spark SQL query inside a Unity Catalog Python UDF so I can run downstream ML</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-execute-a-spark-sql-query-inside-a-unity-catalog/m-p/134059#M50004</link>
      <description>&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;There is currently&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;no supported method for SQL-defined Python UDFs in Unity Catalog to invoke Spark SQL or access a SparkSession directly from within the SafeSpark sandbox&lt;/STRONG&gt;. This limitation is by design: the SafeSpark/Restricted Python Execution Environment enforced by Unity Catalog prohibits access to global Spark objects, SparkContext, or creation of new SparkSessions within the Python UDF scope. As a result, any pattern that expects the UDF to “consume” an arbitrary SQL query, run it, and fetch results within the UDF itself, is fundamentally unsupported in this environment .&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Why This Limitation Exists&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Security&lt;/STRONG&gt;: Unity Catalog enforces strict data boundary and execution context controls to protect data access and lineage.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Isolation&lt;/STRONG&gt;: UDF execution is sandboxed to prevent privilege escalation and lateral movement.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Resource Management&lt;/STRONG&gt;: Spark driver/session lifecycles are managed outside the scope of function execution, and cannot be manipulated inside the UDF.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Supported Patterns &amp;amp; Best Practices&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;If the architectural goal is to let LLM/Agentic AI output dynamically-generated SQL, there are recommended production-quality alternatives:&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;1.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Orchestrate SQL execution outside the UDF&lt;/STRONG&gt;&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Run the LLM-generated SQL outside the Python UDF, within the main Spark driver/notebook/job code.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Get the desired DataFrame as the result.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Pass the DataFrame to a registered Python function (not as a UDF, but e.g., as a Pandas UDF or apply logic directly).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This avoids token blow-up since only the result set—not the full data—is passed to the function.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;2.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Python Function Registration with Arguments&lt;/STRONG&gt;&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Register a function that receives the DataFrame as an argument,&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;EM&gt;not&lt;/EM&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;the SQL string.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Or, use a function that takes parameters for dynamic filtering/aggregation (but not an arbitrary SQL string) and run the query externally.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;3.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Use Databricks SQL functions or Procedures&lt;/STRONG&gt;&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;LLM generates SQL/procedure invocations.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Register more logic as SQL Procedures or Table-valued Functions (TVFs) that can internally run arbitrary SQL but with restricted logic, not arbitrary Python ML.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;4.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Agent Pattern Outside UDF&lt;/STRONG&gt;&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;LLM outputs&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;EM&gt;both&lt;/EM&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;the SQL string and a symbolic reference to which ML logic to apply.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The Spark driver/job runs the SQL statement, stores result in a temporary or managed view, then invokes the downstream Python ML function with the evaluated DataFrame.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;5.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;MLflow Pipelines/Mappings&lt;/STRONG&gt;&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;For cases where the operation is always “run this model on this query,” maintain a registry (MLflow, Delta Live Tables, custom mapping) from model-id ↔ ML function, and orchestrate with job workflows.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Why “SQL String as UDF Input” is Not Supported&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The UDF runs in a completely different (sandboxed) context without Spark connectivity.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;UDFs can manipulate Row input, not metafunction the Spark engine.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;No internal SparkSession is available; creating new ones is explicitly blocked.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Example: Supported Orchestration&lt;/H2&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl sticky top-0 flex h-0 items-start justify-end"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-[3px] font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="pr-lg"&gt;&lt;SPAN&gt;&lt;CODE&gt;&lt;SPAN class="token token"&gt;# Orchestration pattern outside UDF&lt;/SPAN&gt;
query &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; llm&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;generate_sql&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;query_context&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
df &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; spark&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;sql&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;query&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;# Apply ML logic&lt;/SPAN&gt;
result &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; my_ml_model&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;df&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;If LLM needs to select the ML logic, it can output both the query and the ML routine identifier.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Further Reading / References&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Databricks documentation on SafeSpark sandbox and Unity Catalog Python UDF limitations.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Databricks forum discussions on why spark.sql() is unavailable in Python UDFs.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Patterns for using ML and AI with Unity Catalog, best practice guides.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;In summary:&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;It is not possible to directly invoke Spark SQL from a SQL-defined Python UDF inside Unity Catalog. All production-quality solutions require running the SQL query and fetching results&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;EM&gt;outside&lt;/EM&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;the UDF, then feeding the result DataFrame to downstream logic (ML or otherwise).&lt;/P&gt;</description>
      <pubDate>Tue, 07 Oct 2025 12:12:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-execute-a-spark-sql-query-inside-a-unity-catalog/m-p/134059#M50004</guid>
      <dc:creator>mark_ott</dc:creator>
      <dc:date>2025-10-07T12:12:31Z</dc:date>
    </item>
  </channel>
</rss>

