<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Python UDF support in Unity Catalog and runtime 13.3? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/python-udf-support-in-unity-catalog-and-runtime-13-3/m-p/115449#M45085</link>
    <description>&lt;P class="p1"&gt;Great question — and yeah, what you’re seeing &lt;I&gt;is&lt;/I&gt; a bit of a confusing experience that trips up a lot of folks working with Unity Catalog (UC). Let’s break it down:&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;&lt;span class="lia-unicode-emoji" title=":white_heavy_check_mark:"&gt;✅&lt;/span&gt; What’s Working for You&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql.types import LongType
def squared_typed(s):
    return s * s
spark.udf.register("squaredWithPython", squared_typed, LongType())&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;This works because you’re using a &lt;SPAN class="s1"&gt;&lt;STRONG&gt;SQL-style Python UDF&lt;/STRONG&gt;&lt;/SPAN&gt; registered directly via &lt;SPAN class="s2"&gt;spark.udf.register&lt;/SPAN&gt;, which executes outside the context of a DataFrame transformation. This approach is currently &lt;SPAN class="s1"&gt;&lt;STRONG&gt;supported&lt;/STRONG&gt;&lt;/SPAN&gt; in Unity Catalog.&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;&lt;span class="lia-unicode-emoji" title=":cross_mark:"&gt;❌&lt;/span&gt; What’s Failing&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql.functions import udf
from pyspark.sql.types import LongType
# Assumed to be the plain function defined earlier on the docs page
def squared(s):
    return s * s
squared_udf = udf(squared, LongType())
df = spark.table("test")
display(df.select("id", squared_udf("id").alias("id_squared")))&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;This version creates a &lt;SPAN class="s1"&gt;&lt;STRONG&gt;Python UDF as a Catalyst expression&lt;/STRONG&gt;&lt;/SPAN&gt; (i.e., it gets embedded into the logical plan of the query). Unity Catalog currently &lt;SPAN class="s1"&gt;&lt;STRONG&gt;does not support&lt;/STRONG&gt;&lt;/SPAN&gt; this style of Python UDF — even though you’re on a supported runtime (13.3 LTS+), UC adds additional restrictions for security and governance reasons.&lt;/P&gt;
&lt;P class="p1"&gt;That error:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;AnalysisException: [UC_COMMAND_NOT_SUPPORTED.WITHOUT_RECOMMENDATION] ...&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;is a clear indicator that the &lt;SPAN class="s1"&gt;&lt;STRONG&gt;execution path of a DataFrame with embedded Python UDFs&lt;/STRONG&gt;&lt;/SPAN&gt; is not allowed under Unity Catalog at the moment.&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;🧠 The Core Issue: Unity Catalog Restrictions&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P class="p1"&gt;Unity Catalog is much stricter than the older Hive Metastore when it comes to execution context — particularly with arbitrary Python execution, which can violate the isolation/security model UC is enforcing. Python UDFs embedded inside DataFrames can execute Python code on the worker nodes in ways that UC doesn’t yet support.&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;&lt;span class="lia-unicode-emoji" title=":white_heavy_check_mark:"&gt;✅&lt;/span&gt; Workarounds&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P class="p1"&gt;Here’s what you can do:&lt;/P&gt;
&lt;OL start="1"&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Use SQL-style UDFs via spark.udf.register(...)&lt;/STRONG&gt;&lt;SPAN class="s1"&gt; (like you did).&lt;/SPAN&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Use SQL functions or Spark native functions whenever possible.&lt;/STRONG&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;For more complex logic, consider &lt;SPAN class="s1"&gt;&lt;STRONG&gt;Pandas UDFs&lt;/STRONG&gt;&lt;/SPAN&gt;, which have better support (but still limited under UC)&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;&lt;span class="lia-unicode-emoji" title=":magnifying_glass_tilted_left:"&gt;🔍&lt;/span&gt; TL;DR&lt;/STRONG&gt;&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;You’re not doing anything wrong — it’s a known limitation of Unity Catalog.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Python UDFs in DataFrame operations are not supported under Unity Catalog&lt;/STRONG&gt;&lt;SPAN class="s1"&gt; (even on Runtime 13.3 LTS+).&lt;/SPAN&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;Stick with &lt;SPAN class="s1"&gt;spark.udf.register(...)&lt;/SPAN&gt; or refactor to native Spark logic if you’re in UC.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Hope this helps. Louis.&lt;/P&gt;</description>
    <pubDate>Tue, 15 Apr 2025 01:34:04 GMT</pubDate>
    <dc:creator>Louis_Frolio</dc:creator>
    <dc:date>2025-04-15T01:34:04Z</dc:date>
    <item>
      <title>Python UDF support in Unity Catalog and runtime 13.3?</title>
      <link>https://community.databricks.com/t5/data-engineering/python-udf-support-in-unity-catalog-and-runtime-13-3/m-p/115435#M45083</link>
      <description>&lt;P&gt;Hi community,&lt;BR /&gt;I am running Databricks Unity Catalog. In the Databricks UI, I see the Policy "&lt;SPAN&gt;shared-gp-(r6g)-small" and Runtime 13.3. (I have access to larger instances; I am just running a PoC on a small instance.)&lt;BR /&gt;&lt;BR /&gt;Can anyone explain what looks like an inconsistency between the documentation and what I am seeing?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;The documentation &lt;A href="https://docs.databricks.com/aws/en/udf/unity-catalog" target="_self"&gt;here&lt;/A&gt; states that Python UDFs are supported in a&amp;nbsp;&lt;SPAN&gt;cluster running&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;Databricks Runtime&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;13.3 LTS or above.&lt;BR /&gt;On &lt;A href="https://docs.databricks.com/aws/en/udf/python" target="_self"&gt;this page&lt;/A&gt;, there is sample code to create a UDF in Python.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;This code, taken from that page, works for me:&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;EM&gt;from pyspark.sql.types import LongType&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;def squared_typed(s):&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return s * s&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&lt;EM&gt;spark.udf.register("squaredWithPython", squared_typed, LongType())&lt;/EM&gt;&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;spark.range(1, 20).createOrReplaceTempView("test") &lt;/EM&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;However, when I run the next snippet of sample code from that page:&lt;BR /&gt;&lt;EM&gt;from pyspark.sql.functions import udf&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;from pyspark.sql.types import LongType&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;squared_udf = udf(squared, LongType())&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;df = spark.table("test")&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;display(df.select("id", squared_udf("id").alias("id_squared")))&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;BR /&gt;I get an error:&lt;BR /&gt;&lt;EM&gt;&lt;SPAN class=""&gt;AnalysisException: &lt;/SPAN&gt;[&lt;A class="" href="https://docs.databricks.com/error-messages/uc-command-not-supported-error-class.html#without_recommendation" target="_blank" rel="noopener noreferrer"&gt;UC_COMMAND_NOT_SUPPORTED.WITHOUT_RECOMMENDATION&lt;/A&gt;] The command(s): org.apache.spark.sql.catalyst.expressions.PythonUDF are not supported in Unity Catalog. ; &lt;/EM&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;Is there something I'm missing about how I need to run this sample code?&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 14 Apr 2025 18:23:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/python-udf-support-in-unity-catalog-and-runtime-13-3/m-p/115435#M45083</guid>
      <dc:creator>AndrewBeck</dc:creator>
      <dc:date>2025-04-14T18:23:49Z</dc:date>
    </item>
    <item>
      <title>Re: Python UDF support in Unity Catalog and runtime 13.3?</title>
      <link>https://community.databricks.com/t5/data-engineering/python-udf-support-in-unity-catalog-and-runtime-13-3/m-p/115449#M45085</link>
      <description>&lt;P class="p1"&gt;Great question — and yeah, what you’re seeing &lt;I&gt;is&lt;/I&gt; a bit of a confusing experience that trips up a lot of folks working with Unity Catalog (UC). Let’s break it down:&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;&lt;span class="lia-unicode-emoji" title=":white_heavy_check_mark:"&gt;✅&lt;/span&gt; What’s Working for You&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql.types import LongType
def squared_typed(s):
    return s * s
spark.udf.register("squaredWithPython", squared_typed, LongType())&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;This works because you’re using a &lt;SPAN class="s1"&gt;&lt;STRONG&gt;SQL-style Python UDF&lt;/STRONG&gt;&lt;/SPAN&gt; registered directly via &lt;SPAN class="s2"&gt;spark.udf.register&lt;/SPAN&gt;, which executes outside the context of a DataFrame transformation. This approach is currently &lt;SPAN class="s1"&gt;&lt;STRONG&gt;supported&lt;/STRONG&gt;&lt;/SPAN&gt; in Unity Catalog.&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;&lt;span class="lia-unicode-emoji" title=":cross_mark:"&gt;❌&lt;/span&gt; What’s Failing&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql.functions import udf
from pyspark.sql.types import LongType
# Assumed to be the plain function defined earlier on the docs page
def squared(s):
    return s * s
squared_udf = udf(squared, LongType())
df = spark.table("test")
display(df.select("id", squared_udf("id").alias("id_squared")))&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;This version creates a &lt;SPAN class="s1"&gt;&lt;STRONG&gt;Python UDF as a Catalyst expression&lt;/STRONG&gt;&lt;/SPAN&gt; (i.e., it gets embedded into the logical plan of the query). Unity Catalog currently &lt;SPAN class="s1"&gt;&lt;STRONG&gt;does not support&lt;/STRONG&gt;&lt;/SPAN&gt; this style of Python UDF — even though you’re on a supported runtime (13.3 LTS+), UC adds additional restrictions for security and governance reasons.&lt;/P&gt;
&lt;P class="p1"&gt;That error:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;AnalysisException: [UC_COMMAND_NOT_SUPPORTED.WITHOUT_RECOMMENDATION] ...&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;is a clear indicator that the &lt;SPAN class="s1"&gt;&lt;STRONG&gt;execution path of a DataFrame with embedded Python UDFs&lt;/STRONG&gt;&lt;/SPAN&gt; is not allowed under Unity Catalog at the moment.&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;🧠 The Core Issue: Unity Catalog Restrictions&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P class="p1"&gt;Unity Catalog is much stricter than the older Hive Metastore when it comes to execution context — particularly with arbitrary Python execution, which can violate the isolation/security model UC is enforcing. Python UDFs embedded inside DataFrames can execute Python code on the worker nodes in ways that UC doesn’t yet support.&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;&lt;span class="lia-unicode-emoji" title=":white_heavy_check_mark:"&gt;✅&lt;/span&gt; Workarounds&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P class="p1"&gt;Here’s what you can do:&lt;/P&gt;
&lt;OL start="1"&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Use SQL-style UDFs via spark.udf.register(...)&lt;/STRONG&gt;&lt;SPAN class="s1"&gt; (like you did).&lt;/SPAN&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Use SQL functions or Spark native functions whenever possible.&lt;/STRONG&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;For more complex logic, consider &lt;SPAN class="s1"&gt;&lt;STRONG&gt;Pandas UDFs&lt;/STRONG&gt;&lt;/SPAN&gt;, which have better support (but still limited under UC)&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
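&lt;P class="p1"&gt;For the squaring example, the second option could look roughly like the sketch below, which swaps the Python UDF for a native SQL expression passed to selectExpr. The helper names squared_expr and add_squared are made up for illustration, and df is assumed to be a Spark DataFrame with a numeric id column.&lt;/P&gt;

```python
def squared_expr(column):
    # Builds a Spark SQL expression string such as "id * id AS id_squared".
    return f"{column} * {column} AS {column}_squared"


def add_squared(df, column="id"):
    # `df` is assumed to be a Spark DataFrame; selectExpr evaluates the
    # expression natively, so no Python UDF ends up in the query plan.
    return df.selectExpr(column, squared_expr(column))
```

&lt;P class="p1"&gt;In a notebook this would be called as display(add_squared(spark.table("test"))).&lt;/P&gt;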
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;&lt;span class="lia-unicode-emoji" title=":magnifying_glass_tilted_left:"&gt;🔍&lt;/span&gt; TL;DR&lt;/STRONG&gt;&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;You’re not doing anything wrong — it’s a known limitation of Unity Catalog.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Python UDFs in DataFrame operations are not supported under Unity Catalog&lt;/STRONG&gt;&lt;SPAN class="s1"&gt; (even on Runtime 13.3 LTS+).&lt;/SPAN&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;Stick with &lt;SPAN class="s1"&gt;spark.udf.register(...)&lt;/SPAN&gt; or refactor to native Spark logic if you’re in UC.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Hope this helps. Louis.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Apr 2025 01:34:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/python-udf-support-in-unity-catalog-and-runtime-13-3/m-p/115449#M45085</guid>
      <dc:creator>Louis_Frolio</dc:creator>
      <dc:date>2025-04-15T01:34:04Z</dc:date>
    </item>
  </channel>
</rss>

