Great question — and yeah, what you’re seeing is a bit of a confusing experience that trips up a lot of folks working with Unity Catalog (UC). Let’s break it down:
✅ What’s Working for You
from pyspark.sql.types import LongType

def squared_typed(s):
    return s * s

spark.udf.register("squaredWithPython", squared_typed, LongType())
This works because you’re using a SQL-style Python UDF registered directly via spark.udf.register, which executes outside the context of a DataFrame transformation. This approach is currently supported in Unity Catalog.
❌ What’s Failing
from pyspark.sql.functions import udf
from pyspark.sql.types import LongType

def squared(s):
    return s * s

squared_udf = udf(squared, LongType())
df = spark.table("test")
display(df.select("id", squared_udf("id").alias("id_squared")))
This version creates a Python UDF as a Catalyst expression (i.e., it gets embedded into the logical plan of the query). Unity Catalog currently does not support this style of Python UDF — even though you’re on a supported runtime (13.3 LTS+), UC adds additional restrictions for security and governance reasons.
That error:
AnalysisException: [UC_COMMAND_NOT_SUPPORTED.WITHOUT_RECOMMENDATION] ...
is a clear indicator that the execution path of a DataFrame with embedded Python UDFs is not allowed under Unity Catalog at the moment.
🧠 The Core Issue: Unity Catalog Restrictions
Unity Catalog is much stricter than the older Hive Metastore when it comes to execution context — particularly with arbitrary Python execution, which can violate the isolation/security model UC is enforcing. Python UDFs embedded inside DataFrames can execute Python code on the worker nodes in ways that UC doesn’t yet support.
✅ Workarounds
Here’s what you can do:
- Use SQL-style UDFs via spark.udf.register(...) (like you did).
- Use SQL functions or Spark native functions whenever possible.
- For more complex logic, consider Pandas UDFs, which have better support (but are still limited under UC).
🔍 TL;DR
- You're not doing anything wrong: it's a known limitation of Unity Catalog.
- Python UDFs in DataFrame operations are not supported under Unity Catalog (even on Runtime 13.3 LTS+).
- Stick with spark.udf.register(...) or refactor to native Spark logic if you're in UC.
Hope this helps. Louis.