Great question, and yeah, what you're seeing is a confusing experience that trips up a lot of folks working with Unity Catalog (UC). Let's break it down:
What's Working for You
from pyspark.sql.types import LongType
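def squared_typed(s):
    # Plain Python function: returns the square of its input
    return s * s
# Register the function under a SQL-callable name with a LongType return type
spark.udf.register("squaredWithPython", squared_typed, LongType())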
This works because you're using a SQL-style Python UDF registered directly via spark.udf.register, which executes outside the context of a DataFrame transformation. This approach is currently supported in Unity Catalog.
What's Failing
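Once registered, the UDF is called by name from SQL rather than wrapped around a DataFrame column. Here's a minimal sketch of that call, assuming the registration above has already run and reusing the `test` table and `id` column from your failing example. The helper name `select_id_squared` is hypothetical, just for illustration:

```python
# Hypothetical helper: assumes `squaredWithPython` was registered as shown
# above and that a table named `test` with a numeric `id` column exists.
def select_id_squared(spark, table_name="test"):
    """Call the registered UDF by name from a SQL query."""
    query = f"SELECT id, squaredWithPython(id) AS id_squared FROM {table_name}"
    return spark.sql(query)
```

In a notebook you would pass the session in, for example display(select_id_squared(spark)).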
from pyspark.sql.functions import udf
from pyspark.sql.types import LongType
# `squared` is assumed to be a plain Python function defined earlier (same body as squared_typed)
squared_udf = udf(squared, LongType())
df = spark.table("test")
display(df.select("id", squared_udf("id").alias("id_squared")))
This version creates a Python UDF as a Catalyst expression (i.e., it gets embedded into the logical plan of the query). Unity Catalog currently does not support this style of Python UDF. Even though you're on a supported runtime (13.3 LTS+), UC adds additional restrictions for security and governance reasons.
That error:
AnalysisException: [UC_COMMAND_NOT_SUPPORTED.WITHOUT_RECOMMENDATION] ...
is a clear indicator that the execution path of a DataFrame with embedded Python UDFs is not allowed under Unity Catalog at the moment.
The Core Issue: Unity Catalog Restrictions
Unity Catalog is much stricter than the older Hive Metastore when it comes to execution context, particularly with arbitrary Python execution, which can violate the isolation/security model UC is enforcing. Python UDFs embedded inside DataFrames can execute Python code on the worker nodes in ways that UC doesn't yet support.
Workarounds
Here's what you can do:
- Use SQL-style UDFs via spark.udf.register(...) (like you did).
- Use SQL functions or Spark native functions whenever possible.
- For more complex logic, consider Pandas UDFs, which have better support (but are still limited under UC).
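For the squaring case specifically, the native route avoids Python UDFs entirely. A minimal sketch, assuming the DataFrame has a numeric `id` column as in your example (the helper name `with_id_squared` is hypothetical):

```python
# Hypothetical helper: assumes `df` is a Spark DataFrame with a numeric `id` column.
def with_id_squared(df):
    """Square `id` with a built-in SQL expression, so no Python UDF is involved."""
    return df.selectExpr("id", "id * id AS id_squared")
```

You would call it as display(with_id_squared(spark.table("test"))). Since the expression is evaluated by Spark itself, the restriction on embedded Python UDFs should not come into play.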
TL;DR
- You're not doing anything wrong; it's a known limitation of Unity Catalog.
- Python UDFs in DataFrame operations are not supported under Unity Catalog (even on Runtime 13.3 LTS+).
- Stick with spark.udf.register(...) or refactor to native Spark logic if you're in UC.
Hope this helps. Louis.