Great question — and yeah, what you’re seeing is a bit of a confusing experience that trips up a lot of folks working with Unity Catalog (UC). Let’s break it down:
✅ What’s Working for You
from pyspark.sql.types import LongType

def squared_typed(s):
    return s * s

spark.udf.register("squaredWithPython", squared_typed, LongType())
This works because you’re using a SQL-style Python UDF registered directly via spark.udf.register, which executes outside the context of a DataFrame transformation. This approach is currently supported in Unity Catalog.
❌ What’s Failing
from pyspark.sql.functions import udf
from pyspark.sql.types import LongType

def squared(s):
    return s * s

squared_udf = udf(squared, LongType())
df = spark.table("test")
display(df.select("id", squared_udf("id").alias("id_squared")))
This version creates a Python UDF as a Catalyst expression (i.e., it gets embedded into the logical plan of the query). Unity Catalog currently does not support this style of Python UDF — even though you’re on a supported runtime (13.3 LTS+), UC adds additional restrictions for security and governance reasons.
That error:
AnalysisException: [UC_COMMAND_NOT_SUPPORTED.WITHOUT_RECOMMENDATION] ...
is a clear indicator that the execution path of a DataFrame with embedded Python UDFs is not allowed under Unity Catalog at the moment.
🧠 The Core Issue: Unity Catalog Restrictions
Unity Catalog is much stricter than the older Hive Metastore when it comes to execution context — particularly with arbitrary Python execution, which can violate the isolation/security model UC is enforcing. Python UDFs embedded inside DataFrames can execute Python code on the worker nodes in ways that UC doesn’t yet support.
✅ Workarounds
Here’s what you can do:
- Use SQL-style UDFs via spark.udf.register(...) (like you did).
- Use SQL functions or Spark native functions whenever possible.
- For more complex logic, consider Pandas UDFs, which have better support (but are still limited under UC).
🔍 TL;DR
- You're not doing anything wrong: it's a known limitation of Unity Catalog.
- Python UDFs in DataFrame operations are not supported under Unity Catalog (even on Runtime 13.3 LTS+).
- Stick with spark.udf.register(...) or refactor to native Spark logic if you're in UC.
Hope this helps. Louis.