Re: org.apache.spark.SparkRuntimeException: [UDF_U...

GiriSreerangam · ‎10-29-2025

Hi Everyone

I am writing a small function that reads a CSV with Spark and writes it into a table. I can execute this function within a notebook. But when I register the same function as a Unity Catalog function and call it from Playground, it throws a Spark exception. Can someone tell me what I am missing?

Code for reference:

error: == Error ==
SystemExit: -1
== Stacktrace ==
File "<udfbody>", line 28, in main
return ingest_csv(csv_path, table_name)
File "<udfbody>", line 14, in ingest_csv
spark = SparkSession.builder.getOrCreate()
File "/databricks/spark/python/pyspark/sql/session.py", line 574, in getOrCreate
else SparkContext.getOrCreate(sparkConf)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/databricks/spark/python/pyspark/core/context.py", line 579, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/databricks/spark/python/pyspark/core/context.py", line 207, in __init__
SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
File "/databricks/spark/python/pyspark/core/context.py", line 500, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway(conf)
^^^^^^^^^^^^^^^^^^^^
File "/databricks/spark/python/pyspark/java_gateway.py", line 63, in launch_gateway
SPARK_HOME = _find_spark_home()
^^^^^^^^^^^^^^^^^^
File "/databricks/spark/python/pyspark/find_spark_home.py", line 91, in _find_spark_home
sys.exit(-1) SQLSTATE: 39000

Any help here would be greatly appreciated.

Thank you

Regards, Giri

KaushalVachhani · ‎10-30-2025

Hi @GiriSreerangam, you cannot use a Unity Catalog user-defined function (UDF) in Databricks to read a CSV with Spark and write it to a table. Unity Catalog Python UDFs execute in a secure, isolated environment without access to the file system or the Spark context. That is why your call fails: SparkSession.builder.getOrCreate() tries to launch a new Spark gateway inside the UDF sandbox, cannot find SPARK_HOME, and exits with sys.exit(-1), as your stack trace shows.

They are intended for value transformations and return a single (scalar) value. They cannot perform data read/write operations or return DataFrames.
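For contrast, here is a minimal sketch of the kind of logic that does fit a Unity Catalog Python UDF: plain Python that maps input values to one output value, with no SparkSession and no file access. The function name mask_email and its behavior are hypothetical, chosen only for illustration. The body of such a function is what you would place inside a CREATE FUNCTION ... LANGUAGE PYTHON definition.

```python
def mask_email(email: str) -> str:
    """Hypothetical scalar transformation: hide most of the local part of an email.

    This is plain Python with no Spark session and no file I/O, which is the
    shape of logic a scalar UDF is meant to hold.
    """
    local, sep, domain = email.partition("@")
    if not sep:
        # Not an email-like value; return it unchanged.
        return email
    # Keep the first character of the local part and mask the rest.
    return local[:1] + "***" + "@" + domain


if __name__ == "__main__":
    print(mask_email("giri@example.com"))
```

Anything that needs spark.read or a table write belongs outside the UDF, for example in a job.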

A recommended approach is to register your Spark read/write logic notebook as a Databricks Job, which can then be triggered either from the Databricks UI or programmatically via the Jobs API. If you wish to integrate this with an AI tool, you can encapsulate the Jobs API as a “function tool” to automate these workflows when needed.
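As a rough sketch of the programmatic route, the snippet below builds and sends a Jobs API run-now request using only the Python standard library. It assumes the job already exists and that the notebook reads two parameters named csv_path and table_name. Those parameter names, the helper function names, and the host, token, and job ID values are all placeholders, not taken from your setup.

```python
import json
import urllib.request


def build_run_now_request(host, token, job_id, csv_path, table_name):
    """Build (but do not send) a POST request to the Jobs API run-now endpoint.

    Assumption: the job's notebook reads parameters named 'csv_path' and
    'table_name'. Rename them to match your notebook.
    """
    payload = {
        "job_id": job_id,
        "notebook_params": {"csv_path": csv_path, "table_name": table_name},
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def trigger_ingest_job(host, token, job_id, csv_path, table_name):
    """Send the request and return the run_id of the triggered job run."""
    request = build_run_now_request(host, token, job_id, csv_path, table_name)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["run_id"]
```

An agent tool could then wrap trigger_ingest_job, so the Spark read and write run on job compute rather than inside the UDF sandbox.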


GiriSreerangam · ‎10-30-2025

Thank you, @KaushalVachhani. I want this as an AI tool. I actually created this logic as a Databricks job and used it as a tool in an agent, but it did not work as expected. Let me explore this further.

org.apache.spark.SparkRuntimeException: [UDF_USER_CODE_ERROR.GENERIC]