12-06-2024 01:41 PM
When you run a Python function in a Python cell, it executes in the local Python environment of the notebook. However, when you call a Python function from a SQL cell, it runs as a UDF within the Spark execution environment.
To use the function in SQL cells, you need to define it as a UDF explicitly. That means wrapping it with pyspark.sql.functions.udf (as a decorator or as a direct call) and then registering the result with spark.udf.register so that SQL can resolve it by name:
# Register the UDF with Spark
spark.udf.register("get_useragent_string", get_useragent_string_udf)
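The registration line above assumes get_useragent_string_udf already exists. Here is a minimal sketch of how the pieces might fit together. The body of get_useragent_string and its parameter names (key, port, ip, ts) are hypothetical placeholders, since the real function is not shown in this thread, and the StringType return type is an assumption as well:

```python
def get_useragent_string(key, port, ip, ts):
    # Hypothetical placeholder body: the real lookup logic is not shown
    # in this thread, so this just joins the inputs into one string.
    return f"{key}|{port}|{ip}|{ts}"


def register_useragent_udf(spark):
    """Wrap the plain function as a UDF and register it for SQL cells."""
    # Imported here so the plain function above stays usable without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    # Assumed return type; adjust it if the real function returns something else.
    get_useragent_string_udf = udf(get_useragent_string, StringType())

    # Registering under a name is what lets %sql cells resolve the function.
    spark.udf.register("get_useragent_string", get_useragent_string_udf)
    return get_useragent_string_udf
```

In a notebook you would call register_useragent_udf(spark) once in a Python cell before running the SQL cell. The plain function can still be called directly from Python, which is handy for checking its output before it runs as a UDF.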
After registering the function as a UDF, you can call it from a SQL cell:
%sql
SELECT get_useragent_string('xxx', 49738, '104.16.184.241', 1730235533)