<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic UDF function while registering- PicklingError in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/udf-function-while-registering-picklingerror/m-p/15760#M10036</link>
    <description>&lt;P&gt;PicklingError: Could not serialize object: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers.&lt;/P&gt;&lt;P&gt;I am trying to write a function in Azure Databricks, and I would like to call spark.sql inside it. However, it looks like spark.sql cannot be used in code that runs on the worker nodes.&lt;/P&gt;&lt;PRE&gt;def SEL_ID(value, index):
    # some processing on value here
    ans = spark.sql("SELECT id FROM table WHERE bin = index")
    return ans
spark.udf.register("SEL_ID", SEL_ID)&lt;/PRE&gt;&lt;P&gt;I am getting the following error:&lt;/P&gt;&lt;P&gt;PicklingError: Could not serialize object: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.&lt;/P&gt;&lt;P&gt;Is there any way I can work around this? I use the function above to select from another table.&lt;/P&gt;</description>
    <pubDate>Tue, 20 Dec 2022 12:30:31 GMT</pubDate>
    <dc:creator>databricks_amit</dc:creator>
    <dc:date>2022-12-20T12:30:31Z</dc:date>
    <item>
      <title>UDF function while registering- PicklingError</title>
      <link>https://community.databricks.com/t5/data-engineering/udf-function-while-registering-picklingerror/m-p/15760#M10036</link>
      <description>&lt;P&gt;PicklingError: Could not serialize object: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers.&lt;/P&gt;&lt;P&gt;I am trying to write a function in Azure Databricks, and I would like to call spark.sql inside it. However, it looks like spark.sql cannot be used in code that runs on the worker nodes.&lt;/P&gt;&lt;PRE&gt;def SEL_ID(value, index):
    # some processing on value here
    ans = spark.sql("SELECT id FROM table WHERE bin = index")
    return ans
spark.udf.register("SEL_ID", SEL_ID)&lt;/PRE&gt;&lt;P&gt;I am getting the following error:&lt;/P&gt;&lt;P&gt;PicklingError: Could not serialize object: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.&lt;/P&gt;&lt;P&gt;Is there any way I can work around this? I use the function above to select from another table.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Dec 2022 12:30:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/udf-function-while-registering-picklingerror/m-p/15760#M10036</guid>
      <dc:creator>databricks_amit</dc:creator>
      <dc:date>2022-12-20T12:30:31Z</dc:date>
    </item>
  </channel>
</rss>

