<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: URGENT HELP NEEDED: Python functions deployed in the cluster throwing the error in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/urgent-help-needed-python-functions-deployed-in-the-cluster/m-p/2681#M12</link>
    <description>&lt;P&gt;Hi @Rajaniesh Kaushikk,&lt;/P&gt;&lt;P&gt;Great to meet you, and thanks for your question!&lt;/P&gt;&lt;P&gt;Let's see if your peers in the community have an answer to your question. Thanks.&lt;/P&gt;</description>
    <pubDate>Fri, 23 Jun 2023 07:21:32 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2023-06-23T07:21:32Z</dc:date>
    <item>
      <title>URGENT HELP NEEDED: Python functions deployed in the cluster throwing the error</title>
      <link>https://community.databricks.com/t5/data-engineering/urgent-help-needed-python-functions-deployed-in-the-cluster/m-p/2680#M11</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have created a Python wheel with the following code. The package name is rule_engine.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;"""&lt;/P&gt;&lt;P&gt;The entry point of the Python Wheel&lt;/P&gt;&lt;P&gt;"""&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;import sys&lt;/P&gt;&lt;P&gt;from pyspark.sql.functions import expr, col&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;def get_rules(tag):&lt;/P&gt;&lt;P&gt;&amp;nbsp;"""&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;loads data quality rules from a table&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;:param tag: tag to match&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;:return: dictionary of rules that matched the tag&lt;/P&gt;&lt;P&gt;&amp;nbsp;"""&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;rules = {}&lt;/P&gt;&lt;P&gt;&amp;nbsp;df = spark.read.table("rules")&lt;/P&gt;&lt;P&gt;&amp;nbsp;for row in df.filter(col("tag") == tag).collect():&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;rules[row['name']] = row['constraint']&lt;/P&gt;&lt;P&gt;&amp;nbsp;return rules&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;def get_quarantine_rules(tag):&lt;/P&gt;&lt;P&gt;&amp;nbsp;"""&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;loads data quality rules from a table&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;:param tag: tag to match&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;:return: dictionary of rules that matched the tag&lt;/P&gt;&lt;P&gt;&amp;nbsp;"""&lt;/P&gt;&lt;P&gt;&amp;nbsp;all_rules_in_tags=get_rules(tag)&lt;/P&gt;&lt;P&gt;&amp;nbsp;qurantine_rule="NOT({0})".format(" AND ".join(all_rules_in_tags.values()))&lt;/P&gt;&lt;P&gt;&amp;nbsp;return qurantine_rule&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;After installing the wheel on a Databricks cluster, I import it so that I can call the functions defined in it:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;import rule_engine
&lt;/P&gt;&lt;P&gt;rule_dict=rule_engine.get_quarantine_rules("maintained")&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;It throws this error:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;NameError                                 Traceback (most recent call last)&lt;/P&gt;&lt;P&gt;&amp;lt;command-502204870200978&amp;gt; in &amp;lt;cell line: 2&amp;gt;()&lt;/P&gt;&lt;P&gt;      1 import rule_engine&lt;/P&gt;&lt;P&gt;----&amp;gt; 2 rule_dict=rule_engine.get_quarantine_rules("maintained")&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/rule_engine/functions.py in get_quarantine_rules(tag)&lt;/P&gt;&lt;P&gt;     27     :return: dictionary of rules that matched the tag&lt;/P&gt;&lt;P&gt;     28   """&lt;/P&gt;&lt;P&gt;---&amp;gt; 29   all_rules_in_tags=get_rules(tag)&lt;/P&gt;&lt;P&gt;     30   qurantine_rule="NOT({0})".format(" AND ".join(all_rules_in_tags.values()))&lt;/P&gt;&lt;P&gt;     31   return qurantine_rule&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/rule_engine/functions.py in get_rules(tag)&lt;/P&gt;&lt;P&gt;     15   """&lt;/P&gt;&lt;P&gt;     16   rules = {}&lt;/P&gt;&lt;P&gt;---&amp;gt; 17   df = spark.read.table("rules")&lt;/P&gt;&lt;P&gt;     18   for row in df.filter(col("tag") == tag).collect():&lt;/P&gt;&lt;P&gt;     19     rules[row['name']] = row['constraint']&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;NameError: name 'spark' is not defined&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Rajaniesh&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jun 2023 04:45:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/urgent-help-needed-python-functions-deployed-in-the-cluster/m-p/2680#M11</guid>
      <dc:creator>Rajaniesh</dc:creator>
      <dc:date>2023-06-23T04:45:40Z</dc:date>
    </item>
    <item>
      <title>Re: URGENT HELP NEEDED: Python functions deployed in the cluster throwing the error</title>
      <link>https://community.databricks.com/t5/data-engineering/urgent-help-needed-python-functions-deployed-in-the-cluster/m-p/2681#M12</link>
      <description>&lt;P&gt;Hi @Rajaniesh Kaushikk,&lt;/P&gt;&lt;P&gt;Great to meet you, and thanks for your question!&lt;/P&gt;&lt;P&gt;Let's see if your peers in the community have an answer to your question. Thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jun 2023 07:21:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/urgent-help-needed-python-functions-deployed-in-the-cluster/m-p/2681#M12</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-06-23T07:21:32Z</dc:date>
    </item>
    <item>
      <title>Re: URGENT HELP NEEDED: Python functions deployed in the cluster throwing the error</title>
      <link>https://community.databricks.com/t5/data-engineering/urgent-help-needed-python-functions-deployed-in-the-cluster/m-p/55157#M30248</link>
      <description>&lt;P&gt;You can find more details and examples here: &lt;A href="https://docs.databricks.com/en/workflows/jobs/how-to/use-python-wheels-in-workflows.html#use-a-python-wheel-in-a-databricks-job" target="_blank"&gt;https://docs.databricks.com/en/workflows/jobs/how-to/use-python-wheels-in-workflows.html#use-a-python-wheel-in-a-databricks-job&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Dec 2023 19:25:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/urgent-help-needed-python-functions-deployed-in-the-cluster/m-p/55157#M30248</guid>
      <dc:creator>jose_gonzalez</dc:creator>
      <dc:date>2023-12-12T19:25:09Z</dc:date>
    </item>
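    <!--
    The NameError in the question most likely arises because `spark` is a variable that Databricks
    defines in the notebook's own namespace; a module imported from an installed wheel has a separate
    global namespace and does not see it. The following is a minimal sketch of one workaround, not an
    answer confirmed in this thread. Assumptions: the `spark` parameter and the `get_spark` helper are
    hypothetical additions to the rule_engine package, and the filter uses df["tag"] in place of
    col("tag") so that the module needs no PySpark import at load time.
    SparkSession.builder.getOrCreate() is the standard PySpark call that returns the already running
    session if there is one.

```python
def get_spark(spark=None):
    """Return the session passed in, or look up the active one."""
    if spark is not None:
        return spark
    # Imported lazily so the module itself loads without PySpark present.
    from pyspark.sql import SparkSession
    # On a Databricks cluster this returns the existing session.
    return SparkSession.builder.getOrCreate()


def get_rules(tag, spark=None):
    """Load the rules whose tag matches, as a {name: constraint} dict."""
    session = get_spark(spark)
    df = session.read.table("rules")
    rows = df.filter(df["tag"] == tag).collect()
    return {row["name"]: row["constraint"] for row in rows}


def get_quarantine_rules(tag, spark=None):
    """Build one SQL expression matching rows that fail any rule."""
    rules = get_rules(tag, spark)
    return "NOT({0})".format(" AND ".join(rules.values()))
```

    In a notebook, rule_engine.get_quarantine_rules("maintained", spark) would pass the notebook's
    session explicitly; leaving the argument out falls back to getOrCreate().
    -->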
  </channel>
</rss>

