URGENT HELP NEEDED: Python functions deployed in the cluster throwing an error

Rajaniesh
New Contributor III

Hi,

I have created a Python wheel containing the following code; the package name is rule_engine:

"""

The entry point of the Python Wheel

"""

import sys

from pyspark.sql.functions import expr, col

def get_rules(tag):

 """

  loads data quality rules from a table

  :param tag: tag to match

  :return: dictionary of rules that matched the tag

 """

  

 rules = {}

 df = spark.read.table("rules")

 for row in df.filter(col("tag") == tag).collect():

  rules[row['name']] = row['constraint']

 return rules

def get_quarantine_rules(tag):

 """

  loads data quality rules from a table

  :param tag: tag to match

  :return: dictionary of rules that matched the tag

 """

 all_rules_in_tags=get_rules(tag)

 qurantine_rule="NOT({0})".format(" AND ".join(all_rules_in_tags.values()))

 return qurantine_rule
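For reference, the quarantine-expression construction can be exercised without a cluster. A minimal stand-alone sketch, with a hard-coded rules dictionary standing in for the "rules" table (the rule names and constraints below are illustrative):

```python
def build_quarantine_rule(rules):
    # Negate the conjunction of all constraints: a row is quarantined
    # when it violates at least one rule.
    return "NOT({0})".format(" AND ".join(rules.values()))

# Two hypothetical constraints, as they might come back from get_rules():
rules = {
    "valid_id": "id IS NOT NULL",
    "valid_count": "count > 0",
}
print(build_quarantine_rule(rules))
# NOT(id IS NOT NULL AND count > 0)
```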

After installing it on a Databricks cluster, I import it and call the function defined in it:

import rule_engine

rule_dict = rule_engine.get_quarantine_rules("maintained")

It throws this error:

NameError                                 Traceback (most recent call last)
<command-502204870200978> in <cell line: 2>()
      1 import rule_engine
----> 2 rule_dict=rule_engine.get_quarantine_rules("maintained")

/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/rule_engine/functions.py in get_quarantine_rules(tag)
     27     :return: dictionary of rules that matched the tag
     28     """
---> 29     all_rules_in_tags=get_rules(tag)
     30     qurantine_rule="NOT({0})".format(" AND ".join(all_rules_in_tags.values()))
     31     return qurantine_rule

/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/rule_engine/functions.py in get_rules(tag)
     15     """
     16     rules = {}
---> 17     df = spark.read.table("rules")
     18     for row in df.filter(col("tag") == tag).collect():
     19         rules[row['name']] = row['constraint']

NameError: name 'spark' is not defined

Regards

Rajaniesh


Anonymous
Not applicable

Hi @Rajaniesh Kaushikk,

Great to meet you, and thanks for your question!

Let's see if your peers in the community have an answer to your question. Thanks.

Kaniz
Community Manager

Hi @Rajaniesh, it seems you’re encountering a NameError related to the ‘spark’ object.

Let’s address this issue.

In PySpark, the ‘spark’ object is a SparkSession that is created automatically in the Spark shell, the PySpark shell, and Databricks notebooks. However, a module installed as a wheel does not inherit the notebook’s globals, so inside your .py file you need to obtain the SparkSession explicitly using the builder.
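The error itself is plain Python name resolution: a function looks up unqualified names in its own module’s global namespace, not in the notebook’s. A minimal reproduction without Spark (the names here are only a stand-in for the wheel’s code):

```python
# A stand-in for the wheel's module: 'spark' is never defined here.
def use_spark():
    # The name 'spark' is resolved in this module's globals at call time,
    # so calling this without defining 'spark' raises a NameError.
    return spark.read

try:
    use_spark()
except NameError as e:
    print(e)  # name 'spark' is not defined
```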

Here’s how you can resolve the error:

  1. Import the necessary modules:

    • Make sure you have imported the required PySpark modules at the beginning of your script. You can add the following lines to your code:
      from pyspark.sql import SparkSession
      
  2. Create the SparkSession:

    • Explicitly create a SparkSession object using the builder. You can do this by adding the following lines before using the ‘spark’ object:
      spark = SparkSession.builder \
          .appName("YourAppName") \
          .getOrCreate()
      
  3. Check for other issues:

    • Ensure that you have installed PySpark and that your environment is set up correctly.
    • If you encounter a ‘No module named pyspark’ error, consider using the findspark library to set up the environment. You can install it using pip install findspark.

Here’s an example of how to create the SparkSession:

# Import PySpark and create a SparkSession
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("YourAppName") \
    .getOrCreate()

# Now you can use the 'spark' object in your code
# ...

# Stop the SparkSession when a standalone script is done.
# (On a Databricks cluster, getOrCreate() attaches to the cluster's
# existing session, which you should not stop.)
spark.stop()

Make sure to adjust the appName according to your application’s name. Once you’ve made these changes, your code should work without the ‘spark’ object error. Happy coding! 😊
