I have the following command, which runs in my Databricks notebook:
spark.conf.get("spark.databricks.clusterUsageTags.managedResourceGroup")
I have wrapped this command in a function (simplified):
def get_info():
    return spark.conf.get("spark.databricks.clusterUsageTags.managedResourceGroup")
I have then moved this function into a .py module, which I install as a private package in my workspace environment. I can import the function and call it.
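For context, the module looks roughly like this (the package and file names here are made up for illustration):

# my_package/cluster_info.py
def get_info():
    # note: `spark` is never defined or imported anywhere in this file
    return spark.conf.get("spark.databricks.clusterUsageTags.managedResourceGroup")

and in the notebook:

from my_package.cluster_info import get_info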
However, when I call the imported function, I get an error:
get_info()
>>> NameError: name 'spark' is not defined
If I define the same function in the body of the notebook, I can run it without problems.
- Why does moving this function into a separate module force me to make spark available explicitly? What is the proper way to create a separate module with Spark functions, and how should I import them? (A sketch of the kind of thing I mean is below.)
- If possible: what is happening under the hood that makes it work when I define the function in the notebook, but not when I import it?
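For illustration only: I assume a variant like the one below, which takes the session explicitly instead of relying on a global spark, would avoid the NameError, but I don't know whether that (or something like SparkSession.getActiveSession()) is the recommended pattern for a shared module:

from pyspark.sql import SparkSession

def get_info(spark: SparkSession) -> str:
    # the caller (e.g. the notebook) supplies the session explicitly
    return spark.conf.get("spark.databricks.clusterUsageTags.managedResourceGroup")

# in the notebook, where spark already exists as a global
get_info(spark)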