09-08-2022 06:31 PM
I have a main Databricks notebook that runs a handful of functions. In this notebook, I import a helper.py file from the same repo, and the import itself runs fine. Inside helper.py there's a function that uses the built-in dbutils. Back in my main notebook, when I call that helper function, I get an error: [NameError: name 'dbutils' is not defined]. How can I create a helper module that imports seamlessly and can use dbutils?
09-09-2022 06:49 AM
Looks like adding the appropriate imports to helper.py fixes it. dbutils is predefined in the notebook itself but not inside an imported module, so the module has to create its own:
from pyspark.dbutils import DBUtils
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
dbutils = DBUtils(spark)
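For anyone who wants helper.py to stay importable outside a notebook as well, here is a minimal sketch of the same idea wrapped in functions. The names get_dbutils and list_file_names are hypothetical, not part of any Databricks API. It assumes pyspark.dbutils is available on the cluster, as in the snippet above, and it lets the caller pass in the notebook's own dbutils instead of building a new one.

```python
# helper.py (sketch; function names are illustrative, not a Databricks API)

def get_dbutils(spark=None):
    """Build a dbutils handle from the Spark session, as in the post above."""
    # Imported lazily so that importing helper.py does not require pyspark
    # until a dbutils handle is actually requested.
    from pyspark.dbutils import DBUtils
    from pyspark.sql import SparkSession

    if spark is None:
        spark = SparkSession.builder.getOrCreate()
    return DBUtils(spark)


def list_file_names(path, dbutils=None):
    """Return the names of the entries under `path`.

    If the caller does not pass a dbutils object, fall back to get_dbutils().
    """
    if dbutils is None:
        dbutils = get_dbutils()
    # dbutils.fs.ls returns file-info objects that expose a `name` attribute.
    return [info.name for info in dbutils.fs.ls(path)]
```

From the main notebook, calling list_file_names(some_path, dbutils=dbutils) hands the notebook's existing dbutils to the helper. Omitting the argument falls back to the DBUtils(spark) approach shown above.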
09-13-2022 11:42 AM
So the above resolved the issue? Please let us know if you are still stuck. Thanks.
09-13-2022 11:43 AM
All set. Issue resolved.
08-31-2023 01:51 PM
This is a little off topic, but I'm trying to run a PySpark script in VS Code via Databricks Connect V2:
https://www.youtube.com/watch?v=AP5dGiCU188
When I do that, I get the error mjbobak describes about dbutils not being defined.
When I use mjbobak's code or the code Elisabetta shares on SO
https://stackoverflow.com/questions/50813493/nameerror-name-dbutils-is-not-defined-in-pyspark
the error goes away, but then I get a runtime error:
"No operations allowed on this path" in response to the following dbutils.fs.ls call:
theFiles = dbutils.fs.ls("/Volumes/myTestData/shawn_test/staging/inbound")
Is there a proper way to define/import dbutils when using Connect V2 to try to debug a PySpark file that is saved locally?
12-11-2022 07:51 AM
Hi,
I'm facing a similar issue when deploying via dbx.
I have a helper notebook that works fine when executed via jobs (without any includes),
but when I deploy it via dbx (to the same cluster), the helper notebook fails with:
dbutils.fs.ls(path)
NameError: name 'dbutils' is not defined
(In the main notebook, which calls the helper function notebook, I use dbutils.widgets, and it doesn't have any issue.)
(dbx execute my-task --task=silver --cluster-name="my-multi-cluster" builds a wheel and deploys it on the Databricks cluster.)
Adding the suggested includes doesn't resolve the issue.
Any advice?
Thanks,
Amir