ModuleNotFoundError: No module named 'pyspark.dbutils'
09-25-2023 03:10 PM - edited 09-25-2023 03:23 PM
I have a class in a Python file like this, and importing it fails with the error in the title:
from pyspark.sql import SparkSession
from pyspark.dbutils import DBUtils

class DatabricksUtils:
    def __init__(self):
        self.spark = SparkSession.getActiveSession()
        self.dbutils = DBUtils(self.spark)

    def get_dbutils(self) -> DBUtils:
        return self.dbutils
09-28-2023 11:58 AM
Hey @vk217 ,
Which Databricks Runtime version was the cluster on when you ran the code? I was able to run your code successfully on both a 12.2 and a 13.3 cluster.
Can you try running it on a cluster with one of those DBR versions, if you haven't already? And please let us know if you're still running into issues.
Best,
Miguel
11-16-2023 04:43 PM
We are trying to do something similar. We use dbutils to read from a secret scope, and our unit tests run in an Azure pipeline, where they fail with the error that pyspark.dbutils is not found.
We tried databricks.sdk.dbutils instead, but I got an authentication error along the lines of "value not found". I also tried the databricks-connect library, but still hit the same issue.
Can you please help with it?
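For the unit-testing case described above, one possible workaround is to defer the pyspark.dbutils import until it is actually needed and let tests pass in a stand-in object. This is only a sketch under assumptions: FakeSecrets and FakeDBUtils are hypothetical test doubles (not part of any Databricks library), the scope and key names are made up, and the only real API mirrored here is dbutils.secrets.get(scope=..., key=...).

```python
from typing import Any, Dict, Tuple


class FakeSecrets:
    """Hypothetical in-memory stand-in for dbutils.secrets (tests only)."""

    def __init__(self, values: Dict[Tuple[str, str], str]):
        self._values = values

    def get(self, scope: str, key: str) -> str:
        return self._values[(scope, key)]


class FakeDBUtils:
    """Hypothetical stand-in exposing only the .secrets attribute."""

    def __init__(self, values: Dict[Tuple[str, str], str]):
        self.secrets = FakeSecrets(values)


class DatabricksUtils:
    def __init__(self, spark: Any = None, dbutils: Any = None):
        # Both dependencies can be injected, so importing this module
        # never touches pyspark.dbutils.
        self.spark = spark
        self._dbutils = dbutils

    def get_dbutils(self) -> Any:
        if self._dbutils is None:
            # Deferred imports: only reached when nothing was injected,
            # i.e. when running on a cluster or with databricks-connect.
            from pyspark.sql import SparkSession
            from pyspark.dbutils import DBUtils

            if self.spark is None:
                self.spark = SparkSession.getActiveSession()
            self._dbutils = DBUtils(self.spark)
        return self._dbutils


# In a unit test: inject the fake, no Databricks environment required.
utils = DatabricksUtils(dbutils=FakeDBUtils({("my-scope", "my-key"): "s3cr3t"}))
secret = utils.get_dbutils().secrets.get(scope="my-scope", key="my-key")
```

With this shape the pipeline only needs the real module when the code runs against Databricks. It does not address the databricks.sdk authentication error mentioned above.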
05-29-2024 08:35 PM
Hi, we are in exactly the same situation. Were you able to solve the problem, or find a workaround?
02-26-2025 04:26 AM - edited 02-26-2025 04:27 AM
Had the same problem in my GitLab CI/CD pipeline while trying to deploy:
$ databricks bundle deploy -t dev
Building package...
Error: build failed package, error: exit status 1, output: Traceback (most recent call last):
[...]
File "/builds/user/package/./src/package/main.py", line 2, in <module>
from pyspark.dbutils import DBUtils
ModuleNotFoundError: No module named 'pyspark.dbutils'
Solved it by adding the following entries to requirements.txt:
- ipykernel>=6.29.4
- nbformat>=5.10.4
- databricks-connect>=13.1.0
as seen here: https://github.com/databricks/dais-cow-bff/blob/dais24-main/requirements.txt
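To confirm whether a given environment (for example the CI image) can resolve the module before running the deploy, a small check like the following may help. It is a sketch only: module_available is a hypothetical helper name, and it uses nothing beyond the standard library's importlib.util.find_spec. My assumption is that the fix above works because databricks-connect ships a pyspark build that includes pyspark.dbutils, whereas plain open-source pyspark does not.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be resolved in the current environment."""
    try:
        # For dotted names, find_spec imports the parent package first and
        # raises ModuleNotFoundError if that parent is missing.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


for name in ("pyspark", "pyspark.dbutils"):
    status = "found" if module_available(name) else "missing"
    print(f"{name}: {status}")
```

If pyspark is found but pyspark.dbutils is missing, the environment most likely has the open-source pyspark package installed rather than the one that comes with databricks-connect.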