ModuleNotFoundError: No module named 'pyspark.dbutils'

vk217
Contributor

I have a class in a Python file like this:

 

from pyspark.sql import SparkSession
from pyspark.dbutils import DBUtils


class DatabricksUtils:

    def __init__(self):
        # Reuse the Spark session that is already active in this process.
        self.spark = SparkSession.getActiveSession()
        self.dbutils = DBUtils(self.spark)

    def get_dbutils(self) -> DBUtils:
        return self.dbutils

 

 

In another Python file, I import this module and get dbutils like this:

from .myProject.functions.utils import *

db = DatabricksUtils()
dbutils = db.get_dbutils()
 
This works when I test it locally in VS Code, but when the Azure Pipeline build runs my unit tests, they fail with this message:
ModuleNotFoundError: No module named 'pyspark.dbutils'
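
One direction that might be worth trying, as a sketch and not a confirmed fix: as far as I know, pyspark.dbutils ships with the Databricks runtime and Databricks Connect, not with the open-source pyspark package from PyPI. If the build agent installs plain pyspark, the module-level import would fail exactly like this. Assuming that is the cause, the import can be deferred and guarded so that unit tests can inject a stand-in. The helper name load_dbutils_class and the spark/dbutils constructor parameters below are hypothetical and not part of the original class.

```python
import importlib
from typing import Any, Optional


def load_dbutils_class() -> Optional[type]:
    """Return the DBUtils class if pyspark.dbutils is importable, else None.

    Hypothetical helper: it only guards the import and does not install anything.
    """
    try:
        module = importlib.import_module("pyspark.dbutils")
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None
    return getattr(module, "DBUtils", None)


class DatabricksUtils:
    def __init__(self, spark: Any = None, dbutils: Any = None):
        # Both arguments are optional so tests can pass in fakes.
        self.spark = spark
        self.dbutils = dbutils
        if self.dbutils is None:
            dbutils_cls = load_dbutils_class()
            if dbutils_cls is not None:
                if self.spark is None:
                    # Imported lazily so this module still imports without pyspark.
                    from pyspark.sql import SparkSession
                    self.spark = SparkSession.getActiveSession()
                self.dbutils = dbutils_cls(self.spark)

    def get_dbutils(self) -> Any:
        if self.dbutils is None:
            raise RuntimeError(
                "pyspark.dbutils is not available in this environment; "
                "pass a dbutils object explicitly."
            )
        return self.dbutils
```

In a unit test on the build agent, a stub could then be passed in, for example DatabricksUtils(dbutils=my_fake_dbutils), so the test never touches pyspark.dbutils. This does not make the real dbutils available in the pipeline; it only keeps the import from failing at module load time.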