Hello @Yuan Gao,
On Databricks, spark and dbutils are automatically injected only into the main entrypoint, i.e. your notebook; they aren't propagated to the Python modules you import. For spark the solution is easy: call the getActiveSession function of the SparkSession class (as SparkSession.getActiveSession()). For dbutils, however, you need to keep passing it explicitly, unless you abstract obtaining it into a function.
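For example, a helper module can pick up the notebook's session on its own. This is a minimal sketch; the module name my_module.py and the function read_table are hypothetical:

```python
# my_module.py -- hypothetical helper module imported from a notebook
from pyspark.sql import SparkSession

def read_table(name: str):
    # Reuse the session Databricks already created for the notebook,
    # instead of receiving it as a parameter
    spark = SparkSession.getActiveSession()
    return spark.read.table(name)
```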
The Databricks Connect documentation shows an example of how this can be done. That example takes SparkSession as an explicit parameter, but it can be modified to avoid that entirely, with something like this:
```python
def get_dbutils():
    from pyspark.sql import SparkSession

    # Reuse the session that Databricks injected into the notebook
    spark = SparkSession.getActiveSession()
    # The default "false" avoids an error when the config key isn't set
    if spark.conf.get("spark.databricks.service.client.enabled", "false") == "true":
        # Running via Databricks Connect: build DBUtils from the session
        from pyspark.dbutils import DBUtils
        return DBUtils(spark)
    else:
        # Running inside a Databricks notebook: fetch dbutils
        # from the IPython user namespace
        import IPython
        return IPython.get_ipython().user_ns["dbutils"]
```
Then, inside your module functions, you can call get_dbutils to obtain the dbutils functionality without passing it around.
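For instance, a module function could then look like this. This is a minimal sketch; copy_file and the paths are hypothetical:

```python
# Hypothetical module function that needs dbutils
# without receiving it as a parameter
def copy_file(src: str, dst: str):
    dbutils = get_dbutils()
    dbutils.fs.cp(src, dst)

# Usage from a notebook or from Databricks Connect:
# copy_file("dbfs:/tmp/input.csv", "dbfs:/tmp/output.csv")
```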