
ModuleNotFoundError when run with foreachBatch on serverless mode

fury-kata
New Contributor II

I am using notebooks to do some transformations.

I install a new whl:

 

 

%pip install --force-reinstall /Workspace/<my_lib>.whl
%restart_python

 

 

Then I can successfully import the installed library:

 

 

from my_lib.core import test

 

 

However, when I run my code with foreachBatch, it raises ModuleNotFoundError: No module named 'my_lib'.

This is my code:

 

 

from my_lib.utils import clogs

logs = clogs.logs()

def _test(df, b):
    # Called once per micro-batch: df is the batch DataFrame, b is the batch ID.
    logs.add_logs('test')

mystream = (
    spark.readStream
    .table('my_table')
    .writeStream
    .format("delta")
    .foreachBatch(_test)
    .trigger(once=True)
    .start()
)
mystream.awaitTermination()

 

 

It raises an error: ModuleNotFoundError: No module named 'my_lib'.

Please help

 

1 REPLY

Thank you, @Retired_mod, for your response.

Today, I re-ran my job without any changes. It no longer raises the ModuleNotFoundError for my_lib that I mentioned above, but it now raises Access Denied on my S3 bucket. I don't see anywhere to set my IAM role or instance profile on serverless, as I did with provisioned compute.

 
