ModuleNotFoundError when running foreachBatch on serverless compute

fury-kata
New Contributor II

I am using notebooks to do some transformations.

I install a new wheel (.whl):

%pip install --force-reinstall /Workspace/<my_lib>.whl
%restart_python

Then I can successfully import the installed library:

from my_lib.core import test

However, when I run my code with foreachBatch, it raises ModuleNotFoundError: No module named 'my_lib'.

This is my code:

from my_lib.utils import clogs

logs = clogs.logs()

def _test(df, b):
    # Called once per micro-batch: df is the batch DataFrame, b is the batch id
    logs.add_logs('test')

mystream = (
    spark.readStream
    .table('my_table')
    .writeStream
    .format("delta")
    .foreachBatch(_test)
    .trigger(once=True)
    .start()
)
mystream.awaitTermination()

It raises an error: ModuleNotFoundError: No module named 'my_lib'.
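
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the function passed to foreachBatch, together with objects it references such as logs, may be serialized and then loaded in a different Python process where my_lib is not importable. The sketch below reproduces that class of error using only the standard library. The names fake_lib and Logs are hypothetical stand-ins for my_lib and its logger class. Deleting the module from sys.modules only simulates a process that never had the library installed.

```python
import pickle
import sys
import types

# Hypothetical stand-in for my_lib: a throwaway module registered at runtime.
fake_lib = types.ModuleType("fake_lib")

class Logs:
    def add_logs(self, msg):
        return f"logged: {msg}"

# Make the class look as if it had been defined inside fake_lib.
Logs.__module__ = "fake_lib"
Logs.__qualname__ = "Logs"
fake_lib.Logs = Logs
sys.modules["fake_lib"] = fake_lib

# pickle stores a reference to the class (module name + class name), not its code.
payload = pickle.dumps(Logs())

# Simulate a process where the library is not installed.
del sys.modules["fake_lib"]

def try_load(data):
    """Return 'ok' if unpickling works, otherwise the import error message."""
    try:
        pickle.loads(data)
        return "ok"
    except ModuleNotFoundError as exc:
        return str(exc)

result = try_load(payload)
print(result)
```

If this is what happens on serverless, two things may be worth trying (both untested here): importing my_lib inside _test instead of at module level, or making the wheel available to the process that actually runs the batch function.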

Please help.

 

Thank you, @Retired_mod, for your response.

Today I re-ran my job without any changes. It no longer raises the ModuleNotFoundError for my_lib that I mentioned above, but it now raises Access Denied on my S3 bucket. I don't see anywhere to set my IAM role or instance profile on serverless compute, as I did with provisioned compute.