@hyunho lee: It sounds like PySpark's serializer cannot find the necessary modules when your tests run under Pytest. One thing to try is setting the PYTHONPATH environment variable so that it includes the path to your PySpark installation. You can do this by adding the following lines at the top of your test script, so they run before the Spark session is created:
import os
os.environ['PYTHONPATH'] = '/path/to/pyspark'
Replace /path/to/pyspark with the actual path to your PySpark installation directory.
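Note that assigning PYTHONPATH directly overwrites anything that was already on it. Here is a minimal sketch of a safer variant that prepends the directory and keeps the existing entries. The helper name prepend_pythonpath is made up for illustration and is not part of PySpark:

```python
import os


def prepend_pythonpath(path, env=None):
    """Prepend `path` to PYTHONPATH, keeping any entries already set.

    Hypothetical helper for illustration; not a PySpark API.
    """
    env = os.environ if env is None else env
    existing = env.get("PYTHONPATH", "")
    # Drop empty entries and avoid listing `path` twice.
    parts = [path] + [p for p in existing.split(os.pathsep) if p and p != path]
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env["PYTHONPATH"]
```

You would call prepend_pythonpath('/path/to/pyspark') before the Spark session is created, for example at the top of conftest.py, since the processes PySpark launches inherit the environment as it is at that moment.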
Another option is the PYSPARK_PYTHON environment variable, which tells PySpark which Python executable to use. Set it to the interpreter in which PySpark is installed, again before the Spark session is created. For example:
import os
os.environ['PYSPARK_PYTHON'] = '/path/to/python'
Replace /path/to/python with the actual path to your Python executable.
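If the interpreter running Pytest is the same one that has PySpark installed, you may not need to hard-code the path at all. A small sketch, assuming that is the case (the function name use_current_interpreter is just an illustrative choice):

```python
import os
import sys


def use_current_interpreter(env=None):
    """Point PYSPARK_PYTHON at the interpreter running this process.

    Assumes Pytest runs under the same Python that has PySpark installed.
    Hypothetical helper for illustration; not a PySpark API.
    """
    env = os.environ if env is None else env
    env["PYSPARK_PYTHON"] = sys.executable
    return env["PYSPARK_PYTHON"]
```

Calling use_current_interpreter() at the top of conftest.py keeps the setting in sync with whichever virtual environment you launch Pytest from.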
I hope this helps! Let me know if you have any further questions.