Hello :)
As part of migrating an app that previously ran directly on EMR to Databricks, we are running experiments on Databricks Runtime 9.1 LTS and getting the following error:
PythonException: An exception was thrown from a UDF: 'pyspark.serializers.SerializationError: Caused by Traceback (most recent call last):
File "/databricks/spark/python/pyspark/serializers.py", line 165, in _read_with_length
return self.loads(obj)
File "/databricks/spark/python/pyspark/serializers.py", line 469, in loads
return pickle.loads(obj, encoding=encoding)
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 844, in exec_module
File "<frozen importlib._bootstrap_external>", line 980, in get_code
File "<frozen importlib._bootstrap_external>", line 1038, in get_data
OSError: [Errno 78] Remote address changed'. Full traceback below:
Traceback (most recent call last):
File "/databricks/spark/python/pyspark/serializers.py", line 165, in _read_with_length
return self.loads(obj)
File "/databricks/spark/python/pyspark/serializers.py", line 469, in loads
return pickle.loads(obj, encoding=encoding)
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 844, in exec_module
File "<frozen importlib._bootstrap_external>", line 980, in get_code
File "<frozen importlib._bootstrap_external>", line 1038, in get_data
OSError: [Errno 78] Remote address changed
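The traceback fails inside `get_data`, that is, while Python reads a module's source file during the import triggered by unpickling the UDF. One way to narrow this down might be to check where a given module resolves from and whether that file can be read at all. Below is a minimal sketch using only the standard library; `probe_module` is a hypothetical helper name, not part of PySpark or Databricks, and the module name passed to it is whatever the UDF imports.

```python
import importlib.util


def probe_module(name):
    """Report where `name` would be imported from and whether that file is readable.

    Hypothetical diagnostic helper: it mirrors the step that fails in the
    traceback (reading the module's file) without actually importing the module.
    """
    result = {"module": name, "origin": None, "readable": None, "error": None}
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError) as exc:
        # Raised e.g. when a parent package of a dotted name cannot be imported.
        result["error"] = repr(exc)
        return result
    if spec is None:
        result["error"] = "module not found on sys.path"
        return result
    result["origin"] = spec.origin
    if not spec.has_location:
        # Built-in or frozen modules have no file to read.
        return result
    try:
        with open(spec.origin, "rb") as handle:
            handle.read()
        result["readable"] = True
    except OSError as exc:
        # An Errno 78 here would point at the file system backing this path.
        result["readable"] = False
        result["error"] = repr(exc)
    return result


if __name__ == "__main__":
    print(probe_module("json"))
```

Running it on the driver and then on the workers, for example via something like `spark.range(1).rdd.map(lambda _: probe_module("my_module")).collect()` where `my_module` stands in for the real module name, might show whether the module's path is readable on the driver but not on the executors.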
The same UDF works on EMR, and it is part of the codebase that Databricks uses as its Git source.
Has anyone encountered this error?
I would appreciate any advice on troubleshooting it. Thanks!