Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

ModuleNotFoundError: No module named 'pulp'

YS1
Contributor

Hello,

I'm encountering an issue while running a notebook that uses the PuLP library. The library is installed in the first cell of the notebook, yet I occasionally get the following error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 92 in stage 51.0 failed 4 times, most recent failure: Lost task 92.3 in stage 51.0 (TID 4465) (10.153.242.115 executor 4): org.apache.spark.SparkException: Task failed while writing rows.

During handling of the above exception, another exception occurred:

pyspark.serializers.SerializationError: Caused by Traceback (most recent call last):
  File "/databricks/spark/python/pyspark/serializers.py", line 188, in _read_with_length
    return self.loads(obj)
  File "/databricks/spark/python/pyspark/serializers.py", line 540, in loads
    return cloudpickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'pulp'

What's puzzling is that rerunning the code often succeeds. Could anyone provide insight into why this intermittent issue might be occurring?

Thanks.

1 REPLY

I've double-checked, and the PuLP library is correctly installed. However, I'm still getting the intermittent "No module named 'pulp'" error, which is perplexing.
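The traceback shows the import failing inside `cloudpickle.loads` on executor 4, so the failure happens while an executor deserializes the task, not on the driver. One way to narrow this down might be to check whether every executor can actually import the module. Below is a minimal sketch of such a check. The helper names `probe_module` and `probe_partition` are hypothetical, and the commented Spark usage assumes the notebook's usual `spark` session is available and that 64 partitions is enough to reach every executor, which is not guaranteed.

```python
import importlib.util
import socket


def probe_module(module_name):
    """Return (hostname, found) for whether module_name is importable here."""
    try:
        found = importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for missing parent packages or odd module state
        found = False
    return socket.gethostname(), found


def probe_partition(_rows, module_name="pulp"):
    # Intended for rdd.mapPartitions: ignores the rows and
    # reports once per partition from whichever worker runs it.
    yield probe_module(module_name)


# Assumed usage in a notebook where `spark` already exists:
#
#   sc = spark.sparkContext
#   results = sc.parallelize(range(64), 64).mapPartitions(probe_partition).collect()
#   print(sorted({host for host, found in results if not found}))
```

If the printed set is non-empty on a run that fails, that would suggest the library is missing on some workers rather than on the driver. An empty set on a successful run does not rule this out, since the set of executors can differ between runs.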
