[Pyspark.Pandas] PicklingError: Could not serialize object (this error is happening only for large datasets)
Context: I am using pyspark.pandas in a Databricks Jupyter notebook and doing some text manipulation within the dataframe. pyspark.pandas is the Pandas API on Spark and can be used exactly the same as usual Pandas. Error: PicklingError: Could not seria...
- 15773 Views
- 3 replies
- 3 kudos
Latest Reply
@Krishna Zanwar, I'm receiving the same error. For me, it happens when trying to broadcast a random forest (sklearn 1.2.0) recently loaded from mlflow and using a Pandas UDF to run predictions with the model. However, the same code works perfectly on Spark 2....
- 3 kudos
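One way to narrow down an error like this is to check, outside Spark, which of the objects a UDF captures can be serialized at all. The snippet below is only a minimal sketch under some assumptions: it uses the standard-library `pickle` as a stand-in (PySpark serializes functions with cloudpickle, which handles more cases, so a failure here is a hint rather than proof), and `find_unpicklable`, `captured`, `stopwords`, and `lock` are hypothetical names invented for illustration. It does not explain why the error would appear only on large datasets.

```python
import pickle
import threading


def find_unpicklable(candidates):
    """Return {name: error message} for each object that fails to pickle."""
    failures = {}
    for name, obj in candidates.items():
        try:
            pickle.dumps(obj)
        except Exception as exc:  # pickle can raise several exception types
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Hypothetical stand-ins for objects a UDF might capture in its closure.
captured = {
    "stopwords": {"a", "the"},   # plain data, serializes fine
    "lock": threading.Lock(),    # thread locks cannot be pickled
}

report = find_unpicklable(captured)
for name, message in report.items():
    print(name, "->", message)
```

If an object shows up in the report, a common workaround is to create it inside the function that runs on the workers instead of capturing it from the notebook scope.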