Cannot use RDD and cannot set "spark.databricks.pyspark.enablePy4JSecurity false" for cluster

I have been using "rdd.flatMap(lambda x: x)" for a while to create lists from columns. However, after changing the cluster to Shared access mode (to use Unity Catalog), I get the following error: Method public org.apache.spark.rdd.RDD is not whitelisted on class class

I have tried to solve the error by adding:

"spark.databricks.pyspark.enablePy4JSecurity false"

however I then get the following error:

"spark.databricks.pyspark.enablePy4JSecurity is not allowed when choosing an access mode"

Does anybody know how to use RDDs on a cluster set up for Unity Catalog?

Thank you!
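
One possible workaround, assuming the goal is only to turn column values into a Python list: the DataFrame API remains available in Shared access mode, so the values can be collected without touching ".rdd" at all. Below is a minimal sketch; "column_to_list" is a hypothetical helper name (not part of PySpark), and "df" stands for whatever DataFrame you already have.

```python
def column_to_list(df, *cols):
    """Flatten the selected columns of a DataFrame into one Python list.

    Intended to mirror df.select(*cols).rdd.flatMap(lambda x: x).collect(),
    but using only DataFrame calls (select, collect).
    `column_to_list` is a hypothetical helper, not a PySpark function.
    """
    rows = df.select(*cols).collect()  # list of Row objects on the driver
    # A Row behaves like a tuple, so iterating over it yields its values.
    return [value for row in rows for value in row]
```

Usage would look like "ids = column_to_list(df, "id")". Like the RDD version, this pulls every value onto the driver, so it is only suitable for columns that fit in driver memory.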


I was having a similar issue in using
I solved it by adding two key-value pairs to the Spark config for the cluster:

spark.databricks.pyspark.enablePy4JSecurity false



After this I was able to read the schema of the JSON from the column that had been read as a string:

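    # "df" is an assumed name for the DataFrame holding the string column
    json_schema = spark.read.json(df.rdd.map(lambda row: row.preferences)).schema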

Did you try this on a UC-enabled cluster?

In my case the problem was that we were trying to use SparkXGBoostRegressor, and the docs say that it does not work on clusters with autoscaling enabled. So we just disabled autoscaling for the interactive cluster where we were testing the model, and it worked like a charm 🙂


Hope it helps

