10-08-2021 08:24 AM
Hello, I am trying to use MLflow on a new high concurrency cluster, but I get the error below. Does anyone have any suggestions? It was working before on a standard cluster. Thanks.
py4j.security.Py4JSecurityException: Method public int org.apache.spark.SparkContext.maxNumConcurrentTasks() is not whitelisted on class class org.apache.spark.SparkContext
---------------------------------------------------------------------------
Py4JError                                 Traceback (most recent call last)
<command-2769834740329298> in <module>
     32 # Greater parallelism will lead to speedups, but a less optimal hyperparameter sweep.
     33 # A reasonable value for parallelism is the square root of max_evals.
---> 34 spark_trials = SparkTrials(parallelism=10)
     35
     36

/databricks/.python_edge_libs/hyperopt/spark.py in __init__(self, parallelism, timeout, loss_threshold, spark_session)
    101             )
    102         # maxNumConcurrentTasks() is a package private API
--> 103         max_num_concurrent_tasks = self._spark_context._jsc.sc().maxNumConcurrentTasks()
    104         spark_default_parallelism = self._spark_context.defaultParallelism
    105         self.parallelism = self._decide_parallelism(

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1303         answer = self.gateway_client.send_command(command)
   1304         return_value = get_return_value(
-> 1305             answer, self.gateway_client, self.target_id, self.name)
   1306
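For context, a minimal sketch of the kind of hyperopt + SparkTrials setup that hits this error. The objective function and search space below are illustrative placeholders, not the original notebook's code:

# Sketch only: the objective and search space are hypothetical stand-ins.
from hyperopt import fmin, tpe, hp, SparkTrials

def objective(x):
    # Placeholder loss; a real workload would train and evaluate a model here.
    return (x - 3) ** 2

search_space = hp.uniform("x", -10, 10)

# On a high concurrency cluster this constructor fails, because it calls
# SparkContext.maxNumConcurrentTasks(), which the py4j security layer blocks.
spark_trials = SparkTrials(parallelism=10)

best = fmin(fn=objective, space=search_space, algo=tpe.suggest,
            max_evals=100, trials=spark_trials)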
10-08-2021 01:44 PM
Hello, @Tom Soto! My name is Piper and I'm a moderator for Databricks. It's great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first; otherwise, I will follow up shortly with a response.
10-08-2021 04:42 PM
hi @Tom Soto ,
The error message comes from the high concurrency cluster's security model. This is a built-in security model that restricts access to your data, which is why code that works on a standard cluster may fail on a high concurrency cluster.
10-13-2021 08:33 AM
Thank you for your response. I appreciate this, but are you aware of any workaround that would let me use a high concurrency cluster, since the issue is with a built-in Databricks function?
10-19-2021 06:05 AM
@Tom Soto We have a workaround for this. The following cluster Spark configuration setting disables Py4J security while still enabling passthrough:
spark.databricks.pyspark.enablePy4JSecurity false
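If it helps, here is one way to confirm the setting took effect, as a hedged sketch: it assumes the line above was added under the cluster's Spark config (Advanced Options > Spark) and the cluster was restarted, and it uses the `spark` session that Databricks notebooks provide.

# Sketch only: check from a notebook whether the cluster picked up the setting.
print(spark.sparkContext.getConf().get(
    "spark.databricks.pyspark.enablePy4JSecurity", "not set"))
# Expected output once applied: false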
10-29-2021 06:36 AM
Thank you very much. This workaround worked for me.
10-29-2021 03:04 PM
@Tom Soto - If Pradpalnis fully answered your question, would you be happy to mark their answer as best so that others can quickly find the solution?