Databricks

TJS · ‎10-08-2021

Hello, I am trying to use MLflow on a new high concurrency cluster, but I get the error below. Does anyone have any suggestions? It was working before on a standard cluster. Thanks.

py4j.security.Py4JSecurityException: Method public int org.apache.spark.SparkContext.maxNumConcurrentTasks() is not whitelisted on class class org.apache.spark.SparkContext

---------------------------------------------------------------------------
Py4JError                                 Traceback (most recent call last)
<command-2769834740329298> in <module>
     32 # Greater parallelism will lead to speedups, but a less optimal hyperparameter sweep.
     33 # A reasonable value for parallelism is the square root of max_evals.
---> 34 spark_trials = SparkTrials(parallelism=10)
     35
     36

/databricks/.python_edge_libs/hyperopt/spark.py in __init__(self, parallelism, timeout, loss_threshold, spark_session)
    101             )
    102             # maxNumConcurrentTasks() is a package private API
--> 103             max_num_concurrent_tasks = self._spark_context._jsc.sc().maxNumConcurrentTasks()
    104             spark_default_parallelism = self._spark_context.defaultParallelism
    105             self.parallelism = self._decide_parallelism(

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1303         answer = self.gateway_client.send_command(command)
   1304         return_value = get_return_value(
-> 1305             answer, self.gateway_client, self.target_id, self.name)
   1306

User16753724828 · ‎10-19-2021

@Tom Soto We have a workaround for this. Adding the following setting to the cluster's Spark configuration disables Py4J security while still allowing passthrough:

spark.databricks.pyspark.enablePy4JSecurity false
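
If the cluster is defined programmatically rather than through the UI, the same setting can be merged into the `spark_conf` map of the cluster definition. The following is only a minimal sketch: the helper name `with_py4j_security_disabled`, the cluster name, and the pre-existing passthrough entry are assumptions made for illustration and are not taken from this thread.

```python
import copy
import json

# Key and value taken from the setting quoted above.
PY4J_SECURITY_KEY = "spark.databricks.pyspark.enablePy4JSecurity"


def with_py4j_security_disabled(cluster_spec):
    """Return a copy of cluster_spec whose spark_conf disables Py4J security.

    Hypothetical helper: it only edits a dict and does not call any Databricks API.
    """
    spec = copy.deepcopy(cluster_spec)
    # Create spark_conf if it is missing, then add the setting as a string value.
    spec.setdefault("spark_conf", {})[PY4J_SECURITY_KEY] = "false"
    return spec


# Assumed example definition; the name and the existing entry are placeholders.
example_spec = {
    "cluster_name": "example-high-concurrency",
    "spark_conf": {"spark.databricks.passthrough.enabled": "true"},
}

updated_spec = with_py4j_security_disabled(example_spec)
print(json.dumps(updated_spec, indent=2))
```

The helper copies the definition instead of editing it in place, so the original spec can still be reused for clusters that should keep Py4J security enabled.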

Anonymous · ‎10-08-2021

Hello, @Tom Soto! My name is Piper and I'm a moderator for Databricks. It's great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else I will follow up shortly with a response.

jose_gonzalez · ‎10-08-2021

Hi @Tom Soto,

The error message comes from your high concurrency cluster's security model. This is a built-in security model that restricts access to your data. Your code might work on a standard cluster but not on high concurrency clusters.
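
For notebooks that have to run on both cluster types, one possible approach is to attempt the distributed trials object first and fall back to a local one when construction is blocked. The sketch below is an assumption-laden illustration, not a Databricks-recommended pattern: `make_trials` and its factory arguments are hypothetical names, and the fallback (Hyperopt's plain `Trials`) evaluates trials sequentially on the driver, so the parallelism is lost.

```python
def make_trials(parallelism, spark_factory, local_factory):
    """Try to build distributed trials; fall back to local trials on failure.

    spark_factory: callable taking the parallelism and returning a trials object.
    local_factory: zero-argument callable returning a local trials object.
    Returns a (trials, mode) tuple where mode is "spark" or "local".
    """
    try:
        return spark_factory(parallelism), "spark"
    except Exception as exc:
        # In the traceback above, the Py4JSecurityException surfaces in Python
        # as a Py4JError; a broad except keeps this sketch dependency-free.
        print(f"Distributed trials unavailable ({exc!r}); using local trials")
        return local_factory(), "local"


# Intended usage with Hyperopt (left commented out so the sketch runs without it):
# from hyperopt import SparkTrials, Trials
# trials, mode = make_trials(10, lambda p: SparkTrials(parallelism=p), Trials)
```

Passing the constructors in as callables keeps the helper independent of Hyperopt, which makes it easy to exercise with stand-ins before running it on a cluster.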

TJS · ‎10-13-2021

Thank you for your response. I appreciate it, but are you aware of any workaround that would let me use a high concurrency cluster, given that the issue lies in a special Databricks function?