Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

sunil_smile
by Contributor
  • 13196 Views
  • 8 replies
  • 10 kudos

Resolved! How can I add ADLS Gen2 OAuth 2.0 authentication at cluster scope for my high concurrency shared cluster (without Unity Catalog)?

Hi all, kindly help me: how can I add ADLS Gen2 OAuth 2.0 authentication to my high concurrency shared cluster? I want to scope this authentication to the entire cluster, not to a particular notebook. Currently I have added them as spark configuration o...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 10 kudos

The error is caused by missing default settings (create a new cluster and do not remove them). The warning appears because secrets should be stored in a secret scope and then referenced in the settings.
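To illustrate the advice about referencing secrets rather than pasting them in, here is a minimal sketch that builds cluster-level Spark configuration lines for ADLS Gen2 OAuth. It assumes the `{{secrets/<scope>/<key>}}` reference syntax and the `spark.hadoop.` prefix used in the cluster Spark config box. The function names, storage account, tenant ID, scope, and key names are all hypothetical placeholders, not values from the thread.

```python
# Sketch only: builds Spark config entries for ADLS Gen2 OAuth (service principal).
# All argument values used below are hypothetical placeholders.

def secret_ref(scope, key):
    """Return a secret reference in the form used in cluster Spark config."""
    return "{{secrets/" + scope + "/" + key + "}}"


def adls_oauth_spark_conf(storage_account, tenant_id, scope,
                          client_id_key, client_secret_key):
    """Map each Spark config key to its value for one storage account."""
    suffix = storage_account + ".dfs.core.windows.net"
    prefix = "spark.hadoop.fs.azure.account"
    return {
        prefix + ".auth.type." + suffix: "OAuth",
        prefix + ".oauth.provider.type." + suffix:
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        # The client ID and secret are referenced from the secret scope,
        # never written in plain text.
        prefix + ".oauth2.client.id." + suffix: secret_ref(scope, client_id_key),
        prefix + ".oauth2.client.secret." + suffix: secret_ref(scope, client_secret_key),
        prefix + ".oauth2.client.endpoint." + suffix:
            "https://login.microsoftonline.com/" + tenant_id + "/oauth2/token",
    }


def render_spark_conf(conf):
    """Render the mapping as 'key value' lines, one setting per line."""
    return "\n".join(key + " " + value for key, value in conf.items())
```

Calling `render_spark_conf(adls_oauth_spark_conf("mystorage", "my-tenant-id", "my-scope", "sp-client-id", "sp-client-secret"))` produces text that could be pasted into the cluster's Spark config, in addition to (not instead of) the default settings mentioned in the reply.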

7 More Replies
LukaszJ
by Contributor III
  • 4203 Views
  • 5 replies
  • 4 kudos

Resolved! Mount Azure Blob Storage with Cluster access control

Hello. I want to mount a container from Azure Blob Storage (it could be simple Blob Storage or Azure Data Lake Storage Gen2) and share it with one group. But I am not able to do it because I am using a cluster with Table Access Control. This is my cod...

Latest Reply
LukaszJ
Contributor III
  • 4 kudos

I have a good solution to the problem: I am using a Python library. There is some documentation. Topic to be closed. Best regards, Łukasz

4 More Replies
LukaszJ
by Contributor III
  • 1829 Views
  • 2 replies
  • 1 kudos

Table access control cluster with R language

Hello, I want to have a high concurrency cluster with table access control, and I want to use the R language on it. I know the documentation says that R and Scala are not available with table access control. But maybe you have some tricks or best practic...

Latest Reply
Aashita
Databricks Employee
  • 1 kudos

@Łukasz Jaremek, currently it is only available in Python and SQL.

1 More Replies
Tahseen0354
by Valued Contributor
  • 3110 Views
  • 4 replies
  • 2 kudos

Resolved! A Standard cluster is recommended for a single user - what is meant by that?

Hi, I have seen it written in the documentation that a standard cluster is recommended for a single user. But why? What is meant by that? One of my colleagues and I were testing it on the same notebook. Both of us can use the same standard all purpo...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

A high concurrency cluster just splits resources between users more evenly. So when 4 people run notebooks at the same time on a cluster with 4 CPUs, you can imagine that each person will get 1 CPU. On a standard cluster, 1 person could utilize all worker CPUs as you...

3 More Replies
TJS
by New Contributor II
  • 16781 Views
  • 6 replies
  • 5 kudos

Resolved! Can you help with this error please? Issue when using a new high concurrency cluster

Hello, I am trying to use MLflow on a new high concurrency cluster but I get the error below. Does anyone have any suggestions? It was working before on a standard cluster. Thanks. py4j.security.Py4JSecurityException: Method public int org.apache.spar...

Latest Reply
Pradeep54
Databricks Employee
  • 5 kudos

@Tom Soto We have a workaround for this. The following cluster Spark configuration setting will disable Py4J security while still enabling passthrough: spark.databricks.pyspark.enablePy4JSecurity false

5 More Replies
DouglasLinder
by New Contributor III
  • 10435 Views
  • 4 replies
  • 1 kudos

Is it possible to pass configuration to a job on high concurrency cluster?

On a regular cluster, you can use: ```spark.sparkContext._jsc.hadoopConfiguration().set(key, value)``` These values are then available on the executors using the Hadoop configuration. However, on a high concurrency cluster, attempting to do so results ...

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 1 kudos

I am not sure why you are getting that error on a high concurrency cluster, as I am able to set the configuration as you show above. Can you try the following code instead? sc._jsc.hadoopConfiguration().set(key, value)
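As a rough sketch of the suggestion above, the call can be wrapped in a small helper that applies several settings at once. The helper name `apply_hadoop_conf` and the example key are hypothetical. It only uses the `_jsc.hadoopConfiguration().set(key, value)` call quoted in this thread, and, given the problem reported in the question, it may not work on every high concurrency cluster.

```python
# Sketch only: applies a dict of settings to the Hadoop configuration
# reachable from a SparkContext. `apply_hadoop_conf` is a hypothetical helper.

def apply_hadoop_conf(sc, settings):
    """Set each key/value pair on sc's Hadoop configuration and return it."""
    hadoop_conf = sc._jsc.hadoopConfiguration()
    for key, value in settings.items():
        # Values are passed as strings, matching set(key, value) in the thread.
        hadoop_conf.set(key, str(value))
    return hadoop_conf


# In a notebook, where `sc` (or `spark.sparkContext`) is already defined:
# apply_hadoop_conf(sc, {"my.custom.key": "some-value"})
```

Passing the context in as an argument keeps the helper independent of notebook globals, so the same function works with either `sc` or `spark.sparkContext`.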

3 More Replies
brickster_2018
by Databricks Employee
  • 3080 Views
  • 1 replies
  • 0 kudos

Resolved! Super slow SQL queries on an HC cluster

I have a high concurrency cluster where multiple users are running queries. However, the queries are running very slowly. I debugged the logs and see that more time is spent on the Spark driver. In the Spark UI, I do not see slowness.

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

It's possible that connectivity to the Hive metastore is causing the delay here, when there is a high degree of concurrency and contention for metastore access. Interactive clusters in DBR are configured to use up to 5 (spark.databricks.hive.metastore.cli...

Anonymous
by Not applicable
  • 966 Views
  • 1 replies
  • 2 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 2 kudos

Scala uses the JVM to run its code, and it cannot run different applications at the same time with complete isolation of each task inside a single JVM. That is the reason Scala is not supported on high concurrency clusters. I don't think it is on the roadmap.
