Data Engineering

Forum Posts

User16869510359
by Esteemed Contributor
  • 867 Views
  • 1 reply
  • 0 kudos
Latest Reply
aladda
Honored Contributor II
  • 0 kudos

Global: run on every cluster in the workspace. They can help you to enforce consistent cluster configurations across your workspace. Use them carefully because they can cause unanticipated impacts, like library conflicts. Only admin users can create ...
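Since global init scripts are admin-only and easy to misuse, here is a minimal sketch of creating one through the Global Init Scripts API. The script body, name, workspace URL, and token are all hypothetical placeholders; the API expects the script content base64-encoded.

```python
import base64

# Hypothetical init script body; admin-only (sketch, not verified against a
# live workspace).
script = "#!/bin/bash\npip install my-internal-lib\n"

payload = {
    "name": "install-internal-lib",                        # display name in the admin console
    "script": base64.b64encode(script.encode()).decode(),  # API expects base64
    "enabled": True,
    "position": 0,                                         # run order among global scripts
}
# An admin would POST this JSON to
# https://<workspace-url>/api/2.0/global-init-scripts with a bearer token.
```

Keeping the script body small and idempotent limits the blast radius if it conflicts with cluster-scoped libraries.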

User16783853906
by Contributor III
  • 1518 Views
  • 3 replies
  • 0 kudos

Resolved! Frequent spot loss of driver nodes resulting in failed jobs when using spot fleet pools

When using spot fleet pools to schedule jobs, driver and worker nodes are provisioned from the spot pools, and we are noticing jobs failing with the below exception when there is a driver spot loss. Please share best practices around using fleet pools with 1...

Latest Reply
User16783853906
Contributor III
  • 0 kudos

In this scenario, the driver node is reclaimed by AWS. Databricks has started a preview of the hybrid pools feature, which allows you to provision the driver node from a different pool. We recommend using an on-demand pool for the driver node to improve reliability i...
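The recommendation above can be sketched as a Clusters API 2.0 create request using hybrid pools: workers come from a spot fleet pool while the driver comes from an on-demand pool. The pool IDs, cluster name, and worker count below are placeholders.

```python
# Sketch of a hybrid-pools cluster spec; all values are placeholders.
cluster_spec = {
    "cluster_name": "jobs-with-hybrid-pools",
    "spark_version": "11.3.x-scala2.12",
    "num_workers": 8,
    "instance_pool_id": "pool-spot-workers",            # spot fleet pool (workers)
    "driver_instance_pool_id": "pool-ondemand-driver",  # on-demand pool (driver)
}
# POST this JSON to https://<workspace-url>/api/2.0/clusters/create
```

Because the driver is a single point of failure for a job, paying on-demand rates for one node while keeping workers on spot is usually a good trade-off.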

2 More Replies
User16869510359
by Esteemed Contributor
  • 1338 Views
  • 1 reply
  • 0 kudos

Resolved! How to uninstall libraries that are set to auto-install on all cluster - using REST API

I have a bunch of libraries that I want to uninstall. All of them are marked as auto-install.

Latest Reply
User16869510359
Esteemed Contributor
  • 0 kudos

1) Find the corresponding library definition from an existing cluster using "libraries/cluster-status?cluster_id=<cluster_id>":
$ curl -X GET 'https://cust-success.cloud.databricks.com/api/2.0/libraries/cluster-status?cluster_id=1226-232931-cuffs129' ...
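A sketch of the follow-on step: turn the cluster-status response into the body for `POST /api/2.0/libraries/uninstall`. The sample response below is illustrative (in practice it comes from the GET call above), and the uninstall only takes effect after the cluster is restarted.

```python
# Illustrative cluster-status response; real responses come from
# GET /api/2.0/libraries/cluster-status?cluster_id=<cluster_id>.
status_response = {
    "cluster_id": "1226-232931-cuffs129",
    "library_statuses": [
        {"library": {"pypi": {"package": "simplejson"}}, "status": "INSTALLED"},
        {"library": {"jar": "dbfs:/libs/my-lib.jar"}, "status": "INSTALLED"},
    ],
}

# Re-use each library definition verbatim in the uninstall body.
uninstall_payload = {
    "cluster_id": status_response["cluster_id"],
    "libraries": [s["library"] for s in status_response["library_statuses"]],
}
# POST uninstall_payload to https://<workspace-url>/api/2.0/libraries/uninstall,
# then restart the cluster for the uninstall to apply.
```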

User16869510359
by Esteemed Contributor
  • 1750 Views
  • 1 reply
  • 0 kudos

Resolved! I can't find my cluster

I had a cluster that I used in the past. I do not see the cluster any longer. I checked with the admin and my team, and everyone confirmed that there was no user deletion.

Latest Reply
User16869510359
Esteemed Contributor
  • 0 kudos

If a cluster is unused for 30 days, Databricks removes it. This is a general clean-up policy. It's possible to exempt a cluster from this clean-up by pinning it. https://docs.databricks.com/clusters/clusters-manage.html#pin-a-c...
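Pinning can also be done through the REST API; a minimal sketch, where the cluster ID is a placeholder and the call requires admin privileges:

```python
# Pin a cluster so the 30-day clean-up skips it (admin only; ID is a placeholder).
pin_payload = {"cluster_id": "1226-232931-cuffs129"}
# POST pin_payload to https://<workspace-url>/api/2.0/clusters/pin;
# POST /api/2.0/clusters/unpin reverses it.
```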

User16869510359
by Esteemed Contributor
  • 1292 Views
  • 1 reply
  • 0 kudos

Resolved! How to restart the cluster with new instances?

Whenever I restart a Databricks cluster, new instances are not launched. This is because Databricks re-uses the instances. However, sometimes new instances are needed, for example to mitigate a bad VM issue or to get a patch fr...

Latest Reply
User16869510359
Esteemed Contributor
  • 0 kudos

Currently, there is no direct option to restart the cluster with new instances. An easy workaround to ensure new instances are launched is to add cluster tags to your cluster. This ensures that new instances have to be acquired, as it's not possible to ...
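The tag trick can be sketched as a `clusters/edit` call: changing a custom tag forces fresh instances on the next restart, because tags are applied at the cloud-provider level and cannot change on a running instance. Note that `clusters/edit` requires the full cluster spec; every value below is a placeholder.

```python
import time

# Edit the cluster with a changed custom tag to force new instances on restart.
# All IDs and sizes are placeholders.
edit_payload = {
    "cluster_id": "1226-232931-cuffs129",
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "custom_tags": {"restart_nonce": str(int(time.time()))},  # changes each edit
}
# POST edit_payload to https://<workspace-url>/api/2.0/clusters/edit,
# then restart the cluster.
```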

User16869510359
by Esteemed Contributor
  • 1022 Views
  • 1 reply
  • 1 kudos

Resolved! Cluster logs missing

On the Databricks cluster UI, when I click on the Driver logs, sometimes I see historic logs and sometimes I see logs only for the last few hours. Why do we see this inconsistency?

Latest Reply
User16869510359
Esteemed Contributor
  • 1 kudos

This is working as designed and is the expected behavior. When the cluster is in a terminated state, the logs are served by the Spark History Server hosted on the Databricks control plane. When the cluster is up and running, the logs are served by ...

User16869510359
by Esteemed Contributor
  • 1053 Views
  • 1 reply
  • 0 kudos

Resolved! I do not have any Spark jobs running, but my cluster is not getting auto-terminated.

The cluster is idle and there are no Spark jobs running on the Spark UI. Still, I see that my cluster is active and not getting terminated.

Latest Reply
User16869510359
Esteemed Contributor
  • 0 kudos

A Databricks cluster is treated as active if there are any Spark or non-Spark operations running on it. Even though there are no Spark jobs running on the cluster, it's possible to have some driver-specific application code running, marking th...

User16826994223
by Honored Contributor III
  • 2678 Views
  • 1 reply
  • 1 kudos

Resolved! cluster start Issues

Some of the jobs are failing in prod with the below error message. Can you please check and let us know the reason for this? These are running under a pool cluster. Run result unavailable: job failed with error message: Unexpected failure while waiting for the...

Latest Reply
Mooune_DBU
Valued Contributor
  • 1 kudos

@Kunal Gaurav​, this status code only occurs in one of two conditions: we're able to request the instances for the cluster but can't bootstrap them in time, or we set up the containers on each instance but can't start the containers in time. This is an edg...

User16790091296
by Contributor II
  • 991 Views
  • 2 replies
  • 0 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Generally it is limited by the cloud provider. Initially you get around 350 cores, which can be increased by request to the cloud vendor. So far I have seen 1000 cores, and it can go much higher. In addition to subscription limits, the total capacity of cluster...

1 More Replies
User15787040559
by New Contributor III
  • 1346 Views
  • 1 reply
  • 0 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

In addition to subscription limits, the total capacity of clusters in each workspace is a function of the masks used for the workspace's enclosing Vnet and the pair of subnets associated with each cluster in the workspace. The masks can be changed if...
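The subnet-mask limit can be turned into back-of-the-envelope capacity math: each cluster node consumes one IP address in each of the workspace's two subnets, and Azure reserves 5 addresses per subnet. The mask sizes below are illustrative.

```python
# Usable addresses in a subnet with the given prefix length, assuming the
# cloud provider reserves 5 addresses per subnet (Azure's behavior).
def usable_hosts(mask_bits: int, reserved: int = 5) -> int:
    return 2 ** (32 - mask_bits) - reserved

# e.g. a pair of /26 subnets caps the workspace at 59 nodes,
# while /18 subnets allow 16379.
```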

User16826994223
by Honored Contributor III
  • 666 Views
  • 1 reply
  • 0 kudos

Z ordering best practices

What are the best practices around Z-ordering? Should we include as many columns as possible in the Z-order, or is fewer better, and why?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

With Z-order and Hilbert curves, the effectiveness of clustering decreases with each column added, so you'd want to Z-order only on the columns that you'd actually use, so that it speeds up your workloads.
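A minimal sketch of putting this into practice: build the Delta Lake `OPTIMIZE ... ZORDER BY` statement over a small number of frequently filtered columns. The table and column names are hypothetical.

```python
# Build an OPTIMIZE ... ZORDER BY statement; keep the column list short,
# since clustering effectiveness drops with each added column.
def zorder_sql(table: str, columns: list[str]) -> str:
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(columns)})"

stmt = zorder_sql("events", ["event_date", "user_id"])
# spark.sql(stmt)  # run on a cluster with Delta Lake
```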
