Data Engineering

Forum Posts

User16869510359
by Esteemed Contributor
  • 867 Views
  • 1 reply
  • 0 kudos
Latest Reply
aladda
Honored Contributor II
  • 0 kudos

Global: run on every cluster in the workspace. They can help you to enforce consistent cluster configurations across your workspace. Use them carefully because they can cause unanticipated impacts, like library conflicts. Only admin users can create ...
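Since global init scripts are admin-only and easy to misuse, here is a minimal sketch of creating one through the Global Init Scripts API. The script body, name, workspace URL, and token are all hypothetical placeholders; the API expects the script content base64-encoded.

```python
import base64

# Hypothetical init script body; admin-only (sketch, not verified against a
# live workspace).
script = "#!/bin/bash\npip install my-internal-lib\n"

payload = {
    "name": "install-internal-lib",                        # display name in the admin console
    "script": base64.b64encode(script.encode()).decode(),  # API expects base64
    "enabled": True,
    "position": 0,                                         # run order among global scripts
}
# An admin would POST this JSON to
# https://<workspace-url>/api/2.0/global-init-scripts with a bearer token.
```

Keeping the script body small and idempotent limits the blast radius if it conflicts with cluster-scoped libraries.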

User16783853906
by Contributor III
  • 1518 Views
  • 3 replies
  • 0 kudos

Resolved! Frequent spot loss of driver nodes resulting in failed jobs when using spot fleet pools

When using spot fleet pools to schedule jobs, driver and worker nodes are provisioned from the spot pools, and we are noticing jobs failing with the below exception when there is a driver spot loss. Please share best practices around using fleet pools with 1...

Latest Reply
User16783853906
Contributor III
  • 0 kudos

In this scenario, the driver node is reclaimed by AWS. Databricks has started a preview of the hybrid pools feature, which allows you to provision the driver node from a different pool. We recommend using an on-demand pool for the driver node to improve reliability i...
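The recommendation above can be sketched as a Clusters API 2.0 create request using hybrid pools: workers come from a spot fleet pool while the driver comes from an on-demand pool. The pool IDs, cluster name, and worker count below are placeholders.

```python
# Sketch of a hybrid-pools cluster spec; all values are placeholders.
cluster_spec = {
    "cluster_name": "jobs-with-hybrid-pools",
    "spark_version": "11.3.x-scala2.12",
    "num_workers": 8,
    "instance_pool_id": "pool-spot-workers",            # spot fleet pool (workers)
    "driver_instance_pool_id": "pool-ondemand-driver",  # on-demand pool (driver)
}
# POST this JSON to https://<workspace-url>/api/2.0/clusters/create
```

Because the driver is a single point of failure for a job, paying on-demand rates for one node while keeping workers on spot is usually a good trade-off.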

2 More Replies
User16869510359
by Esteemed Contributor
  • 1338 Views
  • 1 reply
  • 0 kudos

Resolved! How to uninstall libraries that are set to auto-install on all cluster - using REST API

I have a bunch of libraries that I want to uninstall. All of them are marked as auto-install.

Latest Reply
User16869510359
Esteemed Contributor
  • 0 kudos

1) Find the corresponding library definition from an existing cluster using "libraries/cluster-status?cluster_id=<cluster_id>":
$ curl -X GET 'https://cust-success.cloud.databricks.com/api/2.0/libraries/cluster-status?cluster_id=1226-232931-cuffs129' ...
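A sketch of the follow-on step: turn the cluster-status response into the body for `POST /api/2.0/libraries/uninstall`. The sample response below is illustrative (in practice it comes from the GET call above), and the uninstall only takes effect after the cluster is restarted.

```python
# Illustrative cluster-status response; real responses come from
# GET /api/2.0/libraries/cluster-status?cluster_id=<cluster_id>.
status_response = {
    "cluster_id": "1226-232931-cuffs129",
    "library_statuses": [
        {"library": {"pypi": {"package": "simplejson"}}, "status": "INSTALLED"},
        {"library": {"jar": "dbfs:/libs/my-lib.jar"}, "status": "INSTALLED"},
    ],
}

# Re-use each library definition verbatim in the uninstall body.
uninstall_payload = {
    "cluster_id": status_response["cluster_id"],
    "libraries": [s["library"] for s in status_response["library_statuses"]],
}
# POST uninstall_payload to https://<workspace-url>/api/2.0/libraries/uninstall,
# then restart the cluster for the uninstall to apply.
```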

User16869510359
by Esteemed Contributor
  • 1750 Views
  • 1 reply
  • 0 kudos

Resolved! I can't find my cluster

I had a cluster that I used in the past. I do not see the cluster any longer. I checked with the admin and my team, and everyone confirmed that there was no user deletion.

Latest Reply
User16869510359
Esteemed Contributor
  • 0 kudos

If a cluster is unused for 30 days, Databricks removes it. This is a general clean-up policy. It's possible to exempt a cluster from this clean-up by pinning it. https://docs.databricks.com/clusters/clusters-manage.html#pin-a-c...
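Pinning can also be done through the REST API; a minimal sketch, where the cluster ID is a placeholder and the call requires admin privileges:

```python
# Pin a cluster so the 30-day clean-up skips it (admin only; ID is a placeholder).
pin_payload = {"cluster_id": "1226-232931-cuffs129"}
# POST pin_payload to https://<workspace-url>/api/2.0/clusters/pin;
# POST /api/2.0/clusters/unpin reverses it.
```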

User16869510359
by Esteemed Contributor
  • 1292 Views
  • 1 reply
  • 0 kudos

Resolved! How to restart the cluster with new instances?

Whenever I restart a Databricks cluster, new instances are not launched. This is because Databricks re-uses the instances. However, sometimes new instances are needed, for example to mitigate a bad VM issue or to get a patch fr...

Latest Reply
User16869510359
Esteemed Contributor
  • 0 kudos

Currently, there is no direct option to restart the cluster with new instances. An easy workaround to ensure new instances are launched is to add cluster tags to your cluster. This ensures that new instances have to be acquired, as it's not possible to ...
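The tag trick can be sketched as a `clusters/edit` call: changing a custom tag forces fresh instances on the next restart, because tags are applied at the cloud-provider level and cannot change on a running instance. Note that `clusters/edit` requires the full cluster spec; every value below is a placeholder.

```python
import time

# Edit the cluster with a changed custom tag to force new instances on restart.
# All IDs and sizes are placeholders.
edit_payload = {
    "cluster_id": "1226-232931-cuffs129",
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "custom_tags": {"restart_nonce": str(int(time.time()))},  # changes each edit
}
# POST edit_payload to https://<workspace-url>/api/2.0/clusters/edit,
# then restart the cluster.
```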

User16869510359
by Esteemed Contributor
  • 1022 Views
  • 1 reply
  • 1 kudos

Resolved! Cluster logs missing

On the Databricks cluster UI, when I click on the Driver logs, sometimes I see historic logs and sometimes I see logs only for the last few hours. Why do we see this inconsistency?

Latest Reply
User16869510359
Esteemed Contributor
  • 1 kudos

This is working as designed and is the expected behavior. When the cluster is in a terminated state, the logs are served by the Spark History Server hosted on the Databricks control plane. When the cluster is up and running, the logs are served by ...

User16869510359
by Esteemed Contributor
  • 1053 Views
  • 1 reply
  • 0 kudos

Resolved! I do not have any Spark jobs running, but my cluster is not getting auto-terminated.

The cluster is idle and there are no Spark jobs running on the Spark UI. Still, I see that my cluster is active and not getting terminated.

Latest Reply
User16869510359
Esteemed Contributor
  • 0 kudos

A Databricks cluster is treated as active if there are any Spark or non-Spark operations running on it. Even though there are no Spark jobs running on the cluster, it's possible to have some driver-specific application code running, marking th...

User16826994223
by Honored Contributor III
  • 2678 Views
  • 1 reply
  • 1 kudos

Resolved! cluster start Issues

Some of the jobs are failing in prod with the below error message. Can you please check and let us know the reason for this? These are running under a pool cluster. Run result unavailable: job failed with error message: Unexpected failure while waiting for the...

Latest Reply
Mooune_DBU
Valued Contributor
  • 1 kudos

@Kunal Gaurav​, this status code only occurs in one of two conditions: we're able to request the instances for the cluster but can't bootstrap them in time, or we set up the containers on each instance but can't start the containers in time. This is an edg...

User16790091296
by Contributor II
  • 991 Views
  • 2 replies
  • 0 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Generally it is limited by the cloud provider. Initially you get around 350 cores, which can be increased by request to the cloud vendor. So far I have seen 1000 cores, and it can go much higher. In addition to subscription limits, the total capacity of cluster...

1 More Replies
User15787040559
by New Contributor III
  • 1346 Views
  • 1 reply
  • 0 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

In addition to subscription limits, the total capacity of clusters in each workspace is a function of the masks used for the workspace's enclosing Vnet and the pair of subnets associated with each cluster in the workspace. The masks can be changed if...
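The subnet-mask limit can be turned into back-of-the-envelope capacity math: each cluster node consumes one IP address in each of the workspace's two subnets, and Azure reserves 5 addresses per subnet. The mask sizes below are illustrative.

```python
# Usable addresses in a subnet with the given prefix length, assuming the
# cloud provider reserves 5 addresses per subnet (Azure's behavior).
def usable_hosts(mask_bits: int, reserved: int = 5) -> int:
    return 2 ** (32 - mask_bits) - reserved

# e.g. a pair of /26 subnets caps the workspace at 59 nodes,
# while /18 subnets allow 16379.
```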

User16826994223
by Honored Contributor III
  • 666 Views
  • 1 reply
  • 0 kudos

Z ordering best practices

What are the best practices around Z-ordering? Should we include as many columns as possible in the Z-order, or is fewer better, and why?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

With Z-order and Hilbert curves, the effectiveness of clustering decreases with each column added, so you'd want to Z-order only on the columns that you'd actually use, so that it speeds up your workloads.
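A minimal sketch of putting this into practice: build the Delta Lake `OPTIMIZE ... ZORDER BY` statement over a small number of frequently filtered columns. The table and column names are hypothetical.

```python
# Build an OPTIMIZE ... ZORDER BY statement; keep the column list short,
# since clustering effectiveness drops with each added column.
def zorder_sql(table: str, columns: list[str]) -> str:
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(columns)})"

stmt = zorder_sql("events", ["event_date", "user_id"])
# spark.sql(stmt)  # run on a cluster with Delta Lake
```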
