Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

elikvar
by New Contributor III
  • 18824 Views
  • 9 replies
  • 9 kudos

Cluster occasionally fails to launch

I have a notebook that runs daily and occasionally fails with the error: "Run result unavailable: job failed with error message Unexpected failure while waiting for the cluster Some((xxxxxxxxxxxxxxx) )to be readySome(: Cluster xxxxxxxxxxxxxxxx is in un...

Latest Reply
Pavan578
New Contributor II
  • 9 kudos

Cluster 'xxxxxxx' was terminated. Reason: WORKER_SETUP_FAILURE (SERVICE_FAULT). Parameters: databricks_error_message: DBFS Daemon is not reachable., gcp_error_message: Unable to reach the colocated DBFS Daemon. Can anyone help me with how we can resolve thi...

8 More Replies
Jana
by New Contributor III
  • 7613 Views
  • 9 replies
  • 4 kudos

Resolved! Parsing 5 GB json file is running long on cluster

I am creating a Delta table from a JSON input file in ADLS, but the job runs for a long time while creating the table. Below is my cluster configuration. Is the issue related to the cluster config? Do I need to upgrade the cluster config? The cluster ...

Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

With multiline = true, the JSON is read as a whole and processed as such. I'd try a beefier cluster.
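To illustrate the point about multiline: a multi-line JSON document has to be parsed as one unit, whereas newline-delimited JSON (one record per line) can be split into chunks and read in parallel. Below is a minimal sketch, assuming the input is a single top-level JSON array (the thread preview does not show the file's layout), that rewrites it as JSON Lines with the standard library only. The function name `json_array_to_lines` is hypothetical, and this in-memory version would not suit a 5 GB file as-is; a streaming parser would be needed at that size.

```python
import io
import json


def json_array_to_lines(src, dst):
    """Rewrite a top-level JSON array as newline-delimited JSON.

    `src` and `dst` are text file objects. Returns the number of records written.
    """
    records = json.load(src)  # loads the whole document into memory
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    for record in records:
        # One compact JSON object per line
        dst.write(json.dumps(record))
        dst.write("\n")
    return len(records)


# Small in-memory demonstration with made-up records
source = io.StringIO('[\n  {"id": 1, "name": "a"},\n  {"id": 2, "name": "b"}\n]')
target = io.StringIO()
count = json_array_to_lines(source, target)
```

A file in this one-record-per-line layout can be read without the multiline option.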

8 More Replies
Braxx
by Contributor II
  • 6603 Views
  • 4 replies
  • 3 kudos

Resolved! cluster creation - access mode option

I am a bit lazy and trying to manually recreate a cluster from one workspace in another one. The cluster was created some time ago. Looking at the configuration, the access mode field is "custom". When trying to create a new cluster, I do not...

Latest Reply
khushboo20
New Contributor II
  • 3 kudos

Hi all, I am new to Databricks and trying to create my first workflow. For some reason, the cluster created is of type "custom". I have not mentioned it anywhere in my asset bundle. Because of this, I cannot get the Unity Catalog feature. Could ...

3 More Replies
ckwan48
by New Contributor III
  • 16517 Views
  • 6 replies
  • 3 kudos

Resolved! How to prevent my cluster to shut down after inactivity

Currently, I am running a cluster that is set to terminate after 60 minutes of inactivity. However, in one of my notebooks, one of the cells is still running. How can I prevent this from happening, if I want my notebook to run overnight without monito...

Latest Reply
AmanSehgal
Honored Contributor III
  • 3 kudos

If a cell is already running (I assume it's a streaming operation), then I don't think the cluster counts as inactive. The cluster should keep running while a cell is running on it. On the other hand, if you want to keep running your clusters for ...

5 More Replies
User16826987838
by Contributor
  • 3069 Views
  • 2 replies
  • 0 kudos
Latest Reply
VivekChandran
New Contributor II
  • 0 kudos

Yes! Cluster's owner/creator can be changed with the REST API - POST /api/2.1/clusters/change-owner
Request Body sample: { "cluster_id": "string", "owner_username": "string" }
Ref: Clusters API | Change cluster owner
Hope this helps!
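A minimal sketch of how that call could be assembled from Python, using only the standard library. The endpoint path and the two body fields come from the reply above. Everything else is an assumption for illustration: the helper name `build_change_owner_request`, bearer-token authentication, and the placeholder host, token, cluster ID, and username. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_change_owner_request(host, token, cluster_id, owner_username):
    """Build (but do not send) the POST request that changes a cluster's owner."""
    url = host.rstrip("/") + "/api/2.1/clusters/change-owner"
    body = json.dumps(
        {"cluster_id": cluster_id, "owner_username": owner_username}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: the workspace accepts a bearer token
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )


# Placeholder values, all hypothetical
request = build_change_owner_request(
    host="https://example.cloud.databricks.com",
    token="<personal-access-token>",
    cluster_id="<cluster-id>",
    owner_username="new.owner@example.com",
)
# To actually send it: urllib.request.urlopen(request)
```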

1 More Replies
ptambe
by New Contributor III
  • 4712 Views
  • 6 replies
  • 3 kudos

Resolved! Is Concurrent Writes from multiple databricks clusters to same delta table on S3 Supported?

Does Databricks support writing to the same Delta table from multiple clusters concurrently? I am specifically interested to know if there is any solution for https://github.com/delta-io/delta/issues/41 implemented in Databricks OR if you have a...

Latest Reply
dennyglee
Databricks Employee
  • 3 kudos

Please note, the issue noted above [Storage System] Support for AWS S3 (multiple clusters/drivers/JVMs) is for Delta Lake OSS. As noted in this issue as well as Issue 324, as of this writing, S3 lacks putIfAbsent transactional consistency. For Del...

5 More Replies
gazzyjuruj
by Contributor II
  • 10082 Views
  • 5 replies
  • 10 kudos

Cluster start is currently disabled ?

Hi, I'm trying to run the notebooks but nothing happens. I had to create a cluster in order to start my code. Pressing the play button inside the notebook does nothing at all, and under 'Compute', pressing play on the clusters gives the e...

Latest Reply
mrp12
New Contributor II
  • 10 kudos

This is a very common issue I see with Community Edition. I suppose the only workaround is to create a new cluster each time. More info on Stack Overflow: https://stackoverflow.com/questions/69072694/databricks-community-edition-cluster-wont-start

4 More Replies
gaurav_khanna
by New Contributor II
  • 5082 Views
  • 4 replies
  • 3 kudos
Latest Reply
BartRJD
New Contributor II
  • 3 kudos

I am having the same issue (Azure Databricks). I have a running compute cluster analytics-compute-cluster running in Single User access mode. The Event Log for the cluster says the cluster is running and the "Driver is healthy". I have Manage permissi...

3 More Replies
AyushModi038
by New Contributor III
  • 6586 Views
  • 7 replies
  • 7 kudos

Library installation in cluster taking a long time

I am trying to install the "pycaret" library on a cluster using a whl file, but it sometimes creates a dependency conflict (not always; sometimes it works). My questions are: 1 - How to install libraries in cluster only single time (Maybe from ...

Latest Reply
Spencer_Kent
New Contributor III
  • 7 kudos

@Retired_mod What about question #1, which is what subsequent comments to this thread have been referring to? To recap the question: is it possible for "cluster-installed" libraries to be cached in such a way that they aren't completely reinstalled ev...

6 More Replies
Spencer_Kent
by New Contributor III
  • 12358 Views
  • 10 replies
  • 3 kudos

Shared cluster configuration that permits `dbutils.fs` commands

My workspace has a couple different types of clusters, and I'm having issues using the `dbutils` filesystem utilities when connected to a shared cluster. I'm hoping you can help me fix the configuration of the shared cluster so that I can actually us...

Latest Reply
jacovangelder
Honored Contributor
  • 3 kudos

Can you not use a No Isolation Shared cluster with Table access controls enabled on workspace level? 

9 More Replies
daindana
by New Contributor III
  • 5357 Views
  • 8 replies
  • 3 kudos

Resolved! How to preserve my database when the cluster is terminated?

Whenever my cluster is terminated, I lose my whole database (I'm not sure if it's related, but I created those databases in Delta format). And since the cluster is terminated after 2 hours of not being used, I wake up with no database every morning. I don't wa...

Latest Reply
dhpaulino
New Contributor II
  • 3 kudos

As the files are still in DBFS, you can just recreate the references to your tables and continue working, with something like this:
db_name = "mydb"
from pathlib import Path
path_db = f"dbfs:/user/hive/warehouse/{db_name}.db/"
tables_dirs = dbutils.fs....
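The reply's snippet is cut off in this preview, so the following is not the poster's code. It is a rough sketch of the same idea under stated assumptions: each table lives in a directory named after it under the database path, the tables are in Delta format (as the original question says), and the database itself already exists in the metastore. The helper name `build_create_table_statements` and the table names are hypothetical. It only builds SQL strings, so it runs anywhere.

```python
def build_create_table_statements(
    db_name, table_names, warehouse_root="dbfs:/user/hive/warehouse"
):
    """Return one CREATE TABLE statement per table directory.

    Each statement registers an existing Delta directory as a table
    without moving or rewriting any data files.
    """
    statements = []
    for table in sorted(table_names):
        location = f"{warehouse_root}/{db_name}.db/{table}"
        statements.append(
            f"CREATE TABLE IF NOT EXISTS {db_name}.{table} "
            f"USING DELTA LOCATION '{location}'"
        )
    return statements


# Hypothetical table names; on a cluster these would come from listing the
# database directory
statements = build_create_table_statements("mydb", ["orders", "customers"])
```

On a cluster, each resulting statement could then be passed to `spark.sql(...)`.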

7 More Replies
TheDataDexter
by New Contributor III
  • 3960 Views
  • 3 replies
  • 3 kudos

Resolved! Single-Node cluster works but Multi-Node clusters do not read data.

I am currently working with a VNET-injected Databricks workspace. At the moment I have mounted the Databricks cluster on an ADLS G2 resource. When running notebooks on a single node that read, transform, and write data we do not encounter any probl...

Latest Reply
ellafj
New Contributor II
  • 3 kudos

@TheDataDexter Did you find a solution to your problem? I am facing the same issue

2 More Replies
bamhn
by New Contributor II
  • 5248 Views
  • 3 replies
  • 2 kudos

My cluster can't access any tables in data catalogs

My goal is to have table access control in the data science and engineering workspace. So I enabled access control to my cluster using this config "spark.databricks.acl.dfAclsEnabled": "true" and my cluster is shown as Table ACLs enabled now (shield ...

Latest Reply
Karthik_Venu
New Contributor II
  • 2 kudos

Here is my use case: https://community.databricks.com/t5/data-engineering/structured-streaming-using-delta-as-source-and-delta-as-sink-and/td-p/67825 And I get this error: "py4j.security.Py4JSecurityException: Method public org.apache.spark.sql.Datase...

2 More Replies
Jon
by New Contributor II
  • 3495 Views
  • 4 replies
  • 5 kudos

IP address fix

How can I fix the IP address of my Azure Cluster so that I can whitelist the IP address to run my job daily on my python notebook? Or can I find out the IP address to perform whitelisting? Thanks

Latest Reply
-werners-
Esteemed Contributor III
  • 5 kudos

Depends on the scenario.  You could expose a single ip address to the external internet, but databricks itself will always use many addresses.

3 More Replies
pokus
by New Contributor III
  • 7038 Views
  • 2 replies
  • 2 kudos

Resolved! use DeltaLog class in databricks cluster

I need to use DeltaLog class in the code to get the AddFiles dataset. I have to keep the implemented code in a repo and run it in databricks cluster. Some docs say to use org.apache.spark.sql.delta.DeltaLog class, but it seems databricks gets rid of ...

Latest Reply
dbal
New Contributor III
  • 2 kudos

Thanks for providing a solution, @pokus. What I don't understand is why Databricks cannot provide the DeltaLog at runtime. How can this be the official solution? We need a better solution for this instead of depending on reflection.

1 More Replies