Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I am trying to install the "pycaret" library on a cluster using a whl file, but it sometimes creates a dependency conflict (not always; sometimes it works too). My questions are: 1 - How to install libraries on a cluster only a single time (Maybe from ...
@Retired_mod What about question #1, which is what subsequent comments to this thread have been referring to? To recap the question: is it possible for "cluster-installed" libraries to be cached in such a way that they aren't completely reinstalled ev...
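For what it's worth, cluster-scoped libraries only need to be attached once; Databricks then reinstalls them automatically on every cluster (re)start rather than caching the installed environment. A minimal sketch of attaching a wheel through the Libraries API (the workspace URL, token, cluster ID, and wheel path below are placeholders, and the pycaret wheel filename is only illustrative):

import requests

host = "https://<your-workspace>.cloud.databricks.com"
token = "<personal-access-token>"

# Attach the wheel as a cluster library; it is reinstalled whenever the cluster restarts.
resp = requests.post(
    f"{host}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "cluster_id": "0123-456789-abcde123",
        "libraries": [{"whl": "dbfs:/FileStore/wheels/pycaret-3.3.2-py3-none-any.whl"}],
    },
)
resp.raise_for_status()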
Hello all, Background: I am having an issue today with Databricks using pyspark-sql and writing a Delta table. The dataframe is made by doing an inner join between two tables, and that joined result is what I am trying to write to a Delta table. The table ...
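For context, the pattern being described is roughly the following (a sketch only; the table names, join key, and write mode are placeholders, not the poster's actual schema):

# Inner join two source tables and persist the result as a Delta table.
left = spark.table("source_db.orders")
right = spark.table("source_db.customers")

joined = left.join(right, on="customer_id", how="inner")

(joined.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("target_db.orders_enriched"))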
I have a daily running notebook that occasionally fails with the error:"Run result unavailable: job failed with error message Unexpected failure while waiting for the cluster Some((xxxxxxxxxxxxxxx) )to be readySome(: Cluster xxxxxxxxxxxxxxxx is in un...
Cluster 'xxxxxxx' was terminated. Reason: WORKER_SETUP_FAILURE (SERVICE_FAULT). Parameters: databricks_error_message:DBFS Daemomn is not reachable., gcp_error_message:Unable to reach the colocated DBFS Daemon. Can anyone help me with how we can resolve this...
I was creating a Delta table from an ADLS JSON input file, but the job was running for a long time while creating the Delta table from the JSON. Below is my cluster configuration. Is the issue related to the cluster config? Do I need to upgrade the cluster config? The cluster ...
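Before upgrading the cluster, it is worth checking whether schema inference is the bottleneck: by default spark.read.json makes an extra pass over the input just to infer the schema. A hedged sketch with an explicit schema (the ADLS path, fields, and target table are placeholders):

from pyspark.sql.types import StructType, StructField, StringType, LongType, TimestampType

# Declaring the schema up front avoids the inference pass over all the JSON files.
schema = StructType([
    StructField("id", LongType()),
    StructField("name", StringType()),
    StructField("event_time", TimestampType()),
])

df = (spark.read
    .schema(schema)
    .json("abfss://<container>@<storage-account>.dfs.core.windows.net/input/"))

df.write.format("delta").mode("overwrite").saveAsTable("bronze.events")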
I am a bit lazy and am trying to manually recreate a cluster from one workspace in another one. The cluster was created some time ago. Looking at the configuration, the access mode field is "custom". When trying to create a new cluster, I do not...
Hi All - I am new to Databricks and trying to create my first workflow. For some reason, the cluster created is of type "custom". I have not mentioned that anywhere in my asset bundle. Due to this, I cannot get the Unity Catalog features. Could ...
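In my experience the "custom" type usually appears when the cluster spec has no explicit access mode (data_security_mode), which Unity Catalog needs. A rough sketch of creating a cluster with an explicit access mode via the Clusters API - the workspace URL, token, node type, runtime version, and the SINGLE_USER choice are all assumptions to adapt; in an asset bundle the same data_security_mode field would go in the job cluster's new_cluster block:

import requests

host = "https://<your-workspace>.azuredatabricks.net"
token = "<personal-access-token>"

resp = requests.post(
    f"{host}/api/2.1/clusters/create",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "cluster_name": "uc-enabled-cluster",
        "spark_version": "14.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 2,
        # An explicit access mode keeps the cluster from showing up as "custom"
        # and is required for Unity Catalog (SINGLE_USER or USER_ISOLATION).
        "data_security_mode": "SINGLE_USER",
    },
)
resp.raise_for_status()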
Currently, I am running a cluster that is set to terminate after 60 minutes of inactivity. However, in one of my notebooks, one of the cells is still running. How can I prevent this from happening if I want my notebook to run overnight without monito...
If a cell is already running (I assume it's a streaming operation), then I don't think that means the cluster is inactive; the cluster should stay running while a cell is running on it. On the other hand, if you want to keep running your clusters for ...
Yes! A cluster's owner/creator can be changed with the REST API - POST /api/2.1/clusters/change-owner
Request body sample:
{
  "cluster_id": "string",
  "owner_username": "string"
}
Ref: Clusters API | Change cluster owner
Hope this helps!
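For completeness, a minimal sketch of calling that endpoint from a notebook or script (the workspace URL, token, cluster ID, and username are placeholders to replace):

import requests

host = "https://<your-workspace>.cloud.databricks.com"
token = "<personal-access-token>"  # needs permission to manage the cluster

resp = requests.post(
    f"{host}/api/2.1/clusters/change-owner",
    headers={"Authorization": f"Bearer {token}"},
    json={"cluster_id": "0123-456789-abcde123", "owner_username": "new.owner@example.com"},
)
resp.raise_for_status()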
Does Databricks have support for writing to the same Delta table from multiple clusters concurrently? I am specifically interested to know if there is any solution for https://github.com/delta-io/delta/issues/41 implemented in Databricks, OR if you have a...
Please note that the issue referenced above, [Storage System] Support for AWS S3 (multiple clusters/drivers/JVMs), is for Delta Lake OSS. As noted in that issue as well as in Issue 324, as of this writing, S3 lacks putIfAbsent transactional consistency. For Del...
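Whichever storage layer handles the commit, a writer that loses a commit race sees a concurrency exception it can simply retry. A rough sketch of that pattern, assuming the delta-spark Python bindings (and their delta.exceptions classes) are available on the cluster:

import time
# Assumption: these exception classes come from the delta-spark Python package.
from delta.exceptions import ConcurrentAppendException, ConcurrentWriteException

def append_with_retry(df, table_name, max_retries=5):
    """Append to a Delta table, retrying when a concurrent writer wins the commit race."""
    for attempt in range(max_retries):
        try:
            df.write.format("delta").mode("append").saveAsTable(table_name)
            return
        except (ConcurrentAppendException, ConcurrentWriteException):
            time.sleep(2 ** attempt)  # back off, then retry the whole write
    raise RuntimeError(f"Could not append to {table_name} after {max_retries} attempts")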
Hi, I'm trying to run the notebooks but nothing happens. I had to create a cluster in order to start my code. Pressing the play button inside the notebook does nothing at all, and in 'Compute', pressing play there on the clusters gives the e...
This is a very common issue I see with Community Edition. I suppose the only workaround is to create a new cluster each time. More info on Stack Overflow: https://stackoverflow.com/questions/69072694/databricks-community-edition-cluster-wont-start
I am having the same issue (Azure Databricks). I have a compute cluster analytics-compute-cluster running in Single User access mode. The Event Log for the cluster says the cluster is running and the "Driver is healthy". I have Manage permissi...
My workspace has a couple different types of clusters, and I'm having issues using the `dbutils` filesystem utilities when connected to a shared cluster. I'm hoping you can help me fix the configuration of the shared cluster so that I can actually us...
Whenever my cluster is terminated, I lose my whole database (I'm not sure if it's related, but I made those databases in Delta format). And since the cluster is terminated after 2 hours of not being used, I wake up with no database every morning. I don't wa...
As the files are still in DBFS, you can just recreate the references to your tables and continue the work, with something like this:
db_name = "mydb"
from pathlib import Path
path_db = f"dbfs:/user/hive/warehouse/{db_name}.db/"
tables_dirs = dbutils.fs....
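A self-contained sketch of how that idea can be completed - it restates the setup lines above and assumes each subdirectory under the database path is a Delta table directory; the loop and CREATE TABLE statement are illustrative, not the original reply's exact code:

db_name = "mydb"
path_db = f"dbfs:/user/hive/warehouse/{db_name}.db/"

# Recreate the database, then register every Delta directory found under it as a table.
spark.sql(f"CREATE DATABASE IF NOT EXISTS {db_name}")

for entry in dbutils.fs.ls(path_db):
    table_name = entry.name.rstrip("/")
    spark.sql(
        f"CREATE TABLE IF NOT EXISTS {db_name}.{table_name} "
        f"USING DELTA LOCATION '{entry.path}'"
    )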
I am currently working with a VNET-injected Databricks workspace. At the moment I have mounted an ADLS Gen2 resource on the Databricks cluster. When running notebooks on a single node that read, transform, and write data, we do not encounter any probl...
My goal is to have table access control in the Data Science & Engineering workspace. So I enabled access control on my cluster using the config "spark.databricks.acl.dfAclsEnabled": "true", and my cluster is now shown as Table ACLs enabled (shield ...
Here is my use case: https://community.databricks.com/t5/data-engineering/structured-streaming-using-delta-as-source-and-delta-as-sink-and/td-p/67825 And I get this error: "py4j.security.Py4JSecurityException: Method public org.apache.spark.sql.Datase...
How can I fix the IP address of my Azure cluster so that I can whitelist it to run my job daily from my Python notebook? Or can I find out the IP address to perform the whitelisting? Thanks
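If the immediate goal is just to discover which outbound IP the cluster currently uses (note that without a stable egress setup such as a NAT gateway or fixed public IP, this address is not guaranteed to stay the same), one quick check from a notebook cell:

import requests

# Prints the public IP that outbound requests from the driver currently appear to come from.
print(requests.get("https://api.ipify.org", timeout=10).text)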