Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

Ganeshch (New Contributor III)
  • 1538 Views
  • 3 replies
  • 0 kudos

No option to create cluster

I don't see any option to create a cluster under Compute in Community Edition. Is it disabled? How do I create a cluster? Please help me.

Latest Reply
Ganeshch
New Contributor III
  • 0 kudos

If I create a notebook and run it, a cluster will not be created explicitly, but one will run in the backend. Am I right?

2 More Replies
ghilage (New Contributor III)
  • 1344 Views
  • 4 replies
  • 0 kudos

Not able to write to dbfs from workflow

Hi all, I am facing the issue below while writing to DBFS. I have PySpark code in which I am writing a DataFrame to DBFS using the code below:
dbfs_path.mkdir(parents=True, exist_ok=True)
my_df.write.format("parquet").mode("overwrite").save(f"{dbfs_path}/my_d...

Latest Reply
ghilage
New Contributor III
  • 0 kudos

Looks like some problem within my DataFrame itself. If I skip some of the expensive field calculations, then it is able to write to DBFS.

3 More Replies
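
A minimal sketch of the write pattern under discussion, assuming my_df from the question and a placeholder output path: Spark creates the target directory itself, so the mkdir call is not needed, and forcing evaluation first helps isolate expensive column calculations from the write step.

    out_path = "dbfs:/tmp/my_data"   # placeholder path

    # Force evaluation first; if this fails, the problem is in the column
    # calculations rather than in writing to DBFS.
    my_df.cache()
    my_df.count()

    # Spark creates the target directory on write; no mkdir is needed.
    my_df.write.format("parquet").mode("overwrite").save(out_path)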
Davila (New Contributor II)
  • 1669 Views
  • 2 replies
  • 2 kudos

Resolved! Asset Bundle Validation Not Completing – Stuck on files_to_sync

I have a Databricks asset bundle with the following structure:
bundle:
  name: <some value here>
  uuid: <some value here>
include:
  - resources/*.yml
variables:
  catalog_bronze: {}
  catalog_silver: {}
  user_name: {}
targets:
  dev:
    mode: ...

Latest Reply
Renu_
Valued Contributor II
  • 2 kudos

Hi @Davila, Validation can be slow if your bundle root includes a large number of files. However, since your bundle contains only a few files, the delay may be due to the root_path pointing to a broader directory structure in the Databricks workspace...

1 More Replies
cloudengineer (New Contributor)
  • 1016 Views
  • 2 replies
  • 0 kudos
Latest Reply
MadhuB
Valued Contributor
  • 0 kudos

@cloudengineer By default, workspace admins can create interactive clusters. Non-admin users should be granted access either to a compute policy or to existing clusters. If there is a requirement to enable cluster crea...

1 More Replies
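
For reference, a hedged sketch of one route for the last point in the reply: granting a specific user the cluster-create entitlement through the workspace SCIM API. The host, token, and user ID below are placeholders.

    import requests

    HOST = "https://<workspace>.cloud.databricks.com"   # placeholder
    TOKEN = "<personal-access-token>"                   # placeholder
    USER_ID = "<scim-user-id>"                          # placeholder

    # Add the allow-cluster-create entitlement to the user.
    resp = requests.patch(
        f"{HOST}/api/2.0/preview/scim/v2/Users/{USER_ID}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={
            "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
            "Operations": [{
                "op": "add",
                "path": "entitlements",
                "value": [{"value": "allow-cluster-create"}],
            }],
        },
    )
    resp.raise_for_status()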
ashraf1395 (Honored Contributor)
  • 2540 Views
  • 6 replies
  • 2 kudos

Empty Streaming tables in dlt

I want to create empty streaming tables in DLT with only the schema specified. Is it possible? I want to do it in DLT Python.

Latest Reply
brunoillipronti
New Contributor II
  • 2 kudos

I confirm that ashraf1395's solution works. All approaches I tried for creating an empty table created a materialized view (which you can't merge into). It's disappointing, though, since there is no quick param in a "create_table" command to create a simple...

5 More Replies
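
A minimal sketch of declaring an empty streaming table in DLT Python with only a schema, along the lines of the accepted approach; the table and column names are placeholders, and rows can be added later (for example via apply_changes or an append flow).

    import dlt

    # Declare a streaming table with a schema but no populating query.
    dlt.create_streaming_table(
        name="my_empty_streaming_table",
        schema="id BIGINT, payload STRING, ingested_at TIMESTAMP",
    )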
Zachary_Higgins (Contributor)
  • 15298 Views
  • 9 replies
  • 13 kudos

'ignoreDeletes' option with Delta Live Table streaming source

We have a Delta streaming source in our Delta Live Table pipelines that may have data deleted from time to time. The error message is pretty self-explanatory: ...from streaming source at version 191. This is currently not supported. If you'd like to i...

Latest Reply
IanB_Argento
New Contributor II
  • 13 kudos

I had this same issue whilst doing some POC work. I was able to overcome it as follows:
  • Navigate to Workflows | Jobs & pipelines.
  • Select your pipeline.
  • Click the drop-down next to the Start button.
  • Choose "Full refresh all".
That resets it all and fixes t...

8 More Replies
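
Where a full refresh is too heavy-handed, the error message's own suggestion can also be applied directly; a minimal sketch with a placeholder source path, where ignoreDeletes makes the stream skip commits that only delete data instead of failing:

    df = (
        spark.readStream.format("delta")
        .option("ignoreDeletes", "true")   # skip delete-only commits
        .load("dbfs:/path/to/source_table")
    )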
Pavankumar7 (New Contributor III)
  • 3304 Views
  • 6 replies
  • 4 kudos

Resolved! Error in connecting serverless compute in free edition

I am unable to connect to serverless compute under the Free Edition of Databricks. Also, in the Compute tab I can see only three tabs (SQL warehouses, Vector search, Apps) and am not able to create new compute the way we used to in Community Edition.

(screenshots attached)
Latest Reply
Thomas_W
New Contributor III
  • 4 kudos

@Pavankumar7 - are you experiencing this issue for existing/imported notebooks, or for brand-new notebooks too? If it's the former, the notebook may be using an old serverless environment version. When Databricks updates the serverless environment, ex...

5 More Replies
pacman (New Contributor)
  • 17807 Views
  • 7 replies
  • 0 kudos

How to run a saved query from a Notebook (PySpark)

Hi Team! Noob to Databricks, so apologies if I ask a dumb question. I have created a relatively large series of queries that fetch and organize the data I want. I'm ready to drive all of these from a Notebook (likely PySpark). An example query is save...

Latest Reply
aethorimn_cgr
New Contributor II
  • 0 kudos

@uday_satapathy Hi Uday. Do you know if this method works for multiple users? I might need to share the script so a teammate can use it.

6 More Replies
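
A minimal sketch of the simplest pattern, assuming the saved query's SQL has been copied into the notebook; the query below uses Databricks sample data as a stand-in for your own.

    query_text = """
        SELECT *
        FROM samples.nyctaxi.trips
        LIMIT 10
    """
    df = spark.sql(query_text)   # returns a DataFrame for downstream use
    display(df)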
Pratikmsbsvm (Contributor)
  • 1893 Views
  • 2 replies
  • 2 kudos

Resolved! Data Lakehouse architecture with Azure Databricks and Unity Catalog

I am creating a data lakehouse solution on Azure Databricks. Source: SAP, Salesforce, Adobe. Target: Hightouch (external application), Mad Mobile (external application). The data lakehouse also has transactional records which should be stored in ACID pro...

Latest Reply
KaranamS
Contributor III
  • 2 kudos

Hi @Pratikmsbsvm, from what I understand, you have a lakehouse on Azure Databricks and would like to share this data with another Databricks account or workspace. If Unity Catalog is enabled on your Azure Databricks account, you can leverage Delta S...

1 More Replies
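
A hedged sketch of the Delta Sharing route mentioned in the reply, run as SQL from a notebook; the share, table, and recipient names are all placeholders.

    # Create a share, add a Unity Catalog table to it, and grant a recipient.
    spark.sql("CREATE SHARE IF NOT EXISTS my_share")
    spark.sql("ALTER SHARE my_share ADD TABLE main.default.orders")
    spark.sql("GRANT SELECT ON SHARE my_share TO RECIPIENT my_recipient")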
data_learner1 (New Contributor II)
  • 1320 Views
  • 4 replies
  • 1 kudos

Need to track schema changes/column renames/column drops in Databricks Unity Catalog

Hi Team, we are getting data from a third-party vendor into the Databricks Unity Catalog. They make schema changes frequently and we would like to track that. Just wanted to know if I can do this using the audit table in the system catalog. As we only h...

Latest Reply
CURIOUS_DE
Valued Contributor
  • 1 kudos

@data_learner1 Unity Catalog logs all data access and metadata operations (including schema changes) into the audit logs, which are stored in system catalog tables such as system.access.audit. You mentioned you only have read access, and likely...

3 More Replies
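
A hedged sketch of querying the audit system table for schema changes; the action names below are assumptions and may differ by workspace, so it is worth listing distinct action_name values first.

    changes = spark.sql("""
        SELECT event_time, user_identity.email AS actor,
               action_name, request_params
        FROM system.access.audit
        WHERE service_name = 'unityCatalog'
          AND action_name IN ('createTable', 'updateTable', 'deleteTable')
        ORDER BY event_time DESC
        LIMIT 100
    """)
    display(changes)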
NikosLoutas (Databricks Partner)
  • 2678 Views
  • 2 replies
  • 0 kudos

Resolved! Databricks Full Refresh of DLT Pipeline

Hello, I have a question regarding the full refresh of a DLT pipeline where the data source is an external table. When running the pipeline without a full refresh, the streaming will pull data which are currently present in the external source ...

Latest Reply
seeyesbee
New Contributor II
  • 0 kudos

Hi @paolajara, in your point 5 you mentioned using Delta Lake for tracking changes. Could you point me to any official docs or examples that walk through enabling CDC / row tracking on a Delta table? I pull data from SharePoint via its REST endpoint,...

1 More Replies
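
On the change-tracking question in the reply, a minimal sketch of enabling the change data feed on a Delta table and then reading row-level changes; the table name and starting version are placeholders.

    # Enable the change data feed on an existing table.
    spark.sql("""
        ALTER TABLE main.default.my_table
        SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
    """)

    # Read inserts, updates, and deletes recorded since version 1.
    cdf = (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        .option("startingVersion", 1)
        .table("main.default.my_table")
    )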
Pratikmsbsvm (Contributor)
  • 1743 Views
  • 2 replies
  • 0 kudos

How to build architecture for Batch as well Stream Data Pipeline in Databricks

Hello, I am planning to create a data lakehouse using Azure and Databricks. Earlier I planned to do it with Azure alone, but the use cases look complex. Can someone please help me with suggestions? Source systems: SAP, Salesforce, SAP CAR, Adobe Clickstream. Consume...

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @Pratikmsbsvm,
The appropriate approach would be:
  • Data ingestion: Ingest data from SAP, SAP CAR, and Salesforce using Azure Data Factory or third-party connectors. For near real-time updates, enable CDC-based ingestion.
  • Data lakehouse storage: Store a...

1 More Replies
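
For the ingestion step, a hedged sketch of Auto Loader covering both batch-style and streaming arrival patterns from a landing zone; all paths and table names are placeholders.

    raw = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "dbfs:/tmp/schemas/clickstream")
        .load("abfss://landing@<storage-account>.dfs.core.windows.net/clickstream/")
    )

    (raw.writeStream
        .option("checkpointLocation", "dbfs:/tmp/checkpoints/clickstream")
        .trigger(availableNow=True)   # drain new files, then stop (batch-like)
        .toTable("bronze.clickstream_raw"))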
guizsantos (New Contributor II)
  • 4269 Views
  • 3 replies
  • 3 kudos

Resolved! How to obtain a query profile programmatically?

Hi everyone! Does anyone know if there is a way to obtain the data used to create the graph shown in the "Query profile" section? In particular, I am interested in the rows produced by the intermediate query operations. I can see there is a "Download" ...

Latest Reply
artsheiko
Databricks Employee
  • 3 kudos

@guizsantos, the Query History list API provides metrics (see include_metrics), and an executed query's definition can be seen via the query history system table.

2 More Replies
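
A hedged sketch of calling the Query History list API with include_metrics, which returns per-query metrics such as row counts; host and token are placeholders, and field names may vary by API version.

    import requests

    HOST = "https://<workspace>.cloud.databricks.com"   # placeholder
    TOKEN = "<personal-access-token>"                   # placeholder

    resp = requests.get(
        f"{HOST}/api/2.0/sql/history/queries",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"include_metrics": "true", "max_results": 25},
    )
    resp.raise_for_status()
    for q in resp.json().get("res", []):
        print(q.get("query_id"), q.get("metrics", {}).get("rows_produced_count"))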
seefoods (Valued Contributor)
  • 1689 Views
  • 1 reply
  • 1 kudos

Resolved! python task

Hello guys, I have defined an asset bundle which has a rule to run a Python task. This task has some parameters, so how can I interact with them using argparse? Cordially,

Latest Reply
SP_6721
Honored Contributor II
  • 1 kudos

Hi @seefoods, in your asset bundle YAML, define the parameters using the named_parameters field, for example like this:
tasks:
  - task_key: python_task
    python_wheel_task:
      entry_point: main
      named_parameters:
        input_path: "/data/input...

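
On the argparse side of the question, a minimal sketch of a wheel entry point that picks up a named parameter from the bundle; the input_path name is a placeholder matching the YAML key above.

    import argparse

    def main():
        parser = argparse.ArgumentParser()
        # named_parameters arrive as --key value pairs on the command line
        parser.add_argument("--input_path", required=True)
        args = parser.parse_args()
        print(f"Reading from {args.input_path}")

    if __name__ == "__main__":
        main()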
mkwparth (Databricks Partner)
  • 1990 Views
  • 4 replies
  • 1 kudos

Spark Failed to start: Driver unresponsive

Hi everyone, I'm encountering an intermittent issue when launching a Databricks pipeline cluster. Error message: com.databricks.pipelines.common.errors.deployment.DeploymentException: Failed to launch pipeline cluster xxxx-xxxxxx-ofgxxxxx: Attempt to la...

Latest Reply
Gopichand_G
Databricks Partner
  • 1 kudos

I have personally witnessed these kinds of issues. As far as I have seen, these failures usually happen because the driver node is unavailable or unresponsive: you might have hit maximum CPU or memory usage, or maybe your cache utili...

3 More Replies