Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Gary_Irick
by Databricks Partner
  • 16150 Views
  • 10 replies
  • 10 kudos

Delta table partition directories when column mapping is enabled

I recently created a table on a cluster in Azure running Databricks Runtime 11.1. The table is partitioned by a "date" column. I enabled column mapping, like this: ALTER TABLE {schema}.{table_name} SET TBLPROPERTIES('delta.columnMapping.mode' = 'nam...
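For context, a minimal sketch of the DDL involved (table name is a placeholder; per the Delta Lake docs, enabling column mapping also requires upgrading the table protocol, which is shown here as an assumption rather than taken from the post):

```python
def column_mapping_ddl(table: str) -> str:
    """Build the DDL that enables column mapping by name on a Delta table.

    Column mapping requires a reader/writer protocol upgrade
    (minReaderVersion >= 2, minWriterVersion >= 5).
    """
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        "'delta.columnMapping.mode' = 'name', "
        "'delta.minReaderVersion' = '2', "
        "'delta.minWriterVersion' = '5')"
    )

# On Databricks this string would be executed via spark.sql(...)
print(column_mapping_ddl("my_schema.my_table"))
```

Note that after this change, partition directories on storage use physical column names (e.g. randomized identifiers) rather than the logical "date" name, which is the behaviour the post describes.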

Latest Reply
Narsikakunuri
New Contributor II
  • 10 kudos

Still the same behaviour when column mapping is enabled.

9 More Replies
Balazs
by New Contributor III
  • 13719 Views
  • 4 replies
  • 10 kudos

Unity Catalog Volume as Spark checkpoint location

Hi, I tried to set the Spark checkpoint location in a notebook to a folder in a Unity Catalog Volume, with the following command: sc.setCheckpointDir("/Volumes/catalog_name/schema_name/volume_name/folder_name"). Unfortunately, I receive the following err...

Latest Reply
aaonurdemir
New Contributor II
  • 10 kudos

Any progress on this? https://docs.databricks.com/aws/en/notebooks/source/graphframes-user-guide-py.html This is not working, both with checkpointing and with standard graph algorithms.

3 More Replies
Naveenkumar1811
by New Contributor III
  • 730 Views
  • 6 replies
  • 2 kudos

Reduce the Time for the First Spark Streaming Run Kick-off

Hi Team, currently I have a Silver Delta table (external) loading via streaming, and the Gold is loaded in batch. I need to make the Gold Delta table streaming as well. In my first run I can see the stream initialization process taking an hour or so, as my Silver ta...

Latest Reply
Naveenkumar1811
New Contributor III
  • 2 kudos

Yes, I understand about OPTIMIZE and VACUUM... but the Silver table is still very heavy; it is definitely going to take long. Any other suggestions for a prod scenario where we can perform this without data loss?

5 More Replies
ChrisLawford_n1
by Contributor II
  • 446 Views
  • 2 replies
  • 2 kudos

Resolved! Bug Report: SDP (DLT) with autoloader not passing through pipe delimiter/separator

I am noticing a difference between using Auto Loader in an interactive notebook vs using it in a Spark Declarative Pipeline (DLT pipeline). This issue seems very similar to this other unanswered post from a few years ago. Bug report: the delimit...
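For context, a hedged sketch of how a pipe delimiter is typically passed to Auto Loader in an interactive notebook (option values and the path are illustrative; whether a declarative pipeline forwards `sep` identically is exactly what this post questions):

```python
# Illustrative Auto Loader options for pipe-delimited CSV; in a notebook
# these are forwarded to the underlying CSV parser.
autoloader_options = {
    "cloudFiles.format": "csv",
    "sep": "|",        # the delimiter the post reports being dropped in DLT
    "header": "true",
}

def read_pipe_delimited(spark, path):
    # Apply each option to a cloudFiles streaming reader, then load the path.
    reader = spark.readStream.format("cloudFiles")
    for key, value in autoloader_options.items():
        reader = reader.option(key, value)
    return reader.load(path)
```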

Latest Reply
ChrisLawford_n1
Contributor II
  • 2 kudos

Hey, okay, thanks @nikhilj0421. I have now solved the issue, but not with a full refresh of the table. I had tried this previously and even deleted the DLT pipeline, hoping that would provide me a clean slate if this lingering schema was an issue, but w...

1 More Replies
matmad
by New Contributor III
  • 1415 Views
  • 5 replies
  • 2 kudos

Resolved! Job fails on clusters only with library dependency

Hello! I have the following problem: all my job runs fail when the job uses a library. Even the most basic job (print a string) and the most basic library package (no secondary dependencies; the script does not even import/use the library) fails with `Fai...

Latest Reply
gopal2026
New Contributor II
  • 2 kudos

Hi, can you please share the detailed solution? Did you include any config in databricks.yml? I'm also having the same issue.

4 More Replies
pop_smoke
by New Contributor III
  • 1274 Views
  • 4 replies
  • 5 kudos

Resolved! Switching to Databricks from Ab Initio (an old ETL software) - NEED ADVICE

As far as I know, all courses on the market and on YouTube for Databricks are outdated, as those courses are for Community Edition; there is no new course for the Free Edition of Databricks. I am a working professional and I do not get much time. Do you guys kno...

Latest Reply
markjvickers-im
Databricks Partner
  • 5 kudos

@pop_smoke What were the arguments that swayed your organization to switch to Databricks from Ab Initio? Purely a cost basis?

3 More Replies
Ved88
by Databricks Partner
  • 502 Views
  • 1 replies
  • 1 kudos

Resolved! All-purpose Databricks cluster disappears

Hi, I can see that sometimes the cluster disappears even though it was created some time back using a cluster pipeline. What could be the reason for it to disappear? We can recreate the cluster, but we wanted to know why this cluster disappeared. Thanks! Ve...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Ved88, 30 days after a compute is terminated, it is permanently deleted. To keep an all-purpose compute configuration after a compute has been terminated for more than 30 days, an administrator can pin the compute. Up to 100 compute resources can...

smpa01
by Contributor
  • 531 Views
  • 2 replies
  • 2 kudos

Resolved! Python DataSource API utilities/ Import Fails in Spark Declarative Pipeline

TLDR - UDFs work fine when imported from a `utilities/` folder in DLT pipelines, but custom Python DataSource APIs fail with `ModuleNotFoundError: No module named 'utilities'` during serialization. Only inline definitions work. Need reusable DataSource ...

Latest Reply
smpa01
Contributor
  • 2 kudos

@emma_s Thank you for the guidance! The wheel package approach worked perfectly. I also tried putting the .py directly in /Workspace/Libraries/custom_datasource.py, but it did not work.

1 More Replies
ChristianRRL
by Honored Contributor
  • 594 Views
  • 3 replies
  • 5 kudos

Resolved! Is Auto Loader open source now in Apache 4.1 SDP?

With Spark Declarative Pipelines (SDP) being open source now, does this mean that the Databricks Auto Loader functionality is also open source? Is it called something else? If not, how does the open-source version handle incremental data processing a...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 5 kudos

Hi @ChristianRRL, no, Auto Loader is proprietary to Databricks; it's not open-sourced. The open-source version of SDP uses Spark Structured Streaming for incremental processing. Keep in mind that Auto Loader is basically just Spark streaming under the hood ...
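To illustrate the reply's point, a sketch (path and schema are illustrative assumptions) of the open-source Structured Streaming file source, the rough OSS counterpart to Auto Loader's incremental file discovery:

```python
# Sketch: incremental file ingestion with the OSS Structured Streaming
# CSV file source. Unlike Auto Loader, this lists the directory on each
# micro-batch rather than using notification/listing optimizations.
def incremental_csv_stream(spark, path):
    return (
        spark.readStream
        .format("csv")
        .option("header", "true")
        .option("maxFilesPerTrigger", 100)  # bound files picked up per micro-batch
        .schema("id INT, value STRING")     # streaming file sources require an explicit schema
        .load(path)
    )
```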

2 More Replies
Bkr-dbricks
by New Contributor II
  • 429 Views
  • 1 replies
  • 0 kudos

Resolved! Databricks free Edition to Azure Connectivity

Hello everyone, as a beginner in Databricks, I have a question: can we connect Databricks Free Edition to Azure Blob / ADLS Gen2 storage? I would like to create external tables on files in Azure and Delta Lake tables on top of them. Your help is apprec...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Bkr-dbricks, according to the following topic, Free Edition doesn't support external locations: Solved: If use databricks free version not free trail can ... - Databricks Community - 127421

  • 0 kudos
kivaniutenko
by New Contributor
  • 732 Views
  • 1 replies
  • 1 kudos

HTML Formatting Issue in Databricks Alerts

Hello everyone, I have recently encountered an issue with HTML formatting in custom templates for Databricks Alerts. Previously, the formatting worked correctly, but now the alerts display raw HTML instead of properly rendered content. For example, an ...

Latest Reply
mmayorga
Databricks Employee
  • 1 kudos

Hi @kivaniutenko, thanks for reaching out. Databricks alerts still support basic HTML in email templates, but HTML will render correctly only for email destinations and only with simple, allowed tags. Quick things to try: make sure you are using Ale...

SparkMan
by Databricks Partner
  • 705 Views
  • 2 replies
  • 2 kudos

Resolved! Job Cluster Reuse

Hi, I have a job where a job cluster is reused twice for task A and task C. Between A and C, task B runs for 4 hours on a different interactive cluster. The issue here is that the job cluster doesn't terminate as soon as Task A is completed and sits ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @SparkMan, this is expected behavior with Databricks job cluster reuse unless you change your job/task configuration. Look at the following documentation entry. So with your flow you have something like this: Task A (job cluster) → Task B (interactive c...

1 More Replies
nkrish
by New Contributor II
  • 472 Views
  • 1 replies
  • 1 kudos

Resolved! Regarding Accelerators

Are there any Databricks accelerators to convert C# and QlikView code to PySpark? We are using open-source AI tools to convert now, but wondering if there is a better way to do the same? Thanks in advance.

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @nkrish, unfortunately, I don't think so. The available accelerators can be found here: Databricks Solution Accelerators for Data & AI | Databricks. But I haven't heard anything about an accelerator for C# and QlikView specifically.

deepu1
by New Contributor
  • 617 Views
  • 1 replies
  • 0 kudos

Resolved! DLT Gold aggregation with apply_change

I am building a Gold table using Delta Live Tables (DLT). The Gold table contains aggregated data derived from a Silver table. Aggregation happens monthly. However, the requirement is that only the current (year, month) should be recalculated. Previous mo...

Latest Reply
aleksandra_ch
Databricks Employee
  • 0 kudos

Hi @deepu1 , Assuming that @dlt.table refers to a Materialized View (MV), you are correct that this is the standard way to create aggregated tables in the Gold layer. A Materialized View is essentially a table that stores the results of a specific qu...
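As an alternative to a full materialized-view recompute, one common pattern for this requirement is an incremental MERGE that touches only the current month. A sketch (table names, and the `amount`/`total` columns, are hypothetical):

```python
def current_month_merge_sql(gold: str, silver: str, year: int, month: int) -> str:
    # Build a MERGE that recomputes only the current (year, month) slice of
    # the gold aggregate; rows for earlier months are left untouched.
    return f"""
MERGE INTO {gold} AS g
USING (
  SELECT year, month, SUM(amount) AS total
  FROM {silver}
  WHERE year = {year} AND month = {month}
  GROUP BY year, month
) AS s
ON g.year = s.year AND g.month = s.month
WHEN MATCHED THEN UPDATE SET g.total = s.total
WHEN NOT MATCHED THEN INSERT (year, month, total) VALUES (s.year, s.month, s.total)
""".strip()

# On Databricks this string would be executed via spark.sql(...)
print(current_month_merge_sql("gold.monthly_agg", "silver.events", 2025, 1))
```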

PabloCSD
by Valued Contributor II
  • 790 Views
  • 5 replies
  • 3 kudos

Resolved! How to use/install a driver in Spark Declarative Pipelines (ETL)?

Salutations, I'm using SDP for an ETL that extracts data from HANA and puts it in Unity Catalog. I defined a policy with the needed driver, but I get this error: An error occurred while calling o1013.load. : java.lang.ClassNotFoundException: com.sap...

Latest Reply
anshu_roy
Databricks Employee
  • 3 kudos

At this time, Databricks does not offer native connectors for SAP HANA. You can find the complete list of managed connectors currently available in Databricks here. We generally recommend beginning with SAP’s own commercial tools, prioritizing SAP Bu...

4 More Replies