Data Engineering

Forum Posts

by Michał • New Contributor III

09-03-2025 6:41:10 AM

805 Views
5 replies
3 kudos

how to process a streaming lakeflow declarative pipeline in batches

Hi, I've got a problem and I have run out of ideas as to what else I can try. Maybe you can help? I've got a Delta table with hundreds of millions of records on which I have to perform relatively expensive operations. I'd like to be able to process some...

Latest Reply

mmayorga
Databricks Employee

a month ago

3 kudos

Hi @Michał, one detail to consider when working with Declarative Pipelines is that they manage and auto-tune configuration aspects, including rate limiting (maxBytesPerTrigger or maxFilesPerTrigger). Perhaps that's why you could not see this...

4 More Replies

by Data_NXT • New Contributor III

09-17-2025 10:51:05 AM

530 Views
3 replies
3 kudos

Resolved! To change ownership of a materialized view

Working in a Unity Catalog-enabled Databricks workspace, we have several materialized views (MVs) that were created through a Delta Live Tables (DLT) / Lakeflow pipeline. Currently, the original owner of the pipeline has moved off the project,...

Latest Reply

szymon_dybczak
Esteemed Contributor III

09-17-2025 11:05:25 AM

3 kudos

Hi @Data_NXT, you can change the owner of a materialized view if you are both a metastore admin and a workspace admin. Use the following steps to change a materialized view's owner: Open the materialized view in Catalog Explorer, then on the Overview ...

2 More Replies

by Hritik_Moon • New Contributor II

a week ago

399 Views
2 replies
2 kudos

Save as Delta file in catalog

Hello, I have created a DataFrame from a CSV file. When I try to write it as: df_op_clean.write.format("delta").save("/Volumes/optimisation/trial") I get this error: Cannot access the UC Volume path from this location. Path was /Volumes/optimisation/trial/_d...

Latest Reply

-werners-
Esteemed Contributor III

a week ago

2 kudos

Also, to add to this: avoid overlap between tables and Volumes. Create a separate folder for tables and files. Unity Catalog does this too if you use managed tables/volumes.

1 More Replies
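
One possible way to act on the advice in this thread, sketched under assumptions: register the DataFrame as a Unity Catalog table with saveAsTable instead of saving Delta files under a /Volumes/... path. The helper name write_as_table and the three-level table name in the docstring are hypothetical, and df is assumed to be a PySpark DataFrame such as df_op_clean. Whether this resolves the exact error in the post is not confirmed by the thread.

```python
def write_as_table(df, full_table_name: str) -> str:
    """Write a PySpark DataFrame as a Delta table registered in the catalog.

    `full_table_name` is expected in catalog.schema.table form
    (hypothetical example: "optimisation.trial.op_clean").
    """
    parts = full_table_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected a three-level name: catalog.schema.table")
    # saveAsTable registers the table in the catalog, so no Delta files
    # are written into a Volume path.
    df.write.format("delta").saveAsTable(full_table_name)
    return full_table_name
```

A call might look like write_as_table(df_op_clean, "optimisation.trial.op_clean"), keeping Volumes for plain files such as the source CSV.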

by mbanxp • New Contributor III

2 weeks ago

257 Views
2 replies
1 kudos

Most suitable Data Promotion orchestration for multi-tenant data lake in Databricks

Hi there! I would like to find the most suitable orchestration process to promote data between medallion layers. I need to solve the following key architectural decision for scaling my multi-tenant data lake in Databricks. My setup: Independent medal...

Latest Reply

sarahbhord
Databricks Employee

a week ago

1 kudos

Hey mbanxp! The most scalable and maintainable orchestration pattern for multi-tenant medallion architectures in Databricks is to build independent pipelines per table for all clients, with each pipeline parameterized by client/tenant. Why this appro...

1 More Replies
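
The pattern described in the reply (one pipeline per table, each run parameterized by tenant) could be laid out roughly as below. This is only an illustrative sketch: the pipeline naming scheme, the "tenant" parameter key, and the table and tenant names are all made up, and no Databricks API is called.

```python
from itertools import product


def build_run_plan(tables, tenants):
    """Pair every per-table pipeline with every tenant it should run for."""
    return [
        {"pipeline": f"promote_{table}", "parameters": {"tenant": tenant}}
        for table, tenant in product(tables, tenants)
    ]


# All names below are hypothetical and exist only for illustration.
plan = build_run_plan(["orders", "customers"], ["tenant_a", "tenant_b"])
for run in plan:
    print(run["pipeline"], run["parameters"])
```

Each entry in the plan could then be handed to whatever scheduler triggers the pipelines, so adding a tenant means adding one value rather than one pipeline.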

by jeremy98 • Honored Contributor

06-26-2025 10:17:52 AM

917 Views
6 replies
1 kudos

How to reference a workflow to use multiple GIT sources?

Hi community, is it possible for a workflow to reference multiple Git sources? Specifically, can different tasks within the same workflow point to different Git repositories or types of Git sources? Ty

Latest Reply

mai_luca
New Contributor III

06-27-2025 5:37:05 AM

1 kudos

A workflow can reference multiple Git sources. You can specify the git information for each task. However, I am not sure you can have multiple GitProvider for the same workspace....

5 More Replies

by EricCournarie • New Contributor III

2 weeks ago

564 Views
8 replies
10 kudos

ResultSet metadata does not return correct type for TIMESTAMP_NTZ

Hello, using the JDBC driver, when I retrieve the metadata of a ResultSet, the type for a TIMESTAMP_NTZ is not correct (it's a TIMESTAMP one). My SQL is a simple SELECT * on a table where you have a TIMESTAMP_NTZ column. This works when retrieving metad...

Latest Reply

Advika
Databricks Employee

a week ago

10 kudos

Hello @EricCournarie! Just to confirm, were you initially using the JDBC driver v2.7.3? According to the release notes, this version adds support for the TIMESTAMP_NTZ data type.

7 More Replies

by karuppusamy • New Contributor II

a week ago

424 Views
4 replies
5 kudos

Resolved! Getting a warning message in Declarative Pipelines.

Hi Team, while creating a Declarative ETL pipeline in Databricks, I tried to configure a notebook using the "Add existing assets" option by providing the notebook path. However, I received a warning message: "Legacy configuration detected. Use files in...

Latest Reply

karuppusamy
New Contributor II

a week ago

5 kudos

Thank you @szymon_dybczak. Now I have good clarity on my end.

3 More Replies

by Raj_DB • Contributor

a week ago

558 Views
8 replies
7 kudos

Resolved! Streamlining Custom Job Notifications with a Centralized Email List

Hi everyone, I am working on setting up success/failure notifications for a large number of jobs in our Databricks environment. The manual process of configuring email notifications in the UI for each job individually is not scalable and is becoming ver...

Latest Reply

nayan_wylde
Honored Contributor III

a week ago

7 kudos

@Raj_DB Databricks sends notifications via its internal email service, which often requires the address to be a valid individual mailbox or a distribution list that accepts external mail. If your group email is in Microsoft 365, please check if “Allow...

7 More Replies
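
As a rough sketch of the centralized approach asked about in this thread, the recipient list can be kept in one place and turned into one request body per job. The body shape below assumes the Jobs API update request (job_id plus new_settings.email_notifications with on_success and on_failure lists). The addresses, job IDs, and the helper name build_update_payload are hypothetical, and nothing is sent to a workspace here.

```python
import json

# Hypothetical central list; in practice it might live in a config file.
NOTIFY_LIST = {
    "on_success": ["data-team@example.com"],
    "on_failure": ["data-team@example.com", "oncall@example.com"],
}


def build_update_payload(job_id: int, notify: dict) -> dict:
    """Build a jobs/update request body that only sets email notifications."""
    return {
        "job_id": job_id,
        "new_settings": {
            "email_notifications": {
                "on_success": list(notify.get("on_success", [])),
                "on_failure": list(notify.get("on_failure", [])),
            }
        },
    }


# Hypothetical job IDs; a real script would list the jobs first.
payloads = [build_update_payload(job_id, NOTIFY_LIST) for job_id in (101, 102)]
print(json.dumps(payloads[0], indent=2))
```

Each payload would then be posted to the workspace with whatever HTTP client or SDK is already in use, so changing a recipient means editing a single list.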

by EricCournarie • New Contributor III

a week ago

257 Views
2 replies
0 kudos

Filling a STRUCT field with a PreparedStatement in JDBC

Hello, I'm trying to fill a STRUCT field with a PreparedStatement in Java by giving a JSON string in the PreparedStatement. But it complains: Cannot resolve "infos" due to data type mismatch: cannot cast "STRING" to "STRUCT<AGE: BIGINT, NAME: STRING>"....

Latest Reply

szymon_dybczak
Esteemed Contributor III

a week ago

0 kudos

Could you provide a sample JSON string along with the code you're using? Otherwise it will be hard for us to help you.

1 More Replies
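
Since the error says a STRING cannot be cast to the STRUCT type, one option that might be worth trying is to parse the bound string explicitly with the SQL function from_json instead of relying on a cast. The sketch below only builds the statement text and a sample parameter. The table name people and the sample values are hypothetical, the column name and schema are taken from the error message, and this has not been verified against the JDBC driver.

```python
import json


def struct_insert_sql(table: str, column: str, schema_ddl: str) -> str:
    """Return an INSERT that parses a JSON string parameter into a STRUCT."""
    # from_json(jsonStr, schema) turns the bound STRING into the struct type,
    # so no implicit STRING -> STRUCT cast is requested.
    return f"INSERT INTO {table} ({column}) SELECT from_json(?, '{schema_ddl}')"


# `people` is a made-up table name; `infos` and the schema come from the error.
sql = struct_insert_sql("people", "infos", "AGE BIGINT, NAME STRING")
param = json.dumps({"AGE": 42, "NAME": "Ada"})
print(sql)
print(param)
```

In Java, the same statement text would be passed to prepareStatement and the JSON string bound with setString(1, ...).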

by yit • Contributor III

a week ago

295 Views
2 replies
3 kudos

Resolved! Difference between libraries dlt and dp

In all Databricks documentation, the examples use import dlt to create streaming tables and views. But when generating sample Python code in an ETL pipeline, the import in the sample is: from pyspark import pipelines as dp. Which one is the correct libr...

Latest Reply

nayan_wylde
Honored Contributor III

a week ago

3 kudos

@yit Functionally, they are equivalent concepts (declarative definitions for streaming tables, materialized views, expectations, CDC, etc.). The differences you’ll notice are mostly naming/ergonomics: Module name: Databricks docs & most existing notebo...

1 More Replies

by ralphchan • New Contributor II

02-14-2025 4:56:02 AM

3645 Views
5 replies
0 kudos

Connect Oracle Fusion (ERP / HCM) to Databricks

Any suggestions for connecting Oracle Fusion (ERP/HCM) to Databricks? I have explored a few options, including Oracle Integration Cloud, but it requires a lot of customization.

Latest Reply

NikhilKamble
New Contributor II

a week ago

0 kudos

Hey Ralph, Orbit datajump is one of the good options on the market. Try it out.

4 More Replies

by kenmyers-8451 • Contributor

05-19-2025 11:00:59 AM

880 Views
4 replies
0 kudos

bug with using parameters in a sql task

I am trying to create a SQL task that runs on a serverless SQL warehouse, takes a variable, and uses it in the SQL file it runs. However, I am getting errors because Databricks keeps formatting it first with ...

Latest Reply

asuvorkin
New Contributor II

a week ago

0 kudos

I have been trying to use templates as well and got the following string: LOCATION 's3://s3-company-data-' dev '-' 1122334455 '-eu-central-1/path_to_churn/main/'

3 More Replies

by data-grassroots • New Contributor III

2 weeks ago

291 Views
4 replies
1 kudos

Resolved! ExcelWriter and local files

I have a couple of things going on here. First, to explain what I'm doing: I'm passing an array of objects into a function, each of which contains a DataFrame. I want to write those DataFrames to an Excel workbook, one DataFrame per worksheet. That part ...

Latest Reply

Advika
Databricks Employee

a week ago

1 kudos

Hello @data-grassroots! Were you able to resolve this? If any of the suggestions shared above helped, or if you found another solution, it would be great if you could mark it as the accepted solution or share your approach with the community.

3 More Replies

by fjrodriguez • New Contributor III

a week ago

253 Views
5 replies
4 kudos

Resolved! Self Dependency TumblingWindowTrigger in adf

Hey! I would like to migrate an ADF batch ingestion that has a TumblingWindowTrigger on top of the pipeline, which basically checks every 15 minutes whether a file has landed. Normally the files land on a daily basis, so it will process them accordingly once in a d...

Latest Reply

fjrodriguez
New Contributor III

a week ago

4 kudos

Hi @szymon_dybczak, sounds reasonable, I will propose this approach. Thanks

4 More Replies

by Hritik_Moon • New Contributor II

a week ago

638 Views
12 replies
17 kudos

Accessing Spark UI in free edition

Hello, is it possible to access the Spark UI in the Free Edition? I want to check tasks and stages. Ultimately, I am working on how to check for data skew.

Latest Reply

Hritik_Moon
New Contributor II

a week ago

17 kudos

@szymon_dybczak @BS_THE_ANALYST Is there a specific guide or learning path for becoming a better Databricks data engineer? I am learning topics as they come up. I find it really difficult to maintain a flow and I lose track.

11 More Replies
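
For the original question about data skew, a check that does not depend on the Spark UI is to compare row counts per partition. The sketch below assumes those counts have already been collected into a plain list, for example by grouping on spark_partition_id() in PySpark. The helper name skew_report and the sample numbers are made up for illustration.

```python
from statistics import median


def skew_report(rows_per_partition: list[int]) -> dict:
    """Summarize how uneven a set of per-partition row counts is."""
    if not rows_per_partition:
        raise ValueError("need at least one partition count")
    med = median(rows_per_partition)
    largest = max(rows_per_partition)
    return {
        "partitions": len(rows_per_partition),
        "median_rows": med,
        "max_rows": largest,
        # A ratio far above 1 suggests one partition holds most of the data.
        "max_to_median": largest / med if med else float("inf"),
    }


# Made-up counts: four similar partitions and one much larger one.
report = skew_report([1000, 1100, 950, 1050, 9000])
print(report)
```

The same comparison can be run on counts per join or grouping key, which is often where skew originates.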
