Data Engineering

Forum Posts

by mriccardi • New Contributor II

07-26-2022 6:10:34 AM

4850 Views
4 replies
1 kudos

Spark streaming: Checkpoint not recognising new data

Hello everyone! We are currently facing an issue with a stream that has not picked up new data since July 20. We've validated that the bronze table has data that the silver table doesn't have. Also, according to the logs, the silver stream is running but writing 0 files....

Latest Reply

mriccardi
New Contributor II

07-26-2022 6:15:11 AM

1 kudos

Also, the trigger is configured to run once, but when we start the job it never ends; it keeps running in an endless loop.
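
For readers unfamiliar with the setup described here, a run-once stream from a bronze table to a silver table might look roughly like the sketch below. It is not the poster's actual job: the function name, the three paths, and the choice of Delta as both source and sink are assumptions made for illustration. With a run-once trigger, the query is expected to process whatever the checkpoint has not yet recorded and then stop.

```python
def run_silver_once(spark, bronze_path, silver_path, checkpoint_path):
    """Run a single pass of a bronze-to-silver stream (hypothetical helper).

    `spark` is an existing SparkSession; all paths are placeholders.
    """
    # Read the bronze table as a stream (assumed to be a Delta table).
    bronze_df = spark.readStream.format("delta").load(bronze_path)

    query = (
        bronze_df.writeStream
        .format("delta")  # assumed sink format
        # The checkpoint records which source data has already been processed.
        .option("checkpointLocation", checkpoint_path)
        # Process the available data once, then stop.
        .trigger(once=True)
        .start(silver_path)
    )

    # Block until the single run finishes.
    query.awaitTermination()
    return query
```

If a job like this never terminates, the trigger actually applied to the running query and the contents of the checkpoint directory are reasonable first things to inspect.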

by rlink • New Contributor II

05-01-2023 2:48:55 PM

4518 Views
3 replies
2 kudos

Resolved! Data Science & Engineering Dashboard Refresh Issue Using Databricks

Hi everyone, I created a Data Science & Engineering notebook in Databricks to display some visualizations and also set up a schedule for the notebook to run every hour. I can see that the scheduled run is successful every hour, but the dashboard I crea...

Latest Reply

luis_herrera
Databricks Employee

05-03-2023 4:44:28 AM

2 kudos

To schedule a dashboard to refresh at a specified interval, schedule the notebook that generates the dashboard graphs. PS: Check #DAIS2023 talks

by Mado • Valued Contributor II

01-09-2023 10:37:24 PM

5593 Views
4 replies
3 kudos

Resolved! Streaming Delta Live Table, if I re-run the pipeline, does it append the new data to the current table?

Hi, I have a question about DLT tables. Assume that I have a streaming DLT pipeline which reads data from a Bronze table and applies transformations to the data. Pipeline mode is triggered. If I re-run the pipeline, does it append new data to the current tabl...

Latest Reply

Anonymous
Not applicable

04-10-2023 6:13:39 AM

3 kudos

@Mohammad Saber: In a Databricks Delta Live Tables (DLT) pipeline, when you re-run the pipeline in "append" mode, new data will be appended to the existing table. Delta Lake provides built-in support for handling duplicates through its "upsert" functionali...

by pranathisg97 • New Contributor III

02-15-2023 4:59:57 AM

4928 Views
7 replies
0 kudos

Resolved! Fetch new data from kinesis for every minute.

I want to fetch new data from a Kinesis source every minute. I'm using the "minFetchPeriod" option set to 60s, but this doesn't seem to be working. Streaming query: spark \ .readStream \ .format("kinesis") \ .option("streamName", kinesis_stream_...
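
One possible reading of the problem, offered as a sketch and not as the accepted answer in the thread: as far as I understand it, "minFetchPeriod" affects how often the source polls Kinesis, while the cadence of micro-batches is set by the trigger on the write side. The example below uses a 60-second processing-time trigger. The helper names, the "region" option value, the Delta sink, and all paths are assumptions for illustration.

```python
def kinesis_source_options(stream_name, region):
    """Build reader options for the Kinesis source (illustrative values)."""
    return {
        "streamName": stream_name,  # option name taken from the post
        "region": region,           # assumed: AWS region of the stream
        "minFetchPeriod": "60s",    # value from the post; relates to source polling
    }


def start_minutely_query(spark, options, target_path, checkpoint_path):
    """Start a query whose micro-batches fire every 60 seconds (hypothetical helper).

    `spark` is an existing SparkSession on a runtime that provides the
    "kinesis" streaming source; paths are placeholders.
    """
    source_df = spark.readStream.format("kinesis").options(**options).load()
    return (
        source_df.writeStream
        .format("delta")  # assumed sink format
        .option("checkpointLocation", checkpoint_path)
        # The trigger, not the source option, sets the batch interval.
        .trigger(processingTime="60 seconds")
        .start(target_path)
    )
```

A call might look like `start_minutely_query(spark, kinesis_source_options("my-stream", "us-east-1"), "/tmp/target", "/tmp/checkpoint")`, where every argument is a placeholder.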

Latest Reply

Anonymous
Not applicable

03-10-2023 6:04:19 PM

0 kudos

Hi @Pranathi Girish, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedb...

by Bin • New Contributor

08-21-2022 10:36:23 PM

11208 Views
0 replies
0 kudos

How to do an "overwrite" output mode using spark structured streaming without deleting all the data and the checkpoint

I have a Delta Lake in ADLS that serves as a sink for data written through Spark Structured Streaming. We usually append new data from our data source to our Delta Lake, but in some cases we find errors in the data that require us to reprocess everything. So what ...

Databricks Community
