Data Engineering

Forum Posts

by KKo • Contributor III

01-06-2023 8:43:58 PM

3423 Views
2 replies
2 kudos

delete and append in delta path

On each run, I delete data from the curated path based on a date column and append the staged data to it, using the script below. My fear is that, just after the delete operation, a network issue could appear and the job could stop before it appended the staged da...
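The risk described above comes from the delete and the append being two separate commits. One option sometimes used for this pattern is a single overwrite restricted by Delta Lake's `replaceWhere` option, so the matching rows are replaced in one commit. The sketch below is only an illustration, not the poster's script: the column name `load_date`, the helper names, and the assumption that the staged DataFrame holds rows for exactly one day are all hypothetical.

```python
from datetime import date


def build_replace_where(column: str, day: date) -> str:
    """Build a predicate that selects one day's rows (column name is hypothetical)."""
    return f"{column} = '{day.isoformat()}'"


def replace_day(staged_df, curated_path: str, column: str, day: date) -> None:
    """Replace one day's rows in a Delta path with a single overwrite commit.

    `staged_df` is assumed to be a Spark DataFrame containing only rows for `day`.
    """
    (
        staged_df.write.format("delta")
        .mode("overwrite")
        .option("replaceWhere", build_replace_where(column, day))
        .save(curated_path)
    )


# Example predicate for one run (the date value is made up).
predicate = build_replace_where("load_date", date(2023, 1, 6))
```

Because the replacement is one commit, a job that stops before the commit completes should leave the previously committed data in place rather than a half-finished delete.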

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

01-07-2023 8:01:02 AM

2 kudos

thanks man

by Jack • New Contributor II

06-02-2022 7:44:33 AM

5810 Views
1 reply
1 kudos

Append an empty dataframe to a list of dataframes using for loop in python

I have the following 3 dataframes: I want to append df_forecast to each of df2_CA and df2_USA using a for-loop. However, when I run my code, df_forecast is not appended: df2_CA and df2_USA appear exactly as shown above. Here’s the code: df_list=[df2_CA,...

Latest Reply

User16764241763
Honored Contributor

06-05-2022 9:36:22 PM

1 kudos

@Jack Homareau Can you try the union functionality with dataframes (https://sparkbyexamples.com/pyspark/pyspark-union-and-unionall/) and then fill the NaNs with the desired values?
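The code in the question is truncated, so the actual cause is not known. One common cause of this symptom is that operations such as a union return a new object, and assigning that object to the loop variable does not update the list or the original names. A minimal sketch of that behaviour, with plain Python tuples standing in for dataframes (the names reuse those from the post; the values are made up):

```python
# Plain tuples stand in for dataframes: like a dataframe union,
# `+` returns a new object instead of changing the original.
df2_CA = ("CA-1", "CA-2")
df2_USA = ("USA-1",)
df_forecast = ("F-1", "F-2")

df_list = [df2_CA, df2_USA]

# Rebinding the loop variable leaves the items in df_list untouched.
for df in df_list:
    df = df + df_forecast

unchanged = list(df_list)

# Collecting the results into a new list keeps them.
combined = [df + df_forecast for df in df_list]
```

With real dataframes the same idea would apply: store the result of each union in a new list (or back into the list by index) instead of only reassigning the loop variable.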

by _Orc • New Contributor

03-02-2022 12:19:52 PM

4658 Views
2 replies
1 kudos

Resolved! Checkpoint is getting created even though the microbatch append has failed

Use case: Read data from the source table using Spark Structured Streaming (round the clock). Apply transformation logic, etc., and finally merge the dataframe into the target table. If there is any failure during transformation or merge, the Databricks job should...

Latest Reply

Anonymous
Not applicable

04-12-2022 9:34:32 AM

1 kudos

Hi @Om Singh, hope you are doing well. Just wanted to check in and see if you were able to find a solution to your question. Cheers

by KKDataEngineer • New Contributor III

01-28-2022 8:21:11 AM

2323 Views
0 replies
2 kudos

Spark Structured Streaming: an aggregation DF with watermark in Append mode to a Delta table is not writing the most recent aggregation to the Delta table even after crossing the watermark boundary. This is causing data loss

Team, I am struggling with a unique issue. I am not sure if my understanding is wrong or this is a bug in Spark. I am reading a stream from Event Hubs (Extract). Pivoting and aggregating the above dataframe (Transformation). This is a WATERMARKED...

by MiguelKulisic • New Contributor II

01-21-2022 1:52:10 PM

9430 Views
2 replies
4 kudos

Resolved! ProtocolChangedException on concurrent blind appends to delta table

Hello, I am developing an application that runs multiple processes that write their results to a common delta table as blind appends. According to the docs I've read online: https://docs.databricks.com/delta/concurrency-control.html#protocolchangedex...

Latest Reply

-werners-
Esteemed Contributor III

01-25-2022 6:36:33 AM

4 kudos

I think you are right: mergeSchema will change the schema of the table. But if you both write to that same table with different schemas, which one will it be? Can you check whether both of you actually write the same schema, or remove mergeSchema?
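The check suggested in this reply could be sketched as a small comparison of the two writers' schemas before appending. This is only an illustration: `schema_diff` is a hypothetical helper, and the inputs are assumed to be lists of (column name, type string) pairs, which is the shape PySpark's `DataFrame.dtypes` returns. The example schemas are made up.

```python
def schema_diff(dtypes_a, dtypes_b):
    """Return {column: (type_in_a, type_in_b)} for columns that differ.

    None marks a column that is missing on one side.
    """
    a, b = dict(dtypes_a), dict(dtypes_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Hypothetical schemas from two writer processes.
writer_1 = [("id", "bigint"), ("result", "double")]
writer_2 = [("id", "bigint"), ("result", "double"), ("note", "string")]

diff = schema_diff(writer_1, writer_2)
```

An empty result would suggest both processes write the same columns and types, in which case the mergeSchema option may not be needed for the appends.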
