โ08-21-2024 03:00 AM
I am creating a ETL pipeline where i am reading multiple stream table into temp tables and at the end am trying to join those tables to get the output feed into another live table. So for that am using below method where i am giving list of tables as parameter to a method. Inside method using for loop am streaming them one by one into temp tables. After that am trying to execute the sql to get the DF for my new live table.
My Python Method : -
Below are my inputs to that method
sources = ["table1","table2","table3","table4",------,"tablen"]
main_table = "parent_table"
Please help!
โ08-21-2024 06:10 AM
A stream-stream left join needs a watermark.
f.e.:
stream_df = stream_df.withWatermark("timestamp_column", "30 minutes")
joined_df = stream_df.join(other_stream_df, "join_key")
โ08-21-2024 06:17 AM
Thank you for the reply. But in my case am using sql query to read data from those temp tables. So how can i handle the water mark issue in above scenario.
โ08-21-2024 06:25 AM
try spark.read.table(main_table).withWatermark("timestamp_column", "30 minutes").createOrReplaceTempView
In SQL it is also possible using the WATERMARK function.
https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/stateful-processing
โ08-21-2024 08:52 AM - edited โ08-21-2024 08:53 AM
Do i need to water mark it while writing to temp table and in sql statement too
โ08-22-2024 12:13 AM
it is necessary for the join so if the dataframe has a watermark that's enough.
No need to define it multiple times.
โ08-22-2024 01:19 PM
I did put watermark on the data frame but still getting the same error while executing sql
โ08-22-2024 11:16 PM
Sorry, I put the watermark on the non-streaming table. That is wrong of course, the watermark has to be set on the streaming table (source in this case).
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโt want to miss the chance to attend and share knowledge.
If there isnโt a group near you, start one and help create a community that brings people together.
Request a New Group