cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

How to get a snapshot of a streaming delta table as a static table?

Mado
Valued Contributor II

Hi,

Assume that I have a streaming delta table. Is there any way to get snapshot of the streaming table as a static table?

Reason is that I need to join this streaming table with a static table by:

output = output.join(country_information, ["Country"], "left")

"Output" is a static table and "country_information" is the streaming table.

But, I get the error:

Append mode error: LeftOuter join with a streaming DataFrame/Dataset on the right and a static DataFrame/Dataset on the left is not supported;

Also, I cannot change the the above code into:

country_information = country_information.join(output, ["Country"], "left")

Because the logic doesn't work.

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz
Community Manager
Community Manager

Hi @Mohammad Saber​, Yes, you can try this approach

Create the snapshot with a timestamp

snapshot_time = "2022-10-01 00:00:00"
 
spark.sql(f"CREATE TABLE snapshot_table_at_time AS SELECT * FROM streaming_table VERSION AS OF '{snapshot_time}'")

Then, you can save it as a data frame.

snapshot_df = spark.read.format("delta").option("timestampAsOf", snapshot_time).load("streaming_table").toDF()
 

Thank you for your question. Please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.

View solution in original post

1 REPLY 1

Kaniz
Community Manager
Community Manager

Hi @Mohammad Saber​, Yes, you can try this approach

Create the snapshot with a timestamp

snapshot_time = "2022-10-01 00:00:00"
 
spark.sql(f"CREATE TABLE snapshot_table_at_time AS SELECT * FROM streaming_table VERSION AS OF '{snapshot_time}'")

Then, you can save it as a data frame.

snapshot_df = spark.read.format("delta").option("timestampAsOf", snapshot_time).load("streaming_table").toDF()
 

Thank you for your question. Please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.