Clarity on usage STREAM while defining DLT tables

lokeshr
New Contributor

Hi, I am currently trying to learn Databricks and going through tutorials and learning materials. I came across this link https://databricks.com/discover/pages/getting-started-with-delta-live-tables

While I get most of what is described on the page, I find it hard to understand why, when building the silver tier, one of the bronze tables, sales_orders_raw, is referenced with the keyword STREAM while the other bronze table, customers, uses only the marker LIVE. Shouldn't both be marked with STREAM as well as LIVE? Is this a typo?

Regards,

Lokesh

2 REPLIES

tomasz
New Contributor III

This is because, in the example, the "sales_orders" data is streamed, left-joined to customers, and appended to the silver layer table. When a sales_order comes in from a customer that was inserted some time ago (rather than in the current micro-batch being processed), the entire customer table has to be loaded to find that customer id and name. Therefore, using LIVE.customers without "STREAM" allows the join to be a stream-batch join (as described here).

Essentially, because you only need the most recent records coming in from "sales_orders", you can use the "STREAM" keyword there. The join, however, requires the entire customer table to be loaded, hence the lack of the "STREAM" keyword on customers.

On the other side of the coin, you need to update the silver layer table only when a new sales_order comes in, not when a new customer is streamed into the bronze layer. That's another reason why you only need the STREAM on the sales_order table.
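
To make the behaviour above concrete, here is a rough plain-Python sketch of what a stream-batch join does conceptually. It is not the DLT or Spark API, only a simulation: the function name join_micro_batch, the column names, and the sample rows are all made up for illustration. Each call processes only the newly arrived orders (the STREAM side) but looks them up against the full customers table (the LIVE side without STREAM).

```python
# Plain-Python simulation of stream-batch join semantics.
# NOT the DLT API: names and sample data are hypothetical.

def join_micro_batch(new_orders, customers_snapshot):
    """Left-join only the newly arrived orders against the FULL customer table."""
    by_id = {c["customer_id"]: c for c in customers_snapshot}
    joined = []
    for order in new_orders:
        customer = by_id.get(order["customer_id"])  # None if no match (left join)
        joined.append({
            "order_number": order["order_number"],
            "customer_id": order["customer_id"],
            "city": customer["city"] if customer else None,
        })
    return joined


customers = [{"customer_id": 1, "city": "Austin"}]  # bronze "customers"
silver = []                                         # append-only silver table

# Micro-batch 1: one new order for customer 1.
silver += join_micro_batch([{"order_number": 100, "customer_id": 1}], customers)

# A new customer arrives, but no new orders: nothing is appended to silver.
customers.append({"customer_id": 2, "city": "Boise"})
silver += join_micro_batch([], customers)

# Micro-batch 2: customer 1 was loaded long before this batch, yet still
# matches because the whole customer table is read; customer 3 is unknown.
silver += join_micro_batch(
    [{"order_number": 101, "customer_id": 1},
     {"order_number": 102, "customer_id": 3}],
    customers,
)

for row in silver:
    print(row)
```

If customers were also treated as a stream, order 101 would only see customer rows from its own micro-batch and could miss customer 1, which is the situation the reply above describes.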

jose_gonzalez
Moderator

Hi @Lokesh Raju,

Just a friendly follow-up. Did Tomasz's response help resolve your question? If it did, please mark it as best.
