DLT refresh time for a combination of streaming and non-streaming tables?
Sunday
import dlt

@dlt.table
def joined_table():
    dim_df = spark.read.table("dim_table")  # Static read; reloads every batch
    fact_df = spark.readStream.table("fact_stream")  # Streaming read
    # Stream-static left join on "id"
    return fact_df.join(dim_df, "id", "left")
2 REPLIES
Sunday
The question is: the default DLT pipeline refresh time is 5 seconds, but if I use a combination of streaming and non-streaming data, will it still be 5 seconds?
yesterday
Hello @surajitDE!
When you combine streaming and batch data, the pipeline may not always refresh every 5 seconds. The streaming table (fact_stream) is triggered every 5 seconds, but the batch table (dim_table) is fully reloaded on each batch, and that repeated load adds overhead.
The actual refresh time depends on the size of dim_table: larger tables take longer to reload, which can delay updates.
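To make the reasoning concrete, here is a rough back-of-the-envelope sketch in plain Python. It is not a DLT API and not a measurement: the function name and all timings are hypothetical. It assumes that a new batch cannot start before the trigger interval has elapsed, and also cannot start before the previous batch (including the dim_table reload) has finished.

```python
def effective_refresh_seconds(trigger_interval_s, stream_batch_s, dim_reload_s):
    """Rough estimate of the time between two consecutive updates.

    Hypothetical model: all three arguments are assumed timings in seconds.
    """
    # Work done per batch: process new fact rows plus re-read the dimension table.
    batch_work_s = stream_batch_s + dim_reload_s
    # The next batch waits for whichever is longer: the trigger interval
    # or the time the previous batch needed to finish.
    return max(trigger_interval_s, batch_work_s)


if __name__ == "__main__":
    # Hypothetical timings, chosen only for illustration.
    print(effective_refresh_seconds(5.0, 1.0, 2.0))   # small dim_table: 5.0
    print(effective_refresh_seconds(5.0, 1.0, 12.0))  # large dim_table: 13.0
```

Under these assumptions, a small dim_table leaves the refresh at the 5-second interval, while a dim_table that takes 12 seconds to reload pushes each update to about 13 seconds.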

