Delta Live Table streaming pipeline
12-27-2023 07:11 PM
01-16-2024 11:51 PM
Hi @rt-slowth, I would like to share the Databricks documentation, which covers stream-static joins:
https://docs.databricks.com/en/delta-live-tables/transform.html#stream-static-joins
Stream-static joins are a good choice when denormalizing a continuous stream of append-only data with a primarily static dimension table.
With each pipeline update, new records from the stream are joined with the most current snapshot of the static table. If records are added or updated in the static table after corresponding data from the streaming table has been processed, the resultant records are not recalculated unless a full refresh is performed.
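To make the "not recalculated" behavior concrete, here is a minimal sketch in plain Python. It is a toy model, not DLT code: the names `run_update`, `customers`, and `output` are hypothetical, and a dict stands in for the static table.

```python
# Toy model of stream-static join semantics; plain Python, not the DLT API.
# All names here (run_update, customers, output) are hypothetical.

def run_update(new_sales, customers, output):
    """Left-join only the new stream records against the current snapshot."""
    snapshot = dict(customers)  # static table as of this update
    for sale in new_sales:
        name = snapshot.get(sale["customer_id"])  # None when no match (left join)
        output.append({**sale, "customer_name": name})

customers = {1: "Alice"}
output = []

# Update 1: customer 2 is not in the static table yet.
run_update([{"sale_id": 100, "customer_id": 2}], customers, output)

# The static table changes after the sale was already processed.
customers[2] = "Bob"

# Update 2: only the new sale sees the new customer row.
run_update([{"sale_id": 101, "customer_id": 2}], customers, output)

for row in output:
    print(row)
```

In this model, sale 100 keeps its empty `customer_name` even after customer 2 is added, which is the situation where a full refresh would be needed to pick up the change.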
In pipelines configured for triggered execution, the static table returns results as of the time the update started. In pipelines configured for continuous execution, each time the table processes an update, the most recent version of the static table is queried.
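The difference between the two execution modes can be sketched the same way. This is only an illustration of the paragraph above, assuming a simplified model in which the static table has one version per processed batch; `triggered_update`, `continuous_update`, and `static_versions` are hypothetical names, not part of any Databricks API.

```python
# Toy model only; plain Python, not the DLT API. All names are hypothetical.

def triggered_update(batches, static_versions):
    """One snapshot, taken when the update starts, is used for every batch."""
    snapshot = static_versions[0]
    return [snapshot.get(key) for batch in batches for key in batch]

def continuous_update(batches, static_versions):
    """Each batch looks up the static table version current at that time."""
    return [
        version.get(key)
        for batch, version in zip(batches, static_versions)
        for key in batch
    ]

# static_versions[i] is the state of the static table when batch i is processed.
static_versions = [{1: "v1"}, {1: "v2"}]
batches = [[1], [1]]

triggered = triggered_update(batches, static_versions)
continuous = continuous_update(batches, static_versions)

print(triggered)
print(continuous)
```

Under these assumptions, the triggered run returns the same value for both batches, while the continuous run picks up the newer version for the second batch.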
The following is an example of a stream-static join:
import dlt

@dlt.table
def customer_sales():
    # Left-join the streaming "sales" table to the current snapshot of "customers".
    return dlt.read_stream("sales").join(dlt.read("customers"), ["customer_id"], "left")