Data Engineering

Static Table Creation in DLT

Anubhav2011
New Contributor II

We're encountering a specific issue in our DLT pipeline and would appreciate some advice. Here's an example to illustrate the challenge we're facing:

Tables Overview

  • Material Master: Contains comprehensive material data updated daily with new records. We aim to upsert data into the silver layer using SCD Type 1.
  • Material Description: Serves as a lookup table with descriptions for each material number, and gets updated once or twice a month. Similarly, we plan to upsert this data in the silver layer using SCD Type 1.

Pipeline Description

Our objective is to construct these tables in the silver layer through 1:1 mapping and upsert operations from our raw tables in the bronze layer. We then join these silver tables to populate our core layer via another upsert operation. Our bronze tables are ordinary Delta tables.
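To make the shape of the pipeline concrete, here is a simplified sketch of one of the silver-layer SCD Type 1 flows. The table, key, and sequencing column names (material_number, ingest_ts, etc.) are placeholders rather than our real schema:

import dlt
from pyspark.sql.functions import col

# Stream the bronze Delta table into the pipeline (placeholder names)
@dlt.view
def material_master_bronze():
    return spark.readStream.table("bronze.material_master")

# Silver target, kept up to date with SCD Type 1 upserts
dlt.create_streaming_table("material_master_silver")

dlt.apply_changes(
    target="material_master_silver",
    source="material_master_bronze",
    keys=["material_number"],       # business key (placeholder)
    sequence_by=col("ingest_ts"),   # ordering column (placeholder)
    stored_as_scd_type=1,
)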

Problem Statement

Currently, both tables are implemented as streaming tables within the silver layer of our DLT pipeline. When we attempt a left join between these streaming tables, we get errors, because stream-stream joins require watermarks and an event-time constraint in the join condition.
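To illustrate the underlying requirement: Structured Streaming only accepts a stream-stream left join when both sides carry watermarks and the join condition bounds event time, roughly like this (column names and intervals are illustrative, not our actual code):

from pyspark.sql.functions import expr

master = (
    spark.readStream.table("silver.material_master")
    .withWatermark("ingest_ts", "10 minutes")
    .alias("m")
)
description = (
    spark.readStream.table("silver.material_description")
    .withWatermark("ingest_ts", "30 days")
    .alias("d")
)

# Left outer stream-stream joins additionally require an event-time bound
joined = master.join(
    description,
    expr("""
        m.material_number = d.material_number AND
        m.ingest_ts BETWEEN d.ingest_ts AND d.ingest_ts + INTERVAL 30 DAYS
    """),
    "left_outer",
)

A lookup table that changes only once or twice a month has no meaningful event-time bound, which is why this model does not fit our case.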

One alternative approach is to perform a stream-static join, which requires the lookup table (material_description) to be a static table. However, we find no option in DLT to create a non-streaming table; even omitting the "Streaming" keyword results in the creation of a materialized view.
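By contrast, a stream-static join needs neither watermarks nor time bounds, because the static side is re-read against each micro-batch. A minimal sketch, again with placeholder names:

# Streaming side: the frequently updated master data
stream_df = spark.readStream.table("silver.material_master")

# Static side: the lookup table, read as a plain batch table
static_df = spark.table("silver.material_description")

# No watermark or event-time bound is needed; the static side is
# re-evaluated for every micro-batch
joined = stream_df.join(static_df, on="material_number", how="left")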

In the silver layer, we prefer all objects to be tables only, without resorting to solutions outside DLT, such as creating lookup tables as ordinary delta tables and executing merge operations in a separate notebook.

Could you provide insights or suggestions on how to overcome these limitations while remaining within the DLT framework?

1 REPLY

ilir_nuredini
Honored Contributor

Hello @Anubhav2011 ,

From your question, it sounds like you want the output to appear in the Catalog UI as an actual Table, not a Materialized View (MV). In DLT, datasets derived from other DLT datasets are shown as MVs (or Streaming Tables). They're still backed by Delta tables and managed by Unity Catalog, but DLT controls the refresh strategy and caches results, which lowers query latency and resource usage. This behavior is by design in DLT, intended to simplify and optimize ETL workflows, rather than a limitation, and I don't think there is a way to work around it from within DLT.

If you still need a plain table (shown as an actual Table in the UC UI rather than an MV), write it outside the DLT logic by reading the DLT output and saving it explicitly:

 

# Read the DLT output (the materialized view) as a regular DataFrame
df = spark.table("workspace.default.processed_trip_mv")

# ...apply any custom Silver logic here...

# Write it back out as a plain managed table in Unity Catalog
df.write.mode("overwrite").saveAsTable("workspace.default.processed_trip_actual_table")
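If you go this route, one option is to schedule that notebook as a downstream task of the DLT pipeline within a Databricks Job, so the plain table is rewritten after every pipeline refresh.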

 


Hope that helps, happy to clarify further.

Best, Ilir
