Databricks Community

Dharinip · ‎03-09-2025

Is incremental load possible on Materialized views. I would like to get some tutorials or videos on how to perform incremental refresh on MVs in gold layers. Also is it mandatory to have PKs for performing incremental loads in MVs.

DHARINI PARGUNAN

koji_kawamura · ‎05-02-2025

Hi @Dharinip , excuse me for the super late reply! I had been very busy last month...

Thanks for providing the additional details. It seems you would like to have both SCD Type 1 and 2.

Are you using a DLT pipeline to implement this? I assume so, because my previous demo example didn't resonate with you, which used a standalone materialized view approach.

So, I just created a simple DLT pipeline based on your example. Having a bronze table (client_raw), apply changes into a SCD type 2 streaming table (client_history), then create a MV (client_current). The client_current MV should be updated incrementally using a Serverless used. Related doc can be found here.

I've uploaded the sample DLT source code here. I hope it suites your need. https://gist.github.com/koji-kawamura-db/999b7fb9e114a236c15442776c5c0a0f

BTW, creating both type 1 and 2 as two Streaming Tables might be an alternative solution. So, the sample pipeline includes such a route just in case.

The DLT pipeline looks like this:

View solution in original post

koji_kawamura · ‎03-09-2025

Hi @Dharinip

Yes, Databricks materialized views support incremental updates. You can refer this doc page for details. It uses a DLT pipeline internally and a serverless SQL warehouse is required. It utilizes row tracking and PKs are not mandatory.

I created a very simple example notebook to demonstrate updating a materialized view incrementally.

Dharinip · ‎03-18-2025

My question is: can materialized views be updated incrementally. For example:

In my case, we store the data in Iron layer and it gets flattened in bronze and silver layer as separate tables. The required transformations happen from silver to gold layer.

All the Iron, bronze and silver layers are streaming tables as SCD type 2 and gets incrementally loaded.

The Gold layer are materialized views. Every time the pipeline runs, the materialized views are fully refreshed. But we wanted to have those incrementally loaded.
Example:

SILVER

Table: Client

ClientID	ClientName	Address	Status	CreatedDate	StartDate	Enddate	RunDate
1	ABC	Florida	New	03.15.2025	03.15.2025	03.16.2025	03.15.2025
1	ABC	Miami	New	03.16.2025	03.16.2025	03.17.2025	03.16.2025
1	ABC	Miami	Active	03.17.2025	03.17.2025	NULL	03.17.2025
2	QWE	Kansas	Active	03.15.2025	03.15.2025	03.18.2025	03.15.2025
2	QWE	Kansas	Inactive	03.18.2025	03.18.2025	NULL	03.18.2025
3	GHJ	Buffalo	Active	03.18.2025	03.18.2025	NULL	03.18.2025


GOLD

Materialized View: Client

ID	NAME	Address	Status	CreatedDate	StartDate	Enddate	RunDate
1	ABC	Miami	Active	03.17.2025	03.17.2025	NULL	03.17.2025	These should also be incrementally loaded.
2	QWE	Kansas	Inactive	03.18.2025	03.18.2025	NULL	03.18.2025
3	GHJ	Buffalo	Active	03.18.2025	03.18.2025	NULL	03.18.2025