
LDP Materialized View Incremental Refreshes - Changeset Size Thresholds

mdee
Databricks Partner

Is there any documentation on the changeset size thresholds for materialized view incremental refreshes? Are these configurable at all? Are they constant, or do the thresholds change depending on the number of rows/size of the materialized view?

We frequently see MVs perform full refreshes with the cost_model_rejection_subtype being "CHANGESET_SIZE_THRESHOLD_EXCEEDED". These MVs are already running every ~15 minutes, so they can't be run more frequently without moving them to a streaming/continuous model, and the changeset sizes are modest relative to total data size. Curious if there would be a way to increase that threshold and not have to force the MV into an incremental refresh using a refresh policy, or just generally if there is more information available on what the threshold is and how it's determined.

2 REPLIES

pradeep_singh
Contributor III

There isn’t a user-facing setting to tune the internal changeset-size threshold. If you want the system to strongly prefer incremental refresh whenever it’s possible, you can:

For SDP pipelines, use the `pipelines.enzyme.preferIncrementalFlows` setting to bias the cost model toward incremental for specific materialized views.

In SQL, define the MV with `REFRESH POLICY INCREMENTAL` (or `REFRESH POLICY INCREMENTAL STRICT` if you’d rather the refresh fail than fall back to a full recompute). These don’t expose the threshold itself, but they give you control over whether the cost model is allowed to choose a full refresh.
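
As a rough sketch of the SQL option, assuming a hypothetical source table `main.sales.orders` and a hypothetical view name `main.sales.daily_order_totals` (neither comes from this thread), the definition could look something like the following. Where the `REFRESH POLICY` clause sits relative to other view clauses is also an assumption here, so check it against the `CREATE MATERIALIZED VIEW` reference for your workspace before relying on it.

```sql
-- Hypothetical catalog, schema, table, and column names, for illustration only.
-- REFRESH POLICY INCREMENTAL asks the cost model to prefer incremental refresh;
-- swap in REFRESH POLICY INCREMENTAL STRICT if the refresh should fail
-- rather than fall back to a full recompute.
CREATE OR REPLACE MATERIALIZED VIEW main.sales.daily_order_totals
  REFRESH POLICY INCREMENTAL  -- clause placement is an assumption
  AS SELECT
    order_date,
    SUM(amount) AS total_amount
  FROM main.sales.orders
  GROUP BY order_date;
```

Neither policy changes the changeset-size threshold itself; they only change what happens when the cost model would otherwise pick a full refresh.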

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

emma_s
Databricks Employee

Hi,

On top of Pradeep's reply, which I'd recommend trying, I'd also suggest you raise a support ticket for this. Support may be able to tweak the settings in the backend (not guaranteed), so it's worth asking.

Thanks,

Emma