Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Incrementalization issue in Materialized views

Dharinip
New Contributor III

I am trying to implement incremental updates to Materialized Views. The source is a streaming table. Could you tell me how to resolve the following issue?

{
  "planning_information": {
    "technique_information": [
      {
        "maintenance_type": "MAINTENANCE_TYPE_ROW_BASED",
        "incrementalization_issues": [
          {
            "issue_type": "PLAN_NOT_INCREMENTALIZABLE",
            "prevent_incrementalization": true,
            "operator_name": "DataSourceV2Relation",
            "plan_not_incrementalizable_sub_type": "OPERATOR_NOT_SUPPORTED"
          }
        ]
      },
      {
        "maintenance_type": "MAINTENANCE_TYPE_COMPLETE_RECOMPUTE",
        "is_chosen": true,
        "is_applicable": true,
        "cost": 1.222099115578718e22
      },
      {
        "incrementalization_issues": [
          {
            "issue_type": "PLAN_NOT_DETERMINISTIC",
            "prevent_incrementalization": true,
            "operator_name": "DataSourceV2Relation"
          },
          {
            "issue_type": "INPUT_NOT_IN_DELTA",
            "prevent_incrementalization": true
          }
        ]
      }
    ]
  }
}

3 REPLIES

Brahmareddy
Honored Contributor II

Hi Dharinip,

How are you doing today? As per my understanding, your Materialized View is falling back to a full recompute because the source data or query isn't eligible for incremental updates. Based on the message, there are a few blockers:

1. Your source is not in Delta format ("INPUT_NOT_IN_DELTA"), which is required for incremental refresh.

2. The query plan includes an unsupported operator (DataSourceV2Relation).

3. Parts of your query may be non-deterministic, meaning the system can't safely track changes over time.

To fix this, make sure your source table is a Delta table registered in Unity Catalog, and review your query to remove any complex logic or unsupported sources such as direct file reads (CSV, Parquet, etc.). Keeping the logic simple and Delta-based lets Databricks process only new or changed data. Let me know if you'd like help reviewing your query!
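To illustrate, here is a minimal sketch of the kind of query that stays eligible for incremental refresh, assuming a Delta source table registered in Unity Catalog (the table and column names below are hypothetical, not from your post):

import dlt
from pyspark.sql import functions as F

# Hypothetical names throughout. The query is a plain deterministic
# read / filter / rename over a Delta source, which gives the planner
# a chance to choose row-based incremental maintenance instead of
# MAINTENANCE_TYPE_COMPLETE_RECOMPUTE.
@dlt.table(name="my_materialized_view",
           comment="Deterministic view over a Delta source")
def my_materialized_view():
    return (
        spark.read.table("main.silver.events")          # Delta table in Unity Catalog
        .filter(F.col("__END_AT").isNull())              # active records only
        .withColumnRenamed("evt_ts", "event_timestamp")  # simple rename, still deterministic
    )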

Regards,

Brahma

Dharinip
New Contributor III

My source table is a streaming table and I did not perform complex transformations.

1. Reading the source table 

2. Filtering the data (only latest records)

3. Performing column name changes.

My target table is a Materialized View:

import dlt

# Registering the view with DLT so the table function below
# can read it by name.
@dlt.view(name="my_incremental_view")
def my_incremental_view():
    # STEP 1 - Read only the active records
    src_silver_df = spark.read.table(f"{catalog}.{src_schema}.{src_table}")
    src_silver_df = src_silver_df.filter(src_silver_df.__END_AT.isNull())

    # STEP 2 - Keep only the latest records out of the active records
    df = spark.sql("SELECT StartDate FROM c_realestate_dev.gold.goldtables WHERE TableName = 'modeling_incre'")
    last_timestamp = df.first()['StartDate']
    latest_records_source_df = src_silver_df.filter(src_silver_df.__START_AT >= last_timestamp)

    # STEP 3 - Transform only the latest records
    transformed_df = table_ColumnName_Initial_transformation(latest_records_source_df)

    return transformed_df


@dlt.table(
    name=final_table,
    schema=schema_enrichment,
    comment="This table contains information about all the Enrichment Location summary Information for each Client",
    spark_conf={"pipelines.incompatibleViewCheck.enabled": "false"},
    table_properties={
        "quality": "gold",
        "pipelines.autoOptimize.managed": "true"
    })
def load_silver_data():
    base_table = spark.read.table("my_incremental_view")
    return base_table

How to perform the incremental load?

Brahmareddy
Honored Contributor II

Hi Dharinip,

From what you’ve described, your logic looks good, but the issue comes from how Materialized Views work in Databricks. They only support incremental updates when the query is very simple and fully predictable (or "deterministic"). Since you're using dynamic filtering with last_timestamp (which changes every run) and reading from a streaming table, Databricks sees the plan as too complex or not supported for incremental refresh.

A better option for your use case would be to use Delta Live Tables (DLT) with streaming. DLT is built to handle incremental data out of the box—you just read your Silver table as a stream, filter the active records, apply your transformations, and DLT takes care of the rest. It’s more flexible than Materialized Views for this kind of logic. 
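For example, here is a rough sketch of that streaming approach, carrying over the hypothetical names from your snippet (catalog/src_schema/src_table and table_ColumnName_Initial_transformation come from your post, not from any Databricks API):

import dlt
from pyspark.sql import functions as F

# Sketch only. spark.readStream.table tracks which records have already
# been processed, so the manual last_timestamp bookkeeping table from
# the original code is no longer needed.
@dlt.table(
    name="enrichment_location_summary",   # hypothetical target name
    table_properties={"quality": "gold"})
def enrichment_location_summary():
    return (
        spark.readStream.table(f"{catalog}.{src_schema}.{src_table}")  # silver streaming source
        .filter(F.col("__END_AT").isNull())                            # active records only
        .transform(table_ColumnName_Initial_transformation)            # same rename logic as before
    )

One note: if the silver table receives updates or deletes (as SCD2-style tables with __END_AT usually do), the stream may also need Delta's skipChangeCommits read option to avoid failing on changed data.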

Regards,

Brahma
