cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Incremental issue in Materialized views - QUERY FINGERPRINT CHANGED

Dharinip
Contributor
I have create a Materialized view in the gold layer via DLT pipeline. But the plan is not deterministic due to the reason below. 
 
Not able to understand what does this error means and how to resolve this. Can you please help ?
 
planning_information": {
"technique_information": [
{
"incrementalization_issues": [
{
"issue_type": "QUERY_FINGERPRINT_CHANGED",
"prevent_incrementalization": true,
"fingerprint_diff_before": "\\n          \\\"id\\\" : 12\\n        },\\n        \\\"qualifier\\\" : [ ]\\n      }, {\\n        \\\"class\\\" : \\\"^^^GreaterThan\\\",\\n        \\\"num-children\\\" : 2,\\n        \\\"left\\\" : 0,\\n        \\\"right\\\" : 1\\n      }, {\\n        \\\"class\\\" : \\\"Literal\\\",\\n        \\\"num-children\\\" : 0,\\n        \\\"value\\\" : \\\"2025-07-08 04:32:01.821544\\\",\\n        \\\"dataType\\\" : \\\"timestamp\\\"\\n      }, {\\n        \\\"class\\\" : \\\"Cast\\\",\\n        \\\"num-children\\\" : 1,\\n        \\\"child\\\" : 0,\\n        \\\"dataType\\\" : \\\"timestamp\\\",\\n        \\\"timeZoneId\\\" : \\\"Etc/UTC\\\",\\n        \\\"evalMode\\\" : null\\n      }, {\\n        \\\"class\\\" : \\\"AttributeReference\\\",\\n        \\\"num-children\\\" : 0,\\n        \\\"name\\\" : \\\"none\\\",\\n        \\\"dataType\\\" : \\\"string\\\",\\n        \\\"nullable\\\" : true,\\n        \\\"metadata\\\" : { },\\n        \\\"exprId\\\" : {\\n          \\\"product-class\\\" : \\\"ExprId\\\",\\n          \\\"id\\\" : 16\\n        },\\n        \\\"qualifier\\\" : [ ]\\n      }, {\\n        \\\"class\\\" : \\\"GreaterThan\\\",\\n        \\\"num-children\\\" : 2,\\n        \\\"left\\\" : 0,\\n        \\\"right\\\" : 1\\n      }, {\\n        \\\"class\\\" : \\\"Cast\\\",\\n        \\\"num-childr",
"fingerprint_diff_current": "\\n          \\\"id\\\" : 12\\n        },\\n        \\\"qualifier\\\" : [ ]\\n      }, {\\n        \\\"class\\\" : \\\"^^^LessThan\\\",\\n        \\\"num-children\\\" : 2,\\n        \\\"left\\\" : 0,\\n        \\\"right\\\" : 1\\n      }, {\\n        \\\"class\\\" : \\\"Literal\\\",\\n        \\\"num-children\\\" : 0,\\n        \\\"value\\\" : \\\"2025-07-08 05:18:31.155472\\\",\\n        \\\"dataType\\\" : \\\"timestamp\\\"\\n      }, {\\n        \\\"class\\\" : \\\"Cast\\\",\\n        \\\"num-children\\\" : 1,\\n        \\\"child\\\" : 0,\\n        \\\"dataType\\\" : \\\"timestamp\\\",\\n        \\\"timeZoneId\\\" : \\\"Etc/UTC\\\",\\n        \\\"evalMode\\\" : null\\n      }, {\\n        \\\"class\\\" : \\\"AttributeReference\\\",\\n        \\\"num-children\\\" : 0,\\n        \\\"name\\\" : \\\"none\\\",\\n        \\\"dataType\\\" : \\\"string\\\",\\n        \\\"nullable\\\" : true,\\n        \\\"metadata\\\" : { },\\n        \\\"exprId\\\" : {\\n          \\\"product-class\\\" : \\\"ExprId\\\",\\n          \\\"id\\\" : 17\\n        },\\n        \\\"qualifier\\\" : [ ]\\n      }, {\\n        \\\"class\\\" : \\\"GreaterThan\\\",\\n        \\\"num-children\\\" : 2,\\n        \\\"left\\\" : 0,\\n        \\\"right\\\" : 1\\n      }, {\\n        \\\"class\\\" : \\\"Literal\\\",\\n        \\\"num-childr"
}
]
},
{
"maintenance_type": "MAINTENANCE_TYPE_COMPLETE_RECOMPUTE",
"is_chosen": true,
"is_applicable": true,
"cost": 543828.0
}
1 ACCEPTED SOLUTION

Accepted Solutions

mmayorga
Databricks Employee
Databricks Employee

hi @Dharinip 

Cleaning up the shared log shows "Before" and "After" fingerprints. Hopefully can give us more info in how to proceed

  • Before
[
  {
    "id": 12
  },
  {
    "qualifier": []
  },
  {
    "class": "GreaterThan",
    "num-children": 2,
    "left": 0,
    "right": 1
  },
  {
    "class": "Literal",
    "num-children": 0,
    "value": "2025-07-08 04:32:01.821544",
    "dataType": "timestamp"
  },
  {
    "class": "Cast",
    "num-children": 1,
    "child": 0,
    "dataType": "timestamp",
    "timeZoneId": "Etc/UTC",
    "evalMode": null
  },
  {
    "class": "AttributeReference",
    "num-children": 0,
    "name": "none",
    "dataType": "string",
    "nullable": true,
    "metadata": {},
    "exprId": {
      "product-class": "ExprId",
      "id": 16
    },
    "qualifier": []
  },
  {
    "class": "GreaterThan",
    "num-children": 2,
    "left": 0,
    "right": 1
  },
  {
    "class": "Cast",
    "num-children": ...
  }
]
  • After:
[
  {
    "id": 12
  },
  {
    "qualifier": []
  },
  {
    "class": "LessThan",
    "num-children": 2,
    "left": 0,
    "right": 1
  },
  {
    "class": "Literal",
    "num-children": 0,
    "value": "2025-07-08 05:18:31.155472",
    "dataType": "timestamp"
  },
  {
    "class": "Cast",
    "num-children": 1,
    "child": 0,
    "dataType": "timestamp",
    "timeZoneId": "Etc/UTC",
    "evalMode": null
  },
  {
    "class": "AttributeReference",
    "num-children": 0,
    "name": "none",
    "dataType": "string",
    "nullable": true,
    "metadata": {},
    "exprId": {
      "product-class": "ExprId",
      "id": 17
    },
    "qualifier": []
  },
  {
    "class": "GreaterThan",
    "num-children": 2,
    "left": 0,
    "right": 1
  },
  {
    "class": "Literal",
    "num-children": ...
  }
]

After running a diff: 

Field/Section Before (fingerprint_diff_before) After (fingerprint_diff_current) Difference Type
class (3rd object) "GreaterThan" "LessThan" Operator changed
value (4th object) "2025-07-08 04:32:01.821544" "2025-07-08 05:18:31.155472" Timestamp changed
exprId.id (6th obj) 16 17 Expression ID changed
8th object "Cast" "Literal" Class changed/truncated

Brainstorming, may I suggest:

  1. Isolate Problematic Aggregations and Transformations
    • Start simple: Begin with a minimal version of your materialized view (MV) that includes only a few basic aggregations or transformations.
    • Incrementally add complexity: Gradually introduce additional aggregations or transformations, testing after each change.
    • Identify the trigger: Continue this process until the error reappears. This helps pinpoint the specific operation or transformation causing the issue.
  2. Check for Non-Deterministic User-Defined Functions (UDFs)
    • Review all UDFs used: Examine any UDFs used in your MV for non-deterministic behavior (e.g., functions that use randomness, current timestamps, or external state).
    • Replace or refactor: If you find non-deterministic UDFs, consider replacing them with deterministic alternatives or refactoring your logic to avoid them, as non-determinism can disrupt incremental processing and caching.
  3. Validate Input Data Consistency
    • Inspect source data: Ensure that the data being aggregated or transformed is stable and consistent between runs.
    • Check for upstream changes: Verify that referenced tables, columns, or views have not changed unexpectedly, as such changes can alter the query plan and trigger errors.

Thank you, and I hope this helps you a bit to find the issue.

View solution in original post

1 REPLY 1

mmayorga
Databricks Employee
Databricks Employee

hi @Dharinip 

Cleaning up the shared log shows "Before" and "After" fingerprints. Hopefully can give us more info in how to proceed

  • Before
[
  {
    "id": 12
  },
  {
    "qualifier": []
  },
  {
    "class": "GreaterThan",
    "num-children": 2,
    "left": 0,
    "right": 1
  },
  {
    "class": "Literal",
    "num-children": 0,
    "value": "2025-07-08 04:32:01.821544",
    "dataType": "timestamp"
  },
  {
    "class": "Cast",
    "num-children": 1,
    "child": 0,
    "dataType": "timestamp",
    "timeZoneId": "Etc/UTC",
    "evalMode": null
  },
  {
    "class": "AttributeReference",
    "num-children": 0,
    "name": "none",
    "dataType": "string",
    "nullable": true,
    "metadata": {},
    "exprId": {
      "product-class": "ExprId",
      "id": 16
    },
    "qualifier": []
  },
  {
    "class": "GreaterThan",
    "num-children": 2,
    "left": 0,
    "right": 1
  },
  {
    "class": "Cast",
    "num-children": ...
  }
]
  • After:
[
  {
    "id": 12
  },
  {
    "qualifier": []
  },
  {
    "class": "LessThan",
    "num-children": 2,
    "left": 0,
    "right": 1
  },
  {
    "class": "Literal",
    "num-children": 0,
    "value": "2025-07-08 05:18:31.155472",
    "dataType": "timestamp"
  },
  {
    "class": "Cast",
    "num-children": 1,
    "child": 0,
    "dataType": "timestamp",
    "timeZoneId": "Etc/UTC",
    "evalMode": null
  },
  {
    "class": "AttributeReference",
    "num-children": 0,
    "name": "none",
    "dataType": "string",
    "nullable": true,
    "metadata": {},
    "exprId": {
      "product-class": "ExprId",
      "id": 17
    },
    "qualifier": []
  },
  {
    "class": "GreaterThan",
    "num-children": 2,
    "left": 0,
    "right": 1
  },
  {
    "class": "Literal",
    "num-children": ...
  }
]

After running a diff: 

Field/Section Before (fingerprint_diff_before) After (fingerprint_diff_current) Difference Type
class (3rd object) "GreaterThan" "LessThan" Operator changed
value (4th object) "2025-07-08 04:32:01.821544" "2025-07-08 05:18:31.155472" Timestamp changed
exprId.id (6th obj) 16 17 Expression ID changed
8th object "Cast" "Literal" Class changed/truncated

Brainstorming, may I suggest:

  1. Isolate Problematic Aggregations and Transformations
    • Start simple: Begin with a minimal version of your materialized view (MV) that includes only a few basic aggregations or transformations.
    • Incrementally add complexity: Gradually introduce additional aggregations or transformations, testing after each change.
    • Identify the trigger: Continue this process until the error reappears. This helps pinpoint the specific operation or transformation causing the issue.
  2. Check for Non-Deterministic User-Defined Functions (UDFs)
    • Review all UDFs used: Examine any UDFs used in your MV for non-deterministic behavior (e.g., functions that use randomness, current timestamps, or external state).
    • Replace or refactor: If you find non-deterministic UDFs, consider replacing them with deterministic alternatives or refactoring your logic to avoid them, as non-determinism can disrupt incremental processing and caching.
  3. Validate Input Data Consistency
    • Inspect source data: Ensure that the data being aggregated or transformed is stable and consistent between runs.
    • Check for upstream changes: Verify that referenced tables, columns, or views have not changed unexpectedly, as such changes can alter the query plan and trigger errors.

Thank you, and I hope this helps you a bit to find the issue.

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now