Issues recreating Tables with enableRowTracking and DBR16.4 and below

DarioB
New Contributor III

We are running a Deep Clone script to copy catalogs between environments; the script runs as a job (executed by a service principal) on DBR 16.4.12.

Some tables are deep cloned, while others are dropped and recreated so that partial data can be loaded. The dropped tables are recreated from the output of "SHOW CREATE TABLE". The issue appears with certain tables that have enableRowTracking: the generated DDL includes both 'delta.rowTracking.materializedRowCommitVersionColumnName' and 'delta.rowTracking.materializedRowIdColumnName', which are table properties that cannot be set at creation time, so the CREATE fails with:

[DELTA_UNKNOWN_CONFIGURATION] Unknown configuration was specified: delta.rowTracking.materializedRowCommitVersionColumnName

Most of our tables have enableRowTracking enabled, but it fails only for some of them. Here is the CREATE statement of a failing table:

CREATE TABLE Catalog.Schema.Table (
  XXX STRING,
  XXX STRING,
  XXX TIMESTAMP,
  XXX STRING,
  XXX STRING,
  XXX STRING,
  XXX INT,
  XXX DATE,
  XXX STRING,
  XXX STRING,
  XXX STRING,
  XXX STRING,
  XXX TIMESTAMP)
USING delta
TBLPROPERTIES (
  'delta.columnMapping.mode' = 'name',
  'delta.enableChangeDataFeed' = 'true',
  'delta.enableDeletionVectors' = 'true',
  'delta.enableRowTracking' = 'true',
  'delta.feature.appendOnly' = 'supported',
  'delta.feature.changeDataFeed' = 'supported',
  'delta.feature.columnMapping' = 'supported',
  'delta.feature.deletionVectors' = 'supported',
  'delta.feature.domainMetadata' = 'supported',
  'delta.feature.invariants' = 'supported',
  'delta.feature.rowTracking' = 'supported',
  'delta.minReaderVersion' = '3',
  'delta.minWriterVersion' = '7',
  'delta.rowTracking.materializedRowCommitVersionColumnName' = '_row-commit-version-col-UID',
  'delta.rowTracking.materializedRowIdColumnName' = '_row-id-col-UID')
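For context, the drop-and-recreate step in our script is roughly the following (a minimal sketch; the real job selects the affected tables and handles the catalog rename between environments):

# Simplified sketch of the recreate step; "Catalog.Schema.Table" is a placeholder
for fq_name in ["Catalog.Schema.Table"]:
    # SHOW CREATE TABLE returns a single row containing the full DDL statement
    ddl = spark.sql(f"SHOW CREATE TABLE {fq_name}").first()[0]
    spark.sql(f"DROP TABLE IF EXISTS {fq_name}")
    spark.sql(ddl)  # raises DELTA_UNKNOWN_CONFIGURATION for the affected tables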

We have tested with DBR 17.3 and it works, because that runtime strips those properties from the output; however, we are not yet ready to migrate to Scala 2.13, so we want to understand which interaction between Delta table features causes this error and how to avoid it.

1 ACCEPTED SOLUTION


Louis_Frolio
Databricks Employee

Happy Monday @DarioB, I did some digging and would like to share some helpful hints/tips.

Thanks for the detailed context—this is a known rough edge in DBR 16.x when recreating tables that have row tracking materialized.

What’s happening and why it only fails for some tables

  • The two keys you see in SHOW CREATE TABLE—delta.rowTracking.materializedRowCommitVersionColumnName and delta.rowTracking.materializedRowIdColumnName—are internal, auto-generated names of the materialized metadata columns used by the row tracking feature. They are not user-settable and are rejected during CREATE/REPLACE, hence the DELTA_UNKNOWN_CONFIGURATION error.

  • In DBR 16.x, SHOW CREATE TABLE sometimes includes these internal keys in the TBLPROPERTIES output, which tempts scripts to replay them into a CREATE TABLE. That is a bug that the Delta team tracked and fixed by filtering these keys from SHOW CREATE TABLE output in newer runtimes.

  • You only see the failure on a subset of tables because SHOW CREATE TABLE only emits these row-tracking materialization keys for tables where the row-tracking metadata columns were actually materialized (for example, after enabling row tracking and certain operations); tables that have row tracking enabled but no materialized columns won’t show them, so their recreated DDL doesn’t trip the error.
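A quick way to see which tables fall into that bucket is to look for the internal keys in SHOW TBLPROPERTIES (a minimal sketch; catalog.schema is a placeholder):

# Sketch: list the tables whose properties already include the internal row-tracking keys
internal_keys = {
    "delta.rowTracking.materializedRowIdColumnName",
    "delta.rowTracking.materializedRowCommitVersionColumnName",
}

for t in spark.sql("SHOW TABLES IN catalog.schema").collect():
    fq_name = f"catalog.schema.{t['tableName']}"
    props = {r["key"] for r in spark.sql(f"SHOW TBLPROPERTIES {fq_name}").collect()}
    if props & internal_keys:
        print(f"{fq_name}: SHOW CREATE TABLE will emit the internal keys")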

Why DBR 17.3 “just works”

  • Newer DBR versions filter these internal row-tracking properties out of SHOW CREATE TABLE, so replayed DDL won’t include them and CREATE succeeds (as you observed). The intent of the fix was exactly to prevent these properties from appearing in SHOW CREATE output.

Workarounds on DBR 16.4.12 (Scala 2.12), without upgrading

Pick one of the following approaches:

  • Sanitize the SHOW CREATE TABLE output before replay:

    • Strip the two internal keys from the TBLPROPERTIES block:
      • delta.rowTracking.materializedRowCommitVersionColumnName
      • delta.rowTracking.materializedRowIdColumnName
    • Keep delta.enableRowTracking='true' and the other supported properties; the internal names will be auto-generated by Delta when needed, so you must not set them explicitly.
  • For external tables pointing to an existing Delta location, create the table without specifying any properties. Delta will honor the properties from the existing _delta_log at that location, avoiding the “properties mismatch” dance entirely (see the sketch after this list).

  • If you hit “[DELTA_CREATE_TABLE_WITH_DIFFERENT_PROPERTY] … do not match existing properties” when recreating over an existing location:

    • Do not add the two internal row-tracking materialization keys (they’ll be rejected).
    • Either create as EXTERNAL without properties (and let the existing log win), or avoid specifying properties that differ from the location’s metadata, since DBR 16.x strictly compares them.
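A minimal sketch of the external-table approach (assuming the Delta log already exists at the target location; the path and names are placeholders):

# Sketch: register the table over an existing Delta location without re-specifying
# TBLPROPERTIES; the properties recorded in the _delta_log take precedence
location = "abfss://container@account.dfs.core.windows.net/path/to/table"  # placeholder

spark.sql("DROP TABLE IF EXISTS Catalog.Schema.Table")
spark.sql(f"CREATE TABLE Catalog.Schema.Table USING DELTA LOCATION '{location}'")

Because the statement sets no properties, there is nothing for the strict property comparison in DBR 16.x to reject.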

Notes and cautions:

  • The DELTA_UNKNOWN_CONFIGURATION error is a Class F0000 configuration-file error in the SQLSTATE catalog (that’s why it looks like a configuration rejection rather than a parsing error).
  • Don’t use spark.databricks.delta.allowArbitraryProperties.enabled=true to bypass this; that knob is discussed for a different config class and isn’t appropriate for internal row-tracking keys. Better to strip the keys than force acceptance of unsupported properties.
  • The internal keys are set automatically by Delta when row tracking is enabled and certain operations (such as Liquid clustering) require them; you shouldn’t manage them yourself.

Drop-in sanitizer you can put in your job

Here’s a small helper to remove the problematic keys from a SHOW CREATE TABLE statement before executing it:

import re

INTERNAL_RT_KEYS = [
    "delta.rowTracking.materializedRowCommitVersionColumnName",
    "delta.rowTracking.materializedRowIdColumnName",
]

def sanitize_show_create(ddl: str) -> str:
    # Remove lines inside TBLPROPERTIES that set either internal RT key
    lines = ddl.splitlines()
    cleaned = []
    for ln in lines:
        if "TBLPROPERTIES" in ln:
            cleaned.append(ln)
            continue
        # Drop any property assignment containing those keys
        if any(k in ln for k in INTERNAL_RT_KEYS):
            continue
        cleaned.append(ln)
    # Also handle cases where properties appear comma-separated on one line
    ddl_clean = "\n".join(cleaned)
    for k in INTERNAL_RT_KEYS:
        ddl_clean = re.sub(
            rf"(?i)\s*'{re.escape(k)}'\s*=\s*'[^']*'\s*,?\s*", 
            "", 
            ddl_clean
        )
    # Fix potential trailing commas before closing parenthesis
    ddl_clean = re.sub(r",\s*)", ")", ddl_clean)
    return ddl_clean
 
 

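Minimal usage sketch (table iteration and error handling omitted; the table name is a placeholder):

fq_name = "Catalog.Schema.Table"
ddl = spark.sql(f"SHOW CREATE TABLE {fq_name}").first()[0]

spark.sql(f"DROP TABLE IF EXISTS {fq_name}")
spark.sql(sanitize_show_create(ddl))  # replay the DDL without the internal keys
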
If you want to be extra-safe

  • Beyond the two row-tracking materialization keys, ensure your script does not attempt to set other delta-internal metadata keys that are not meant for CREATE (SHOW CREATE in 16.x generally doesn’t emit them, but filtering by a denylist keeps you robust). The confirmed denylist for this issue remains the two row-tracking keys above.

Summary

  • Root cause: SHOW CREATE TABLE on DBR 16.4 sometimes emits internal row-tracking materialization properties that are rejected at CREATE; this was fixed later by filtering them out.
  • Best workaround on 16.4: strip those two properties from the DDL before replay, or create external tables without properties over existing locations.

Hope this helps, Louis.
