Friday
We are running a Deep Clone script to copy catalogs between environments; the script runs as a job (run by SP) on DBR 16.4.12.
Some tables are deep cloned, while others are dropped and recreated to load partial data. The dropped tables are recreated from the output of SHOW CREATE TABLE. The problem is that, for certain tables with enableRowTracking, that output includes both 'delta.rowTracking.materializedRowCommitVersionColumnName' and 'delta.rowTracking.materializedRowIdColumnName', which are table properties that are not allowed at creation time. This results in:
[DELTA_UNKNOWN_CONFIGURATION] Unknown configuration was specified: delta.rowTracking.materializedRowCommitVersionColumnName
Most of our tables have enableRowTracking activated, and it fails only for some of them. Here is the CREATE statement of a failing table:
CREATE TABLE Catalog.Schema.Table (
XXX STRING,
XXX STRING,
XXX TIMESTAMP,
XXX STRING,
XXX STRING,
XXX STRING,
XXX INT,
XXX DATE,
XXX STRING,
XXX STRING,
XXX STRING,
XXX STRING,
XXX TIMESTAMP)
USING delta
TBLPROPERTIES (
'delta.columnMapping.mode' = 'name',
'delta.enableChangeDataFeed' = 'true',
'delta.enableDeletionVectors' = 'true',
'delta.enableRowTracking' = 'true',
'delta.feature.appendOnly' = 'supported',
'delta.feature.changeDataFeed' = 'supported',
'delta.feature.columnMapping' = 'supported',
'delta.feature.deletionVectors' = 'supported',
'delta.feature.domainMetadata' = 'supported',
'delta.feature.invariants' = 'supported',
'delta.feature.rowTracking' = 'supported',
'delta.minReaderVersion' = '3',
'delta.minWriterVersion' = '7',
'delta.rowTracking.materializedRowCommitVersionColumnName' = '_row-commit-version-col-UID',
'delta.rowTracking.materializedRowIdColumnName' = '_row-id-col-UID')
We have tested with DBR 17.3 and it works, since it removes those properties; however, we are not yet ready to migrate to Scala 2.13, so we want to understand whether some interaction between Delta table features triggers this error and how to avoid it.
53m ago
Happy Monday @DarioB, I did some digging and would like to share some hints and tips.
Thanks for the detailed context—this is a known rough edge in DBR 16.x when recreating tables that have row tracking materialized.
The two keys you see in SHOW CREATE TABLE—delta.rowTracking.materializedRowCommitVersionColumnName and delta.rowTracking.materializedRowIdColumnName—are internal, auto-generated names of the materialized metadata columns used by the row tracking feature. They are not user-settable and are rejected during CREATE/REPLACE, hence the DELTA_UNKNOWN_CONFIGURATION error.
In DBR 16.x, SHOW CREATE TABLE sometimes includes these internal keys in the TBLPROPERTIES output, which tempts scripts to replay them into a CREATE TABLE. That is a bug that the Delta team tracked and fixed by filtering these keys from SHOW CREATE TABLE output in newer runtimes.
You only see the failure on a subset of tables because SHOW CREATE TABLE only emits these row-tracking materialization keys for tables where the row-tracking metadata columns were actually materialized (for example, after enabling row tracking and certain operations); tables that have row tracking enabled but no materialized columns won’t show them, so their recreated DDL doesn’t trip the error.
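To confirm which tables are affected before changing the job, a quick triage along these lines may help. This is a minimal sketch: the function name `tables_with_internal_keys` and the sample table names are hypothetical, and it assumes you have already collected each table's SHOW CREATE TABLE text into a dictionary.

```python
# The two internal row-tracking keys quoted in the question.
INTERNAL_RT_KEYS = (
    "delta.rowTracking.materializedRowCommitVersionColumnName",
    "delta.rowTracking.materializedRowIdColumnName",
)


def tables_with_internal_keys(ddl_by_table: dict[str, str]) -> list[str]:
    """Return the names of tables whose DDL text mentions an internal key."""
    return sorted(
        name
        for name, ddl in ddl_by_table.items()
        if any(key in ddl for key in INTERNAL_RT_KEYS)
    )


# Hypothetical input: in the job, each value would be the statement text
# returned by SHOW CREATE TABLE for that table.
ddls = {
    "cat.sch.ok_table": (
        "CREATE TABLE cat.sch.ok_table (id INT) USING delta "
        "TBLPROPERTIES ('delta.enableRowTracking' = 'true')"
    ),
    "cat.sch.bad_table": (
        "CREATE TABLE cat.sch.bad_table (id INT) USING delta "
        "TBLPROPERTIES ('delta.enableRowTracking' = 'true', "
        "'delta.rowTracking.materializedRowIdColumnName' = '_row-id-col-UID')"
    ),
}

print(tables_with_internal_keys(ddls))  # ['cat.sch.bad_table']
```

Only the tables this returns should need their DDL cleaned before replay; the rest can be recreated as they are today.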
Pick one of the following approaches:
Sanitize the SHOW CREATE TABLE output before replay:
For external tables pointing to an existing Delta location, create the table without specifying any properties. Delta will honor the properties from the existing _delta_log at that location, avoiding the “properties mismatch” dance entirely.
If you hit “[DELTA_CREATE_TABLE_WITH_DIFFERENT_PROPERTY] … do not match existing properties” when recreating over an existing location:
Notes and cautions:
Here’s a small helper to remove the problematic keys from a SHOW CREATE TABLE statement before executing it:
import re

INTERNAL_RT_KEYS = [
    "delta.rowTracking.materializedRowCommitVersionColumnName",
    "delta.rowTracking.materializedRowIdColumnName",
]

def sanitize_show_create(ddl: str) -> str:
    # Remove each internal row-tracking property wherever it appears, whether
    # on its own line or comma-separated on a shared line. Only the
    # 'key' = 'value' pair and the comma right after it are consumed, so a
    # closing parenthesis on the same line is preserved.
    ddl_clean = ddl
    for k in INTERNAL_RT_KEYS:
        ddl_clean = re.sub(
            rf"(?i)\s*'{re.escape(k)}'\s*=\s*'[^']*'\s*,?",
            "",
            ddl_clean,
        )
    # Fix a trailing comma left before the closing parenthesis
    # (the parenthesis must be escaped in the pattern)
    ddl_clean = re.sub(r",\s*\)", ")", ddl_clean)
    return ddl_clean
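A variation on the helper above, in case it is useful: instead of deleting text in place, parse the TBLPROPERTIES clause into key/value pairs, drop the internal ones, and rebuild the clause. This is only a sketch under stated assumptions: the names `INTERNAL_PREFIX` and `drop_internal_properties` are hypothetical, it assumes every internal key shares the `delta.rowTracking.materialized` prefix (as the two keys in the question do), and it assumes property values contain neither `)` nor escaped quotes.

```python
import re

# Assumption: all internal row-tracking keys share this prefix.
INTERNAL_PREFIX = "delta.rowTracking.materialized"

# Matches the TBLPROPERTIES (...) clause up to the first ')'.
TBLPROPS_RE = re.compile(r"TBLPROPERTIES\s*\((.*?)\)", re.IGNORECASE | re.DOTALL)
# Matches one 'key' = 'value' pair inside the clause.
PAIR_RE = re.compile(r"'([^']+)'\s*=\s*'([^']*)'")


def drop_internal_properties(ddl: str) -> str:
    """Rebuild the TBLPROPERTIES clause without internal row-tracking keys."""
    match = TBLPROPS_RE.search(ddl)
    if match is None:
        return ddl  # no TBLPROPERTIES clause, nothing to sanitize
    kept = [
        (key, value)
        for key, value in PAIR_RE.findall(match.group(1))
        if not key.startswith(INTERNAL_PREFIX)
    ]
    if kept:
        body = ",\n".join(f"  '{key}' = '{value}'" for key, value in kept)
        clause = f"TBLPROPERTIES (\n{body})"
    else:
        clause = ""  # drop the clause rather than leave it empty
    return (ddl[: match.start()] + clause + ddl[match.end():]).rstrip()


# Shortened, hypothetical version of the failing DDL from the question.
sample = """CREATE TABLE Catalog.Schema.Table (
  id STRING)
USING delta
TBLPROPERTIES (
  'delta.enableRowTracking' = 'true',
  'delta.rowTracking.materializedRowCommitVersionColumnName' = '_row-commit-version-col-UID',
  'delta.rowTracking.materializedRowIdColumnName' = '_row-id-col-UID')"""

print(drop_internal_properties(sample))
```

Rebuilding the clause keeps the closing parenthesis and commas consistent by construction, and it also covers the edge case where the internal keys are the only properties present.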
Hope this helps, Louis.