Lu_Wang_ENB_DBX
Databricks Employee

Here are the answers to your questions:

  1. Is Delta’s restriction a design decision?
    Yes. In Delta, mergeSchema is mainly for schema evolution by adding columns; type changes are still controlled by schema enforcement unless the change qualifies for type widening. If the mismatch does not meet type-widening conditions, Delta follows normal enforcement rules instead of silently changing the column type.
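As a rough mental model only (this is not Delta's actual engine code), the append-time decision can be sketched like this; the function name, return strings, and the widening-pair set are illustrative:

```python
# Illustrative sketch of Delta's append-time type decision, not real engine code.
def append_type_decision(source_type: str, target_type: str,
                         widening_enabled: bool) -> str:
    # A few widening pairs Delta supports automatically (target -> source).
    supported_widening = {("INT", "BIGINT"), ("SMALLINT", "INT"),
                          ("FLOAT", "DOUBLE")}
    if source_type == target_type:
        return "append as-is"
    if widening_enabled and (target_type, source_type) in supported_widening:
        return "widen target column"
    # Anything else falls back to schema enforcement: the write fails
    # rather than silently changing the column type.
    return "enforce schema (fail)"
```

Note that INT -> STRING lands in the enforcement branch even with widening enabled, which is the crux of the question.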

  2. Can append mode handle type changes without pre-casting or full overwrite?
    Yes, but only for supported widening changes such as INT -> BIGINT, and only when the target table has delta.enableTypeWidening = true and schema evolution is enabled on the write.
    For INT -> STRING, there is no supported automatic widening path; the docs explicitly list it as an unsupported data type change in Auto Loader, where such values are rescued instead of widened.
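A minimal append sketch under those conditions (assumes a Delta target on a runtime that supports type widening; the table and DataFrame names are placeholders):

```python
# Hedged sketch: append that lets a compatible change (e.g. INT -> BIGINT)
# widen the target column instead of failing. Names are illustrative.
WIDENING_WRITE_OPTIONS = {"mergeSchema": "true"}  # schema evolution on the write

def append_with_widening(df, target_table: str) -> None:
    # Precondition: the target table was created or altered with
    # TBLPROPERTIES ('delta.enableTypeWidening' = 'true').
    (df.write.format("delta")
        .mode("append")
        .options(**WIDENING_WRITE_OPTIONS)
        .saveAsTable(target_table))
```

Both pieces are required: the table property opts the table in, and mergeSchema opts the individual write in.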

  3. What Git repo code result shows the intended pattern?
    A GitHub code file in databrickslabs/lakebridge uses the exact Delta pattern of enabling type widening first, then altering column types:

sqls: list | None = [
  f"ALTER TABLE {table_identifier} SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true')",
  f"ALTER TABLE {table_identifier} ALTER COLUMN recon_metrics.row_comparison.missing_in_source TYPE BIGINT",
  f"ALTER TABLE {table_identifier} ALTER COLUMN recon_metrics.row_comparison.missing_in_target TYPE BIGINT",
]
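Adapting that pattern to an arbitrary table could look like the sketch below; `type_widening_ddl` is a hypothetical helper, and in a notebook each returned statement would be run with spark.sql:

```python
def type_widening_ddl(table: str, column: str, new_type: str) -> list[str]:
    """Return the two DDL statements: enable widening first, then widen the column."""
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true')",
        f"ALTER TABLE {table} ALTER COLUMN {column} TYPE {new_type}",
    ]

# e.g. in a notebook:
#   for stmt in type_widening_ddl("main.silver.orders", "qty", "BIGINT"):
#       spark.sql(stmt)
```

The ordering matters: the TBLPROPERTIES statement must succeed before the ALTER COLUMN, or the type change is rejected by enforcement.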
  4. What is the recommended production approach for dynamic schema drift?
    Use a Bronze/Silver pattern with Auto Loader. By default, Auto Loader is designed to avoid breaking on type mismatches: for text formats it infers columns as STRING, and with rescue modes it places unsupported type-change values into the rescued data column instead of failing the pipeline.
    If you want automatic widening for compatible changes, set the schema evolution mode to addNewColumnsWithTypeWidening and delta.enableTypeWidening=true on the target; unsupported changes like INT -> STRING should be rescued/quarantined and normalized downstream rather than forced into the Delta target schema during append.
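A Bronze-layer Auto Loader sketch with that mode; the paths are placeholders, and cloudFiles.schemaEvolutionMode is the option that takes addNewColumnsWithTypeWidening:

```python
# Hedged sketch of a Bronze ingest: new columns get added, compatible type
# changes get widened, and unsupported changes land in _rescued_data.
AUTOLOADER_OPTIONS = {
    "cloudFiles.format": "json",
    "cloudFiles.schemaLocation": "/mnt/bronze/_schemas/events",  # placeholder path
    "cloudFiles.schemaEvolutionMode": "addNewColumnsWithTypeWidening",
    "rescuedDataColumn": "_rescued_data",
}

def bronze_stream(spark, source_path: str):
    # Returns a streaming DataFrame; write it to the Bronze Delta table
    # (which should carry delta.enableTypeWidening = true).
    return (spark.readStream.format("cloudFiles")
            .options(**AUTOLOADER_OPTIONS)
            .load(source_path))
```

The Silver layer then reads Bronze, inspects _rescued_data, and applies explicit casts, so drift never breaks the ingest itself.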

  Summary

  • New columns → use mergeSchema.
  • Widenable type changes → enable type widening and keep append mode.
  • Non-widening changes like INT -> STRING → do not rely on Delta mergeSchema. Land the raw data, rescue the bad values, and reconcile/cast in a downstream layer, or explicitly alter/overwrite the table schema when you choose to accept the change.