CDF metadata columns are lost after importing dlt

Ru · 05-09-2025

Hi Databricks Community,

I read the change data feed from a CDF-enabled table. Initially, the correct table schema was returned, including the metadata columns (_change_type, _commit_version, and _commit_timestamp). However, after importing the dlt library and reading the changes again in the same notebook, the metadata columns were missing. Could you help me resolve this issue? Thank you in advance!

# Databricks notebook source
# Column names of the change feed BEFORE importing dlt
changeset_cols_before = (
    spark.read
    .option("readChangeFeed", "true")
    .option("startingVersion", 0)
    .table("<path_of_CDF_enabled_table>")
    .columns
)

# COMMAND ----------

import dlt

# COMMAND ----------

# Same read, repeated AFTER importing dlt
changeset_cols_after = (
    spark.read
    .option("readChangeFeed", "true")
    .option("startingVersion", 0)
    .table("<path_of_CDF_enabled_table>")
    .columns
)

# COMMAND ----------

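# Columns present before `import dlt` but absent afterwards
# (loop variable named `c` to avoid shadowing pyspark.sql.functions.col)
missing_cols = [c for c in changeset_cols_before if c not in changeset_cols_after]
print(missing_cols)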

# result: ['_change_type', '_commit_version', '_commit_timestamp']
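
For anyone trying to reproduce this, a small check like the one below may make the comparison easier to repeat after each cell. It is only a sketch: `missing_cdf_columns` and `CDF_METADATA_COLS` are hypothetical names introduced here (not part of Databricks or dlt), and the two sample lists are illustrative stand-ins for `DataFrame.columns`, not real output.

```python
# Hypothetical helper for illustration; not part of Databricks or dlt.
# The three names are the CDF metadata columns mentioned in the post.
CDF_METADATA_COLS = ("_change_type", "_commit_version", "_commit_timestamp")


def missing_cdf_columns(columns):
    """Return the CDF metadata columns that are absent from `columns`."""
    present = set(columns)
    return [c for c in CDF_METADATA_COLS if c not in present]


# Illustrative stand-ins for DataFrame.columns (assumed data columns: id, value).
cols_with_cdf = ["id", "value", "_change_type", "_commit_version", "_commit_timestamp"]
cols_without_cdf = ["id", "value"]

print(missing_cdf_columns(cols_with_cdf))
print(missing_cdf_columns(cols_without_cdf))
```

In the notebook above, this could be called as `missing_cdf_columns(changeset_cols_before)` and `missing_cdf_columns(changeset_cols_after)` to confirm exactly which read loses the metadata columns. It only reports the symptom and does not address the cause.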