Databricks Community

gerard_gv · 09-10-2025

I have spent several days trying to find the SQL equivalent of:

spark.readStream
        .option("readChangeFeed", "true")
        .table("table_name")

I suspect that it works like AUTO CDC FROM SNAPSHOT: CDF adds the column "_commit_version", a bigint field that might be used as a snapshot ID behind the scenes. That could explain why there is no direct SQL equivalent, since the documentation says "Use AUTO CDC FROM SNAPSHOT (Public Preview, and only available for Python) to process changes in database snapshots". Documentation: https://docs.databricks.com/aws/en/dlt/cdc

I also compared the table properties of a streaming table created with the following SQL against those of one created with Python using readChangeFeed = true. However, the table created with Python only has pipelineId in its properties.

CREATE OR REPLACE STREAMING TABLE
    target_table_name
AS SELECT
    *
FROM
    STREAM(table_name)

Thanks in advance😓

Louis_Frolio · 09-16-2025

Greetings @gerard_gv, there isn’t currently a direct SQL equivalent to the readChangeFeed option. For streaming reads, this option is only supported through the Python and Scala DataFrame APIs.
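For reference, here is a minimal Python sketch of the streaming read that the option belongs to. The helper name read_change_feed and its starting_version parameter are illustrative and not part of any Databricks API; readChangeFeed and startingVersion are Delta reader options, and the spark session is assumed to be supplied by the notebook or pipeline.

```python
def read_change_feed(spark, table_name, starting_version=None):
    """Return a streaming DataFrame over a Delta table's change data feed.

    Hypothetical helper: `spark` is an existing SparkSession that the
    notebook or pipeline is assumed to provide.
    """
    # Same read as in the question: a streaming reader with CDF enabled.
    reader = spark.readStream.option("readChangeFeed", "true")
    if starting_version is not None:
        # Optional: begin reading changes at this commit version.
        reader = reader.option("startingVersion", str(starting_version))
    return reader.table(table_name)
```

Each row of the resulting stream carries the change metadata columns _change_type, _commit_version, and _commit_timestamp alongside the table's own columns.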

As a workaround, take a look at the table_changes SQL function. For example:

-- Read changes between specific versions
SELECT * FROM table_changes('table_name', 5, 10);

-- Read changes starting from a timestamp
SELECT * FROM table_changes('table_name', '2023-01-01T00:00:00.000Z');

It is not exactly what you are looking for, but it may be a viable workaround.
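If the workaround ends up being driven from Python (for example, a scheduled job that reads one version range per run), the two query forms above could be generated by a small helper. This is only a sketch: table_changes_query is a hypothetical name, and it assumes the bounds come from trusted code rather than user input.

```python
def table_changes_query(table_name, start, end=None):
    """Build a batch query over a table's change data feed.

    Hypothetical helper. `start` and `end` may be commit versions (int)
    or timestamp strings, matching the two forms of table_changes above.
    """
    def literal(value):
        # Commit versions are emitted bare; everything else is quoted.
        if isinstance(value, int):
            return str(value)
        if "'" in value:
            # Keep the sketch simple: refuse values that would need escaping.
            raise ValueError(f"unsupported character in {value!r}")
        return f"'{value}'"

    args = [literal(table_name), literal(start)]
    if end is not None:
        args.append(literal(end))
    return f"SELECT * FROM table_changes({', '.join(args)})"
```

The resulting string would then be passed to spark.sql(...), which returns a batch DataFrame rather than a stream.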

I hope this helps, Louis.
