Data Engineering

readStream with readChangeFeed option in SQL

gerard_gv
New Contributor

I have spent several days trying to find the SQL equivalent of:

 

spark.readStream
        .option("readChangeFeed", "true")
        .table("table_name")

 

I suspect that it works like AUTO CDC FROM SNAPSHOT, because CDF adds the column "_commit_version", a bigint field that might be used as a snapshot ID behind the scenes. That could explain why there is no direct SQL equivalent, since the documentation says: "Use AUTO CDC FROM SNAPSHOT (Public Preview, and only available for Python) to process changes in database snapshots". Documentation: https://docs.databricks.com/aws/en/dlt/cdc
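
To make the comparison concrete, here is a minimal sketch of the Python read I am trying to express in SQL, together with the metadata columns that the change data feed adds to each row ("_change_type", "_commit_version", "_commit_timestamp"). It assumes an existing `spark` session and a Delta table with the change data feed enabled (`delta.enableChangeDataFeed = true`); the helper name `read_change_feed` and its parameters are hypothetical, and the `startingVersion` option is included only as an optional illustration.

```python
# Sketch only: `read_change_feed` and its parameter names are hypothetical.
# `spark` is assumed to be an existing SparkSession with Delta Lake available.

# Metadata columns that the change data feed adds to every returned row.
CDF_METADATA_COLUMNS = ("_change_type", "_commit_version", "_commit_timestamp")


def read_change_feed(spark, table_name, starting_version=None):
    """Return a streaming DataFrame over the change data feed of `table_name`."""
    reader = spark.readStream.option("readChangeFeed", "true")
    if starting_version is not None:
        # Optional: begin reading changes at a specific table version.
        reader = reader.option("startingVersion", starting_version)
    return reader.table(table_name)
```

Calling `read_change_feed(spark, "table_name")` is the same read as the snippet above; each row it yields describes one change (insert, update, or delete) rather than a full snapshot of the table.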

I also compared the table properties of a streaming table created with the following SQL against those of one created with Python using readChangeFeed = true. However, the table created with Python only has pipelineId in its properties.

 

CREATE OR REPLACE STREAMING TABLE
    target_table_name
AS SELECT
    *
FROM
    STREAM(table_name)

Thanks in advance 😓

 
