04-28-2023 03:24 PM
The code below is a solution. I had missed that I could read from a table with `spark.readStream.format("delta").table("...")`. Simple, I just missed it. This is different from `dlt.read_stream()`, which appears in the examples a lot.
This is referenced as an example in the docs on CDC: https://docs.databricks.com/delta-live-tables/cdc.html.
import dlt

@dlt.table(
    table_properties = {"quality" : "silver"}
)
def silver_1():
    # Read the changes as a stream from the table
    df = spark.readStream.format("delta").table("hive_metastore.dev.bronze_raw")
    # Return the entire dataframe with all columns
    return df

Reading from a table like this is not explicitly given as an example in the Python ref: https://docs.databricks.com/delta-live-tables/python-ref.html. I think that adding a section called "Reading from sources", with examples of how to read in various ways, would save people some time. I will send some feedback on that.
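
If you end up reading several existing tables this way, the same call chain could be wrapped in a small helper. This is only a sketch under assumptions: `stream_from_table` and its parameters are names I made up for illustration (they are not part of the DLT or Spark API), and the `spark` session is passed in explicitly rather than taken from the notebook global. The only real calls used are the same `readStream.format("delta").table(...)` ones from the solution.

```python
def stream_from_table(spark, catalog, schema, table):
    """Return a streaming DataFrame over an existing Delta table.

    Hypothetical helper (not a DLT or Spark API): it joins the three
    name parts and uses the same readStream call chain as the solution.
    """
    # Build the three-part name, e.g. "hive_metastore.dev.bronze_raw"
    full_name = f"{catalog}.{schema}.{table}"
    # Read the table as a stream, exactly as in silver_1() above
    return spark.readStream.format("delta").table(full_name)
```

Inside the pipeline, `silver_1()` could then just `return stream_from_table(spark, "hive_metastore", "dev", "bronze_raw")`.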