Hi Toby,
Thanks for your reply! Much appreciated and interesting to hear your view.
On streaming to bronze without CDC: it would essentially be an append-only stream, right? What if you have a dimension table that processes delta data and you want to update the state of the table incrementally? With append-only, it could be a bit clunky, as it would be harder to query the latest state of that bronze table (without using CDC). Hence, wouldn't it be preferable to also maintain an easily accessible, complete latest state in the bronze layer (rather than an append-only bronze table)?
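To illustrate what I mean by clunky, here is a minimal plain-Python sketch (not DLT or Spark code) of the deduplication that every consumer of an append-only bronze table would have to repeat to get the latest state. The column names customer_id, city and ingested_at and the helper latest_state are made up for illustration.

```python
from typing import Any

# Hypothetical append-only bronze rows: every change to a dimension record
# is stored as a new row; nothing is updated in place.
bronze_rows: list[dict[str, Any]] = [
    {"customer_id": 1, "city": "Oslo", "ingested_at": 1},
    {"customer_id": 2, "city": "Bergen", "ingested_at": 1},
    {"customer_id": 1, "city": "Stockholm", "ingested_at": 2},
]


def latest_state(rows: list[dict[str, Any]], key: str, order_by: str) -> list[dict[str, Any]]:
    """Keep only the most recent row per key (on ties, the later input row wins)."""
    latest: dict[Any, dict[str, Any]] = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[order_by] >= current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


# The "current" dimension has to be derived on every read.
current_dimension = latest_state(bronze_rows, key="customer_id", order_by="ingested_at")
```

In Spark the same idea would typically be a window function that ranks rows per key and keeps the newest one, which is exactly the extra step I would like to avoid on every query.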
Thoughts on the gold layer:
1. If you use foreachBatch(), it would be done outside of the DLT pipeline, if I am not mistaken? Essentially, it would run as a separate Databricks job. I think it would be more efficient to rely on the same technology (DLT) throughout, if possible.
2. Materialised views are a bit limited in their support currently. E.g., they require Unity Catalog, are intended for Databricks-managed queries, do not support surrogate keys (which can be nice for reporting), and seem to be SQL-only for now.
What are your thoughts?
Best regards,
Andreas