@guidotognini
1. Can You Avoid Materializing Exploded Data?
- Materialized views: if your downstream silver table is itself streaming and your platform supports materialized views, you may be able to collapse the explode+normalize step into a view that transforms bronze directly to silver without a separate persisted table.
- DLT (Delta Live Tables), in Python or SQL, can chain transformations without persisting each intermediate step to storage. However, skipping materialization risks incomplete lineage, harder debugging, and slower downstream query performance.
- You can push explode+normalize directly into the silver table transformation, with the CDC logic applied in a single step (see the sketch below). This reduces the number of tables, but it can increase complexity and reduce auditability: if you ever need to reprocess or troubleshoot, having the exploded step materialized helps.
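A minimal sketch of that single-step option in DLT Python, assuming a bronze streaming table named `bronze_events` with an array column `items`, a business key `order_id`, and an ordering column `event_ts` (all names here are illustrative, not taken from your pipeline):

```python
import dlt
from pyspark.sql import functions as F

@dlt.view(name="v_events_normalized")
def v_events_normalized():
    # Explode + normalize without persisting a table: this exists only as a
    # logical step in the pipeline graph.
    return (
        dlt.read_stream("bronze_events")
        .withColumn("item", F.explode("items"))
        .select(
            "order_id",
            "event_ts",
            F.col("item.sku").alias("sku"),
            F.col("item.qty").cast("int").alias("qty"),
        )
    )

# Silver target with CDC applied directly from the view, in one step.
dlt.create_streaming_table("silver_order_items")

dlt.apply_changes(
    target="silver_order_items",
    source="v_events_normalized",
    keys=["order_id", "sku"],
    sequence_by=F.col("event_ts"),
    stored_as_scd_type=1,
)
```

Because the exploded rows are never persisted, reprocessing or inspecting them later means re-running the transformation against bronze.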
Suggested actions:
- Materialize the exploded/normalized step as a managed silver staging table if reliability, traceability, and modularity are priorities (see the staging-table sketch after this list).
- If your pipeline is simple, rarely needs isolated queries on the exploded array, and you don't require stepwise time-travel/debugging, consider using a view or merging normalization into the CDC silver step.
- Use Medallion naming conventions that clearly indicate the purpose (e.g., "silver_staging" or "silver_normalized").
- Document lineage so that downstream consumers understand which objects are the audited system of record (bronze) and which are computed/corrected intermediates (silver).
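If you instead persist the staging step (first suggestion above), a comparable sketch, again with illustrative names, materializes the exploded rows as their own managed table and points the CDC step at it:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(
    name="silver_staging_order_items",
    comment="Exploded/normalized rows from bronze_events, persisted for auditability.",
)
def silver_staging_order_items():
    # Same explode + normalize logic, but materialized as a managed staging table
    # so it can be queried, time-traveled, and debugged on its own.
    return (
        dlt.read_stream("bronze_events")
        .withColumn("item", F.explode("items"))
        .select(
            "order_id",
            "event_ts",
            F.col("item.sku").alias("sku"),
            F.col("item.qty").cast("int").alias("qty"),
        )
    )

dlt.create_streaming_table("silver_order_items")

dlt.apply_changes(
    target="silver_order_items",
    source="silver_staging_order_items",  # CDC now reads the persisted staging table
    keys=["order_id", "sku"],
    sequence_by=F.col("event_ts"),
    stored_as_scd_type=1,
)
```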
Sources:
https://www.linkedin.com/pulse/flattening-json-data-databricks-downstream-processing-avinash-narala-...
https://tecyfy.com/blog/medallion-architecture-best-practices-databricks-technical-deep-dive