Your design aligns well with Medallion Architecture principles and handles schema variability well, but there are some scalability and governance considerations worth discussing. On the pre-Bronze stage: building a schema registry early is an excellent choice for lineage and governance, because it lets downstream processes know the maximal schema and track how it evolves.
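To make the registry idea concrete, here is a minimal sketch in plain Python of what it might track: the columns seen per source, a version entry each time new columns appear, and the maximal schema as the union across sources. The class name, the source names, and the column names are all hypothetical, and type conflicts are not handled (the first type seen wins). In practice this state would more likely live in a Delta table than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaRegistry:
    """Hypothetical registry: tracks columns per source and when they were added."""
    # source name -> {column name -> type string}
    schemas: dict = field(default_factory=dict)
    # source name -> list of (version number, sorted list of added columns)
    history: dict = field(default_factory=dict)

    def register(self, source, columns):
        """Merge an observed schema and return the columns that are new for this source."""
        known = self.schemas.setdefault(source, {})
        added = {name: dtype for name, dtype in columns.items() if name not in known}
        if added:
            known.update(added)
            versions = self.history.setdefault(source, [])
            versions.append((len(versions) + 1, sorted(added)))
        return added

    def maximal_schema(self):
        """Union of all columns across sources; the first type seen for a name wins."""
        merged = {}
        for cols in self.schemas.values():
            for name, dtype in cols.items():
                merged.setdefault(name, dtype)
        return merged


# Illustrative data only: two batches from one source, one batch from another.
registry = SchemaRegistry()
registry.register("facility_a.feed", {"record_id": "string", "reading_a": "int"})
registry.register("facility_a.feed", {"record_id": "string", "reading_b": "double"})
registry.register("facility_b.feed", {"record_id": "string", "reading_c": "double"})

print(registry.maximal_schema())
print(registry.history["facility_a.feed"])
```

The history list is what gives you schema evolution per source, while `maximal_schema()` is what Silver and Gold would consult to know every column that can appear.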
Some of the potential challenges that I see are:
- A single massive ingest table with tens of thousands of columns will be hard to manage and query.
- Performance and storage overhead could become significant, especially if schema evolution adds sparse columns over time.
- DLT can handle large pipelines, but tens of thousands of tables in one pipeline may hit operational limits (job orchestration, monitoring, debugging).
- If Bronze is one giant table, lineage from source → Silver → Gold becomes opaque.
Some adjustments I would suggest are:
- Partition Bronze by facility or logical domain. Instead of one giant ingest table, create multiple Bronze tables grouped by facility or domain, so each table stays narrower and its lineage stays traceable.
- Apply expectations and quality rules early. Add basic expectations in Bronze (e.g., non-null keys) to catch bad data before it reaches Silver.
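
As a rough illustration of both adjustments together, the sketch below routes incoming records to one Bronze table per facility and sets aside rows that fail a basic non-null key check. It is plain Python so the logic is easy to follow; the `bronze_<facility>` naming convention, the `record_id` key, and the sample records are assumptions for illustration, not something from your design. In a real DLT pipeline the key check would normally be declared as an expectation on the table (for example with `@dlt.expect_or_drop`) rather than hand-written like this.

```python
from collections import defaultdict


def bronze_table_for(facility):
    """Hypothetical naming convention: one Bronze table per facility."""
    return f"bronze_{facility.lower()}"


def route_to_bronze(records, key="record_id"):
    """Group records into per-facility Bronze tables; set aside rows with a null key."""
    tables = defaultdict(list)
    quarantine = []
    for row in records:
        if row.get(key) is None:
            # Basic expectation: the key must be non-null.
            quarantine.append(row)
        else:
            tables[bronze_table_for(row["facility"])].append(row)
    return dict(tables), quarantine


# Illustrative records only.
records = [
    {"facility": "North", "record_id": "r1", "reading_a": 72},
    {"facility": "South", "record_id": "r2", "reading_c": 5.4},
    {"facility": "North", "record_id": None, "reading_a": 80},
]
tables, quarantine = route_to_bronze(records)

print(sorted(tables))
print(len(quarantine))
```

Keeping the failed rows in a separate list (rather than silently dropping them) is a design choice: it makes it easier to see which facility is sending bad keys.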