Hi everyone,
I recently looked into a silent cost driver in many data platforms: the default choice between managed and external tables in Unity Catalog.
Many teams default to external tables, but that choice often leads to an accumulation of orphaned files (a "Storage Tax") and pushes compaction overhead onto your own compute clusters, because Predictive Optimization applies only to managed tables.
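To make the "Storage Tax" idea concrete, here is a minimal sketch of how you might estimate it: compare what sits in a table's storage location against what the current table version still references. Everything here is an assumption for illustration. `StoredFile` and `orphaned_bytes` are hypothetical names, not a Databricks API, and the file listing is hard-coded rather than read from cloud storage or the table's transaction log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredFile:
    """One object found under a table's storage location (hypothetical shape)."""
    path: str
    size_bytes: int


def orphaned_bytes(stored, referenced_paths):
    """Sum the sizes of stored files that the current table version does not reference."""
    referenced = set(referenced_paths)
    return sum(f.size_bytes for f in stored if f.path not in referenced)


# Illustrative data only: three files on storage, one still referenced.
listing = [
    StoredFile("part-000.parquet", 128_000_000),
    StoredFile("part-001.parquet", 128_000_000),
    StoredFile("part-002.parquet", 64_000_000),
]
live = ["part-000.parquet"]

wasted = orphaned_bytes(listing, live)
print(f"{wasted / 1e6:.0f} MB unreferenced")
```

Note that unreferenced is not the same as safe to delete: files inside your retention window may still be needed for time travel, so treat a number like this as an upper bound on what you are paying for unnecessarily.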
I wrote a technical deep dive on the hidden costs of external tables and why we should be defaulting to managed storage in most cases.
You can read the full breakdown here: Managed vs External Tables in Unity Catalog: The Decision That's Silently Inflating Your Cloud Bill.
I'd love to hear how others in the community handle table architecture, and whether your teams default to managed or external tables when building out your lakehouse!