07-21-2025 10:48 PM - edited 07-21-2025 10:55 PM
Hi @noorbasha534 ,
That’s a really cool idea and definitely shows initiative - but realistically, it might not be worth the effort. There’s a lot of engineering going on under the hood that would be tough to replicate in-house.
Collecting telemetry and using it for things like liquid clustering and stats gathering could work to some extent, but the effort required to build and maintain something similar would likely outweigh the benefits, especially given how deeply integrated and optimized the native solution is.
If you have external tables, I would just take care of regular table maintenance (e.g. running OPTIMIZE and VACUUM regularly).
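To make the maintenance suggestion concrete, here is a minimal sketch of a scheduled job that could issue those commands. The helper names (`maintenance_statements`, `run_maintenance`) and the table name `main.sales.orders` are my own placeholders, not anything Databricks ships, and the 168-hour retention is just the usual 7-day default written out explicitly. It assumes you pass in an active `SparkSession` from a notebook or job.

```python
from typing import Iterable, List


def maintenance_statements(tables: Iterable[str], retain_hours: int = 168) -> List[str]:
    """Build one OPTIMIZE and one VACUUM statement per table.

    Table names are assumed to be trusted, fully qualified identifiers.
    retain_hours is an assumption; pick a value that fits your time-travel needs.
    """
    if retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    statements: List[str] = []
    for table in tables:
        # Compact small files first, then remove files no longer referenced.
        statements.append(f"OPTIMIZE {table}")
        statements.append(f"VACUUM {table} RETAIN {retain_hours} HOURS")
    return statements


def run_maintenance(spark, tables: Iterable[str], retain_hours: int = 168) -> None:
    """Run the generated statements through an existing SparkSession."""
    for statement in maintenance_statements(tables, retain_hours):
        spark.sql(statement)


if __name__ == "__main__":
    # Hypothetical table name, printed only so the sketch runs without a cluster.
    for statement in maintenance_statements(["main.sales.orders"]):
        print(statement)
```

In a notebook you would call something like `run_maintenance(spark, ["main.sales.orders"])` from a scheduled job. Keeping the statement builder separate from the execution step makes it easy to review what will run before pointing it at real tables.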
Would be awesome if Databricks open-sourced it, though - totally agree with you there.