Databricks Community

ag2all · ‎07-31-2024

I'm trying to design a Lakehouse with Spark as the base layer. Now I'm wondering whether adding Apache Iceberg beneath Spark would help or not. I'm leaning towards Iceberg for its auto indexing and ACID query facilities over big heterogeneous datasets. Is it a wise choice?

holly · ‎08-06-2024

Hello, if you're planning on building your own open source stack of Spark + Iceberg, it can be a good choice.
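For the self-managed route, here is a minimal sketch of the Spark settings that register an Iceberg catalog. The catalog name `lake` and the warehouse path are hypothetical placeholders, and it assumes the Iceberg Spark runtime jar matching your Spark version is already on the classpath.

```python
def iceberg_spark_conf(catalog, warehouse):
    """Return Spark settings that register a Hadoop-type Iceberg catalog."""
    return {
        # Enables Iceberg's SQL extensions (e.g. MERGE INTO, CALL procedures).
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers a catalog under the given (hypothetical) name.
        f"spark.sql.catalog.{catalog}": "org.apache.iceberg.spark.SparkCatalog",
        # A file-system based catalog; a Hive or REST catalog would differ here.
        f"spark.sql.catalog.{catalog}.type": "hadoop",
        f"spark.sql.catalog.{catalog}.warehouse": warehouse,
    }


# Placeholder values, for illustration only.
conf = iceberg_spark_conf("lake", "/tmp/iceberg-warehouse")
for key, value in conf.items():
    print(f"{key}={value}")
```

Each key/value pair can then be passed to `SparkSession.builder.config(key, value)` or set in `spark-defaults.conf`.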

If you're on Databricks, however, you're going to miss out on a *lot* of Delta features that are baked into the platform, specifically performance optimisations across compute and storage, and Unity Catalog (UC) integrations. Delta has ACID compliance, works beautifully with large datasets, and gives you several performance options, such as liquid clustering or legacy Z-ordering.
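To make the two layout options concrete, here is a rough sketch that builds the corresponding SQL statements. The table and column names are made up for illustration, and the helper functions are hypothetical wrappers, not part of any Databricks API.

```python
def liquid_clustered_ddl(table, columns_ddl, cluster_cols):
    """Build a CREATE TABLE statement that uses liquid clustering."""
    cols = ", ".join(cluster_cols)
    return f"CREATE TABLE {table} ({columns_ddl}) USING DELTA CLUSTER BY ({cols})"


def zorder_sql(table, zorder_cols):
    """Build the legacy alternative: an OPTIMIZE run with Z-ordering."""
    cols = ", ".join(zorder_cols)
    return f"OPTIMIZE {table} ZORDER BY ({cols})"


# Hypothetical table and columns.
create_stmt = liquid_clustered_ddl(
    "sales.events", "event_id BIGINT, event_date DATE", ["event_date"]
)
optimize_stmt = zorder_sql("sales.events_legacy", ["event_date"])

print(create_stmt)
print(optimize_stmt)
```

On a cluster, each string would be run with `spark.sql(...)`. Liquid clustering is declared on the table itself, whereas Z-ordering is applied each time `OPTIMIZE` runs, so a table would normally use one or the other.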

If you're integrating with other systems that are only Iceberg-compatible, check out UniForm, which writes out additional Iceberg metadata so those systems can read the Delta table: https://docs.databricks.com/en/delta/uniform.html
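As a sketch of what enabling UniForm looks like, the helper below builds a `CREATE TABLE` statement with the two table properties described in the linked documentation. The helper name and the table are hypothetical, and the docs should be checked for the runtime version and other requirements that apply to your workspace.

```python
def uniform_iceberg_ddl(table, columns_ddl):
    """Build a CREATE TABLE statement for a Delta table with UniForm (Iceberg) enabled."""
    return (
        f"CREATE TABLE {table} ({columns_ddl}) "
        "TBLPROPERTIES ("
        "'delta.enableIcebergCompatV2' = 'true', "
        "'delta.universalFormat.enabledFormats' = 'iceberg')"
    )


# Hypothetical table, for illustration only.
uniform_stmt = uniform_iceberg_ddl("sales.orders", "order_id BIGINT, amount DOUBLE")
print(uniform_stmt)
```

The table is still written as Delta; the properties only ask the platform to generate Iceberg metadata alongside it, so Iceberg readers see the same data files.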

Databricks + Apache Iceberg = advantageous or wasted effort due to duplicate functionality ?