Can someone explain what a lakehouse is and how it relates to Delta Lake?
06-28-2022 03:24 PM
06-29-2022 04:22 AM
A lakehouse is an architectural pattern for building a data warehouse with data lake / big data tools.
It keeps the ideas of classical data warehousing but replaces the RDBMS with scale-out components such as distributed storage (HDFS, for example).
So think of it as a data warehouse built on modern scale-out tools.
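Delta Lake is the storage format that holds the actual data (think of it as a column-store table). It is based on the Parquet file format but adds functionality on top, most notably ACID transactions. Strictly speaking it is a table format rather than a single file format: the data sits in Parquet files, and a transaction log stored alongside them records every change.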
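To make the "Parquet files plus a transaction log" idea more concrete, here is a toy sketch in plain Python. It is not the Delta Lake API and does not read or write real files; `ToyDeltaTable`, `commit` and `snapshot` are hypothetical names made up for illustration. It only shows the general idea that the current state of a table is obtained by replaying an ordered log of "add file" / "remove file" commits, which is also what makes looking at an older version possible.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDeltaTable:
    """Toy model (hypothetical): a table is an ordered log of commits."""

    # Each commit is a list of (action, file_name) pairs,
    # where action is either "add" or "remove".
    log: list = field(default_factory=list)

    def commit(self, actions):
        """Validate all actions first, then append them as one commit."""
        for action, _ in actions:
            if action not in ("add", "remove"):
                raise ValueError(f"unknown action: {action}")
        self.log.append(list(actions))
        return len(self.log) - 1  # version number of this commit

    def snapshot(self, version=None):
        """Replay the log up to `version` (inclusive) to get the live files."""
        if version is None:
            version = len(self.log) - 1
        live = set()
        for actions in self.log[: version + 1]:
            for action, name in actions:
                if action == "add":
                    live.add(name)
                else:
                    live.discard(name)
        return live


table = ToyDeltaTable()
v0 = table.commit([("add", "part-0.parquet")])
# Rewriting data = removing the old file and adding a new one in ONE commit.
v1 = table.commit([("remove", "part-0.parquet"), ("add", "part-1.parquet")])

print(table.snapshot())    # {'part-1.parquet'}
print(table.snapshot(v0))  # {'part-0.parquet'}
```

Because a reader only ever sees whole commits, it sees either the old file or the new one, never a half-finished mix of both. That is the intuition behind the ACID claim, under the simplifying assumptions of this sketch.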
How do the two relate?
Well, Delta Lake fits quite well into a lakehouse, especially because it supports upserts (MERGE), which most other file formats lack, as well as change data feed, clones, rollback, etc.
It also has several performance-enhancing features, which make it a very suitable format to build your lakehouse on.
So when you build a lakehouse, Delta Lake will probably be part of it (together with Parquet and maybe some others).
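In case "upsert (MERGE)" is unfamiliar, here is a minimal sketch of what it means, in plain Python on in-memory rows. This is not how Delta Lake implements it, and `merge_upsert` is a hypothetical helper written only for this post. In Delta Lake itself you would normally express this with a SQL `MERGE INTO ... WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT` statement.

```python
def merge_upsert(target, updates, key):
    """Return new rows: matching keys are updated, unseen keys are inserted.

    `target` and `updates` are lists of dicts; `key` is the column to match on.
    The input lists are not modified.
    """
    merged = {row[key]: dict(row) for row in target}
    for row in updates:
        # "WHEN MATCHED" -> overwrite the columns present in the update row.
        # "WHEN NOT MATCHED" -> start from an empty row, i.e. insert.
        merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


target = [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Rome"}]
updates = [{"id": 2, "city": "Milan"}, {"id": 3, "city": "Lima"}]

result = merge_upsert(target, updates, key="id")
for row in result:
    print(row)
# {'id': 1, 'city': 'Oslo'}
# {'id': 2, 'city': 'Milan'}
# {'id': 3, 'city': 'Lima'}
```

With plain Parquet files you would typically have to rewrite the affected files yourself to get this result, which is why having MERGE built into the format is so handy for warehouse-style loads.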
07-11-2022 08:29 AM
07-24-2022 11:22 PM
Hi @Soham Dhodapkar, this document describes the components of the lakehouse shown in the image shared by @Hubert Dudek: https://docs.databricks.com/lakehouse/index.html

