
What is the best way to explain the difference between a data warehouse and a data lakehouse?

SigmaTSE
New Contributor

What is the best way to explain to non-technical customers the difference between a data warehouse and a data lakehouse?

4 REPLIES

yogu
Honored Contributor III
  • A data warehouse is like a structured library, offering organized data for specific purposes like reporting and analysis.
  • A data lakehouse is like a flexible reservoir, capable of holding diverse data in its raw form, allowing for more exploratory and adaptable analysis.

 

Vinay_M_R
Databricks Employee

A data warehouse is analogous to a well-organized and structured library. It is intended to store and organise data in a structured fashion, similar to how books are organised on shelves by specific categories or genres. Data is meticulously structured, processed, and organised in advance in a data warehouse to maintain consistency and to promote efficient querying and analysis. It's similar to having a catalogue that allows you to easily find the information you require.

A data lakehouse, on the other hand, is more akin to a large reservoir or lake where you can store any form of data without regard for structure. Consider pouring various forms of information into a vast lake, such as documents, photographs, audio files, and so on. A lakehouse stores data in its raw form, with no preconceived structure or organisation. It provides a place to store enormous amounts of diverse data as it arrives, without having to decide in advance how it will be used.

So in short:

A data warehouse is organised, structured, and optimised for easy access and analysis. It's similar to a well-organized library where you can easily find the information you're looking for.

A data lakehouse is more like a large reservoir that can store many sorts of data in their raw form, allowing for further analysis or processing.

pvignesh92
Honored Contributor

Hi @SigmaTSE ,

In simple words,

  • Data Warehouse -> You store data in structured tables to support your reporting or BI queries. Queries perform better because the data is internally optimized for fast access.
  • Data Lake -> You can store data in any format, and it can scale massively. It integrates well with your ETL tool to read and process data for different use cases. Query performance is not as good as a data warehouse's.
  • Data Lakehouse -> A combination of the features of a Data Lake and a Data Warehouse. It supports massive scale and complex ETL processing as well as millisecond query latency, so you get the best of both worlds.
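
For a slightly more technical audience, the contrast in the list above can be sketched in a few lines of Python. This is only an illustrative toy using the standard library, not how Databricks or any particular product works: the `orders` table, the sample records, and the helper names `warehouse_load` and `lake_read` are all made up for the example. It shows the warehouse idea (structure is fixed up front and enforced when data is written) next to the lake idea (raw records are kept as they arrive and interpreted only when read).

```python
import json
import sqlite3

# Hypothetical sample records; the second one lacks "region" and has an extra "note".
raw_events = [
    '{"order_id": 1, "amount": 25.0, "region": "EU"}',
    '{"order_id": 2, "amount": 40.0, "note": "gift"}',
]


def warehouse_load(records):
    """Warehouse style: the schema is defined first and enforced on write."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, "
        "amount REAL NOT NULL, "
        "region TEXT NOT NULL)"
    )
    rejected = []
    for line in records:
        row = json.loads(line)
        try:
            conn.execute(
                "INSERT INTO orders VALUES (?, ?, ?)",
                (row.get("order_id"), row.get("amount"), row.get("region")),
            )
        except sqlite3.IntegrityError:
            # The record does not fit the agreed structure, so it is not stored.
            rejected.append(row)
    return conn, rejected


def lake_read(records, field):
    """Lake style: raw records are kept untouched; structure is applied on read."""
    return [json.loads(line).get(field) for line in records]


conn, rejected = warehouse_load(raw_events)
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 1
print(len(rejected))                                              # 1
print(lake_read(raw_events, "note"))                              # [None, 'gift']
```

The warehouse side rejects the record that does not match the table, which is what keeps reports consistent; the lake side keeps everything, including the unexpected `note` field, at the cost of sorting out structure later. A lakehouse, as described in the reply, aims to offer both behaviours over the same data.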

Hope this helps. 

Vinayak_rescuer
New Contributor II

Hi @SigmaTSE ,

A data warehouse is like an organized pantry with labeled jars of ingredients.

A data lakehouse is like a magical fridge where you can store all kinds of ingredients without worrying about labels, allowing for more flexibility and creativity.
