
Aviral-Bhardwaj
Esteemed Contributor III

Delta Lake vs. data lake in Databricks

Delta Lake is an open-source storage layer that sits on top of existing data lake storage, such as Azure Data Lake Storage or Amazon S3. It stores table data as Parquet files alongside a transaction log, which makes tables more reliable than a plain data lake of loose files, where failed or concurrent writes can leave data inconsistent or corrupted.

Delta Lake offers the following benefits over traditional data lake storage:

😊 ACID transactions: Delta Lake supports ACID (Atomicity, Consistency, Isolation, Durability) transactions, so multiple users can read and write the same table concurrently. Readers always see a consistent snapshot, and conflicting writes are detected at commit time (optimistic concurrency control) instead of silently corrupting data.

😊 Versioning: Every write is recorded as a new table version in the transaction log, so Delta Lake keeps a history of changes and lets you roll back to a previous version if necessary.

😊 Time travel: You can query a table as it existed at an earlier version or timestamp, as long as that history is still retained, which makes it easy to see how data has changed over time.

😊 Data quality checks: Delta Lake enforces the table schema on write and supports constraints such as NOT NULL and CHECK, so writes with mismatched data types or invalid values are rejected instead of landing in the table.
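To make the transaction, versioning, and time travel points more concrete, here is a toy model in plain Python. It is only a sketch of the idea, not how Delta Lake is actually implemented: the names `ToyDeltaLog` and `CommitConflict` and the file names are made up for illustration, and the conflict rule is stricter than the real one (this toy rejects any commit made from a stale version, even if the two writes would not really conflict).

```python
from dataclasses import dataclass, field
from typing import Optional


class CommitConflict(Exception):
    """Raised when a writer tries to commit on top of a stale table version."""


@dataclass
class ToyDeltaLog:
    """Hypothetical in-memory stand-in for a table's transaction log."""

    # commits[i] holds the actions recorded by table version i.
    # Each action is an ("add" | "remove", data_file_path) pair.
    commits: list = field(default_factory=list)

    @property
    def latest_version(self) -> int:
        # -1 means the table has no commits yet.
        return len(self.commits) - 1

    def commit(self, read_version: int, actions: list) -> int:
        """Append one atomic commit, or fail if someone else committed first."""
        if read_version != self.latest_version:
            raise CommitConflict(
                f"read version {read_version}, "
                f"but table is at version {self.latest_version}"
            )
        self.commits.append(list(actions))
        return self.latest_version

    def snapshot(self, version: Optional[int] = None) -> set:
        """Replay the log up to `version` to get the live data files."""
        if version is None:
            version = self.latest_version
        if not -1 <= version <= self.latest_version:
            raise ValueError(f"unknown version {version}")
        files = set()
        for actions in self.commits[: version + 1]:
            for op, path in actions:
                if op == "add":
                    files.add(path)
                elif op == "remove":
                    files.discard(path)
        return files


log = ToyDeltaLog()
v0 = log.commit(-1, [("add", "part-0.parquet")])
v1 = log.commit(v0, [("add", "part-1.parquet")])
v2 = log.commit(v1, [("remove", "part-0.parquet"), ("add", "part-2.parquet")])

print(sorted(log.snapshot()))    # ['part-1.parquet', 'part-2.parquet']
print(sorted(log.snapshot(v1)))  # ['part-0.parquet', 'part-1.parquet']

try:
    # A second writer that last read version 1 is now out of date.
    log.commit(v1, [("add", "part-3.parquet")])
except CommitConflict as err:
    print("rejected:", err)
```

Reading an old version is just replaying fewer commits, which is why history has to be retained for time travel to work.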

😱 Delta Lake is natively supported in Databricks, a cloud-based data analytics platform that provides a collaborative workspace for data scientists and analysts to build, test, and deploy data pipelines and models. This makes Delta Lake easy to use and integrate with other Databricks features.
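In a Databricks notebook, the features listed above are usually reached through SQL. The small helpers below only build the SQL strings, so they can be read and tried without a cluster. The helper names, the table name `events`, and the constraint name are my own placeholders, and the syntax is as I understand it from the Delta Lake documentation, so please check it against your runtime version.

```python
def history_sql(table: str) -> str:
    # Lists the table's versions with timestamps and operations.
    return f"DESCRIBE HISTORY {table}"


def version_as_of_sql(table: str, version: int) -> str:
    # Time travel by version number.
    return f"SELECT * FROM {table} VERSION AS OF {int(version)}"


def timestamp_as_of_sql(table: str, timestamp: str) -> str:
    # Time travel by timestamp, e.g. '2023-01-01 00:00:00'.
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


def restore_sql(table: str, version: int) -> str:
    # Rolls the table back to an earlier version.
    return f"RESTORE TABLE {table} TO VERSION AS OF {int(version)}"


def add_check_sql(table: str, name: str, condition: str) -> str:
    # Adds a CHECK constraint; writes that violate it are rejected.
    return f"ALTER TABLE {table} ADD CONSTRAINT {name} CHECK ({condition})"


print(history_sql("events"))
print(version_as_of_sql("events", 3))
print(restore_sql("events", 3))
print(add_check_sql("events", "valid_amount", "amount >= 0"))

# In a Databricks notebook, where `spark` is already defined, you could run:
# spark.sql(version_as_of_sql("events", 3)).show()
```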

In summary, Delta Lake is a storage layer that sits on top of traditional data lake storage and provides additional features and capabilities for data management, such as ACID transactions, versioning, and data quality checks. It is natively supported in Databricks, making it easy to use and integrate with other Databricks features.

If you think this is a good post, please hit the like button and follow me here.

Thanks

Aviral Bhardwaj

2 REPLIES

Ajay-Pandey
Esteemed Contributor III

Thanks @Aviral Bhardwaj for sharing

Meghala
Valued Contributor II

This is very informative and I learned a lot from it, so thank you @Aviral Bhardwaj sir.
