Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Aviral-Bhardwaj
Esteemed Contributor III

Delta Lake vs. Data Lake in Databricks

Delta Lake is an open-source storage layer that sits on top of existing data lake storage, such as Azure Data Lake Storage or Amazon S3. It makes that storage more reliable: in a plain data lake, concurrent or failed writes can leave data inconsistent or corrupted.

Delta Lake offers the following benefits over traditional data lake storage:

😊 ACID transactions: Delta Lake supports ACID (Atomicity, Consistency, Isolation, Durability) transactions, so multiple users can read and write the same table concurrently. Each write either commits completely or not at all, readers never see a half-finished write, and a write that conflicts with another user's commit fails instead of silently overwriting it.

😊 Versioning: Delta Lake records every change to a table as a new version and keeps that history, allowing you to roll back to a previous version if necessary.

😊 Time travel: Delta Lake allows you to query a table as it existed at an earlier version or timestamp, as long as that history is still retained, making it easy to see how data has changed over time.

😊 Data quality checks: Delta Lake enforces the table schema on write and supports NOT NULL and CHECK constraints, so writes with data type mismatches or disallowed null values are rejected instead of being stored.
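To make the first three points more concrete, here is a small toy model in plain Python. It is only a sketch of the idea and not how Delta Lake is actually implemented: the class and method names (ToyVersionedTable, commit, read, restore) are made up for illustration, everything lives in memory, and the conflict rule is stricter than Delta Lake's, which lets many non-overlapping concurrent writes succeed.

```python
class CommitConflict(Exception):
    """Another writer committed first; the caller should re-read and retry."""


class ToyVersionedTable:
    """Toy in-memory stand-in for a versioned table (hypothetical, not Delta Lake's API)."""

    def __init__(self):
        # Version 0 is the empty table; every commit adds one immutable snapshot.
        self._snapshots = [()]

    @property
    def latest_version(self):
        return len(self._snapshots) - 1

    def read(self, version=None):
        """Return the rows as of `version` (the latest if omitted): toy time travel."""
        if version is None:
            version = self.latest_version
        if not 0 <= version <= self.latest_version:
            raise ValueError(f"unknown version {version}")
        return list(self._snapshots[version])

    def commit(self, new_rows, read_version):
        """Add rows as one all-or-nothing commit based on `read_version`."""
        # Optimistic check: refuse if someone else committed since we last read.
        if read_version != self.latest_version:
            raise CommitConflict(
                f"read version {read_version}, table is at {self.latest_version}"
            )
        self._snapshots.append(self._snapshots[-1] + tuple(new_rows))
        return self.latest_version

    def restore(self, version):
        """Roll back by committing an old snapshot as the newest version."""
        self._snapshots.append(tuple(self.read(version)))
        return self.latest_version


table = ToyVersionedTable()
v1 = table.commit(["a"], read_version=0)   # creates version 1
v2 = table.commit(["b"], read_version=v1)  # creates version 2
print(table.read(version=1))               # ['a']
print(table.read())                        # ['a', 'b']
```

A writer that still holds version 1 and calls table.commit(["c"], read_version=1) gets a CommitConflict here, and table.restore(1) adds a new version whose contents match version 1 while the older versions stay in the history.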

Databricks is a cloud-based data analytics platform that provides a collaborative workspace for data scientists and analysts to build, test, and deploy data pipelines and models. Delta Lake is natively supported in Databricks, making it easy to use and integrate with other Databricks features.
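As a rough sketch of what this looks like in a notebook, the helpers below wrap the Delta Lake SQL commands for history, time travel, rollback, and constraints. The helper names and the example table name sales.orders are my own placeholders, and the code assumes you pass in a SparkSession (in a Databricks notebook one is already available as spark).

```python
def show_history(spark, table):
    """List the versions recorded for a Delta table."""
    return spark.sql(f"DESCRIBE HISTORY {table}")


def read_as_of(spark, table, version):
    """Time travel: query the table as it was at an earlier version."""
    return spark.sql(f"SELECT * FROM {table} VERSION AS OF {int(version)}")


def restore_to(spark, table, version):
    """Roll the table back to an earlier version."""
    return spark.sql(f"RESTORE TABLE {table} TO VERSION AS OF {int(version)}")


def add_check(spark, table, name, condition):
    """Add a CHECK constraint so that rows violating `condition` are rejected."""
    return spark.sql(
        f"ALTER TABLE {table} ADD CONSTRAINT {name} CHECK ({condition})"
    )


# Example calls (sales.orders is a hypothetical table name):
# show_history(spark, "sales.orders").show()
# read_as_of(spark, "sales.orders", 3).show()
# add_check(spark, "sales.orders", "positive_amount", "amount > 0")
```

Time travel also accepts TIMESTAMP AS OF in place of VERSION AS OF if you would rather pick a point in time than a version number.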

In summary, Delta Lake is a storage layer that sits on top of traditional data lake storage and provides additional features and capabilities for data management, such as ACID transactions, versioning, and data quality checks. It is natively supported in Databricks, making it easy to use and integrate with other Databricks features.

If you think this is a good post, please hit the like button and follow me here.

Thanks

Aviral Bhardwaj

AviralBhardwaj
2 REPLIES

Ajay-Pandey
Esteemed Contributor III

Thanks @Aviral Bhardwaj for sharing

Ajay Kumar Pandey

Meghala
Valued Contributor II

This post is very informative and I learned a lot from it, so thank you @Aviral Bhardwaj sir.
