
Delta Lake vs Delta table

Krish1
New Contributor II

Can somebody give me a good definition of Delta Lake vs. Delta table? What are the use cases of each, and their similarities and differences? Sorry, I'm new to Databricks and trying to learn.

2 REPLIES

Rishabh-Pandey
Esteemed Contributor

Delta Lake is an open-source storage layer designed to bring reliability to data lakes. It stores data as Parquet files alongside a transaction log, works with Apache Spark, and provides features such as ACID transactions, schema enforcement, and time travel. Delta Lake is essentially the storage format that supplies these features for managing data in a data lake environment.

Delta tables, on the other hand, are tables stored in the Delta Lake format. Because they sit on Delta Lake, they inherit its features: ACID transactions, schema enforcement, and time travel. A Delta table is simply a table whose data and transaction log are kept in the Delta Lake format.

In summary, Delta Lake is the storage layer that provides these features, while Delta tables are the tables you create and query on top of it.
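To make the "storage layer vs. table" distinction more concrete, here is a toy sketch in plain Python of how a table can be a directory of data files plus a log of commits. It is not the real Delta Lake implementation: the `commit` and `snapshot` helpers and the `part-000N.parquet` file names are made up for illustration. What it borrows from Delta Lake is the general idea of a `_delta_log` folder holding zero-padded, numbered JSON commit files with `add` and `remove` actions, and of reading an older version by replaying fewer commits.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def commit(table_dir: Path, version: int, actions: list) -> Path:
    """Write one commit as a newline-delimited JSON file under _delta_log (toy helper)."""
    log_dir = table_dir / "_delta_log"
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"{version:020d}.json"
    # Mode "x" fails if the file already exists, so in this toy model two
    # writers cannot both claim the same version number.
    with open(path, "x") as f:
        for action in actions:
            f.write(json.dumps(action) + "\n")
    return path


def snapshot(table_dir: Path, version: Optional[int] = None) -> set:
    """Replay commits up to `version` (or all of them) and return the live data files."""
    files = set()
    for path in sorted((table_dir / "_delta_log").glob("*.json")):
        if version is not None and int(path.stem) > version:
            break
        for line in path.read_text().splitlines():
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files


# Hypothetical table with three commits: two appends, then a rewrite.
table = Path(tempfile.mkdtemp()) / "events"
commit(table, 0, [{"add": {"path": "part-0000.parquet"}}])
commit(table, 1, [{"add": {"path": "part-0001.parquet"}}])
commit(table, 2, [{"remove": {"path": "part-0000.parquet"}},
                  {"add": {"path": "part-0002.parquet"}}])

latest = snapshot(table)                # current state of the table
as_of_v1 = snapshot(table, version=1)   # "time travel" back to version 1
```

In this picture, the log format and the rules for reading and writing it play the role of Delta Lake, and the `events` directory is one Delta table.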

Rishabh Pandey

Annapurna_Hiriy
Databricks Employee

Delta Lake and Delta table are related concepts in the open-source Delta Lake project, which extends Apache Spark with ACID (Atomicity, Consistency, Isolation, Durability) capabilities for data lakes.

Delta Lake provides a storage layer that enables transactional and scalable data processing on top of storage systems such as Hadoop Distributed File System (HDFS), Amazon S3, and ADLS.

Reference: https://docs.delta.io/latest/delta-intro.html

A Delta table is a collection of data organized in a tabular format within Delta Lake. It represents a table structure with a schema and associated data stored in the Delta Lake format. There are two types of Delta tables:

  1. Managed table
  2. Unmanaged table

Please refer to the following document for more information about managed and unmanaged delta tables:

https://docs.databricks.com/lakehouse/data-objects.html#managed-table
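As a rough illustration of the difference, the sketch below builds the two kinds of `CREATE TABLE` statement. The helper `create_delta_table_sql`, the table names, and the `/tmp/delta/sales_external` path are hypothetical; the only part taken from Spark SQL is the `USING DELTA` clause and the optional `LOCATION` clause. Leaving `LOCATION` out lets the metastore choose where the data lives (managed); supplying it points the table at a path you control (unmanaged, also called external).

```python
from typing import Optional


def create_delta_table_sql(name: str, columns: dict, location: Optional[str] = None) -> str:
    """Build a CREATE TABLE statement for a Delta table (illustrative helper)."""
    cols = ", ".join(f"{col} {dtype}" for col, dtype in columns.items())
    sql = f"CREATE TABLE IF NOT EXISTS {name} ({cols}) USING DELTA"
    if location is not None:
        # An explicit LOCATION makes the table unmanaged (external).
        sql += f" LOCATION '{location}'"
    return sql


columns = {"id": "BIGINT", "amount": "DOUBLE"}

# Managed: no LOCATION, so the metastore decides where the files go.
managed = create_delta_table_sql("sales_managed", columns)

# Unmanaged: the files live at a path chosen by the user (hypothetical path).
unmanaged = create_delta_table_sql(
    "sales_external", columns, location="/tmp/delta/sales_external"
)

print(managed)
print(unmanaged)
```

On a cluster, each string could be passed to `spark.sql(...)`. The practical difference usually shows up on `DROP TABLE`: dropping a managed table also removes its data, while dropping an unmanaged table removes only the metastore entry and leaves the files in place.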

Delta Lake and Delta tables share the same key features:

ACID transactions

Schema enforcement and evolution

Time travel

Data reliability

Metadata management
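To give a feel for what "schema enforcement and evolution" means, here is a small plain-Python sketch. It is a simplified model, not Delta Lake's actual logic: the `append_rows` function is hypothetical, and its `merge_schema` flag only loosely mirrors the idea behind Delta Lake's `mergeSchema` write option. Writes that do not match the table schema are rejected unless schema evolution is explicitly allowed.

```python
def append_rows(schema: dict, rows: list, merge_schema: bool = False) -> dict:
    """Validate rows against a schema and return the (possibly evolved) schema."""
    new_schema = dict(schema)  # never mutate the caller's schema
    for row in rows:
        for column, value in row.items():
            if column not in new_schema:
                if not merge_schema:
                    # Enforcement: unknown columns are rejected by default.
                    raise ValueError(f"unexpected column: {column!r}")
                # Evolution: the new column is added to the schema.
                new_schema[column] = type(value)
            elif not isinstance(value, new_schema[column]):
                raise TypeError(
                    f"column {column!r} expects {new_schema[column].__name__}, "
                    f"got {type(value).__name__}"
                )
    return new_schema


schema = {"id": int, "name": str}

# A row with an extra column is accepted only when evolution is enabled.
evolved = append_rows(
    schema, [{"id": 1, "name": "a", "country": "DE"}], merge_schema=True
)

# A row with the wrong type for an existing column is always rejected.
try:
    append_rows(schema, [{"id": "oops", "name": "b"}])
    rejected = None
except TypeError as err:
    rejected = str(err)
```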

In summary, Delta Lake is the underlying storage layer that provides transactional and reliability features, while Delta tables represent the tabular structures within Delta Lake, offering ACID properties, schema enforcement, versioning, and other Delta Lake capabilities. Delta tables are the primary means of working with structured data in Delta Lake.
