
Delta Live Table (Real Time Usage & Application)

Sourav7890
New Contributor III

Delta Live Tables (DLT) is a hot topic in the data field, an innovation by Databricks. Delta Live Tables is a declarative ETL framework. Broadly, there are two kinds of ETL frameworks:
1) Procedural ETL 2) Declarative ETL
1) Procedural ETL involves writing code that explicitly outlines the steps to transform data from source to target. It is a more hands-on approach that requires developers to define each step of the ETL process. Examples: Informatica, Talend, SSIS.
2) Declarative ETL is a more abstract approach that focuses on defining the desired outcome of the ETL process. The developer defines the desired end state, and the ETL tool automatically generates the code to transform the data into that end state. Examples: ADF, AWS Glue, DLT.
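To make the distinction concrete, here is a minimal sketch in plain Python (a toy illustration with hypothetical names, not a real ETL framework): the procedural version spells out every step, while the declarative version only describes the desired end state and lets a small "engine" work out the steps.

```python
# Procedural ETL: the developer spells out every step explicitly.
def procedural_etl(rows):
    cleaned = [r for r in rows if r["amount"] is not None]              # step 1: filter nulls
    converted = [{**r, "amount": float(r["amount"])} for r in cleaned]  # step 2: cast types
    return sorted(converted, key=lambda r: r["amount"])                 # step 3: sort

# Declarative ETL: the developer states only the desired end state;
# a (toy) engine decides how to get there.
SPEC = {
    "filter": lambda r: r["amount"] is not None,
    "cast": {"amount": float},
    "order_by": "amount",
}

def declarative_etl(rows, spec):
    rows = [r for r in rows if spec["filter"](r)]
    rows = [{**r, **{k: f(r[k]) for k, f in spec["cast"].items()}} for r in rows]
    return sorted(rows, key=lambda r: r[spec["order_by"]])

data = [{"amount": "5"}, {"amount": None}, {"amount": "2"}]
print(procedural_etl(data) == declarative_etl(data, SPEC))  # True: same result, different styles
```

Both produce the same cleaned, sorted output; only the level of abstraction differs, which is exactly the procedural-vs-declarative trade-off above.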

Main advantages of DLT:
1) Version control.
2) Deployment.
3) Data quality checks.
4) Governance.
5) The Delta engine automatically handles the complex tasks of data ingestion, data merging, and schema evolution.
6) Use Auto Loader and streaming tables to incrementally land data into the Bronze layer for DLT pipelines or Databricks SQL queries.

DLT supports two environment modes: 1) Development 2) Production
DLT pipelines support two refresh modes: 1) Continuous 2) Triggered

If the pipeline uses the triggered execution mode, the system stops processing after successfully refreshing all tables or selected tables in the pipeline once, ensuring each table that is part of the update is updated based on the data available when the update started.
If the pipeline uses continuous execution, Delta Live Tables processes new data as it arrives in data sources to keep tables throughout the pipeline fresh.
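Both choices are set in the pipeline's JSON settings. A minimal fragment might look like the following (the `name` value is hypothetical; `development` toggles Development vs. Production mode, and `continuous` selects continuous vs. triggered execution):

```json
{
  "name": "sample-dlt-pipeline",
  "development": true,
  "continuous": false
}
```

With `"continuous": false` the pipeline runs in triggered mode, stopping after one refresh; setting it to `true` keeps the pipeline running and processing new data as it arrives.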

The example below is a sample DLT pipeline notebook that follows the medallion architecture:
1) Ingest data into the Bronze layer (using Auto Loader for CSV & JSON file ingestion).
2) Ingest data into the Silver layer from the Bronze layer, applying check constraints for data quality along with data cleaning and transformation.
3) Prepare Gold layer tables with more refined data and share them with the BI & ML teams.
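The three steps above can be sketched as a DLT notebook like the one below. This is a hedged illustration, not the author's exact notebook: it only runs inside a Databricks DLT pipeline (the `dlt` module is not available elsewhere), and the table names, columns, and landing path are hypothetical.

```python
import dlt
from pyspark.sql.functions import col

# Bronze: incremental ingestion with Auto Loader (cloudFiles).
@dlt.table(comment="Raw orders landed from CSV files")
def orders_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("cloudFiles.inferColumnTypes", "true")
        .load("/mnt/raw/orders/")  # hypothetical landing path
    )

# Silver: data-quality expectations plus basic cleaning/casting.
@dlt.table(comment="Cleaned and validated orders")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
@dlt.expect_or_drop("positive_amount", "amount > 0")
def orders_silver():
    return (
        dlt.read_stream("orders_bronze")
        .select(
            col("order_id").cast("bigint"),
            col("amount").cast("double"),
            "order_date",
        )
    )

# Gold: refined aggregate for the BI & ML teams.
@dlt.table(comment="Daily revenue for BI/ML consumers")
def daily_revenue_gold():
    return dlt.read("orders_silver").groupBy("order_date").sum("amount")
```

The `@dlt.expect_or_drop` decorators are DLT's declarative data-quality checks: rows that fail a constraint are dropped from the Silver table, and the violation counts show up in the pipeline's event log.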

Disadvantage: DLT puts all of a pipeline's tables in one schema/database. If you want to create the Bronze, Silver, and Gold layer tables in different schemas/databases, you can't implement that with DLT; all Bronze, Silver, and Gold layer tables have to be created in one database/schema.

Sourav Das
