Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

What are the advantages of using Delta Live tables (DLT) over Data Build Tool (dbt) in Databricks?

Prachi_Sankhala
New Contributor

Please explain with some use cases which show the difference between DLT and dbt.

1 ACCEPTED SOLUTION


Priyag1
Honored Contributor II

@Prachi Sankhala These are some of the use cases.


7 REPLIES

Priyag1
Honored Contributor II

Data build tool (dbt) is a transformation tool that aims to simplify the work of the analytics engineer in the data pipeline workflow. It implements only the Transform step of the ETL process.

Delta Live Tables (DLT), on the other hand, is a framework that makes it easier to build data pipelines and control data quality. It covers the whole ETL process and is integrated into Databricks.

Priyag1
Honored Contributor II

Use Cases :

Data lineage graph

A data lineage graph makes data teams more efficient: problems in the data can be found more easily and more quickly. It also makes it simpler for new team members, data analysts, and other colleagues to understand the data pipeline.

  • dbt

The data lineage graph includes the source tables in the data warehouse, the tables after the different transformations, and the dashboard in which the business value of the tables is displayed. It is, however, not possible to hover over or click on the tables to see more information.

  • Delta Live Tables

The data lineage graph shows the tables that load data from the data lake and the different tables after transformation. More information about each table can be obtained by clicking on it.

  • Comparison

Both show the source tables and the transformations, but only dbt also shows the end point of the data, whether that is a dashboard, an application, or a data science pipeline.
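For context, both tools infer this lineage from references inside the transformation code rather than from separate configuration. A minimal sketch, using hypothetical table and model names:

```sql
-- dbt: lineage is inferred from ref() calls between models
-- models/orders_enriched.sql (hypothetical model)
select o.order_id, c.customer_name
from {{ ref('stg_orders') }} o
join {{ ref('stg_customers') }} c
  on o.customer_id = c.customer_id

-- Delta Live Tables: lineage is inferred from LIVE. references between datasets
CREATE OR REFRESH LIVE TABLE orders_enriched AS
SELECT o.order_id, c.customer_name
FROM LIVE.stg_orders o
JOIN LIVE.stg_customers c
  ON o.customer_id = c.customer_id;
```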

Priyag1
Honored Contributor II

Incremental tables

With big data, the volume is so vast that it is impractical to reload all the data every time a few rows are added; the delays would be enormous. To solve this, incremental tables load only the new rows, so a transformation that first took a few hours can drop to a few seconds.

  • dbt

An incremental model builds the whole table the first time it is run; on later runs, the SQL is adapted so that only the new data is transformed incrementally (a sketch of both approaches follows the comparison below).

  • Delta Live Tables

Both incremental transformation and incremental loading of the data from the data lake into the data warehouse are possible.

  • Comparison

Both can incrementally transform data, but Delta Live Tables can also incrementally load data.
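To make the difference concrete, here is a minimal sketch of both approaches, with hypothetical table, column, and path names. The dbt model is plain SQL plus Jinja that only transforms new rows on incremental runs; the Delta Live Tables definition uses streaming tables that also load new files incrementally from the data lake.

```sql
-- dbt incremental model: models/events_clean.sql (hypothetical)
{{ config(materialized='incremental', unique_key='event_id') }}

select event_id, event_type, cast(event_ts as timestamp) as event_ts
from {{ source('raw', 'events') }}
{% if is_incremental() %}
  -- on incremental runs, only transform rows newer than what is already in the target
  where event_ts > (select max(event_ts) from {{ this }})
{% endif %}

-- Delta Live Tables SQL (hypothetical landing path): loads AND transforms incrementally
CREATE OR REFRESH STREAMING LIVE TABLE raw_events
AS SELECT * FROM cloud_files('/mnt/landing/events', 'json');

CREATE OR REFRESH STREAMING LIVE TABLE clean_events
AS SELECT event_id, event_type, CAST(event_ts AS TIMESTAMP) AS event_ts
FROM STREAM(LIVE.raw_events);
```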

Priyag1
Honored Contributor II

Detailed logging

For the engineers who maintain the different infrastructures and connections, logging is very important to pinpoint what an error was and, more importantly, where it took place. With good logging, one can determine exactly when something went wrong and act immediately, so a production pipeline experiences little or no downtime.

  • dbt

The logs are generated from running a dbt command in its command line interface and are stored in a folder named logs in the project folder.

  • Delta Live Tables

  • The logs are directly integrated in the UI. More information about every event can be found when clicking on the particular event.
  • There are real-time live updates when running on the cluster.
  • Failures to meet data quality requirements are captured in the event log.
  • Automatic monitoring, recovery and management.

  • Comparison

Both produce extensive logs, but only Delta Live Tables has real-time updates and a visual interface to follow the process.
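As a small illustration of the Delta Live Tables side: the event log is itself stored as a Delta table under the pipeline's storage location, so it can be queried with SQL as well as browsed in the UI. A minimal sketch, assuming a hypothetical storage path and the standard event log columns (timestamp, event_type, level, message):

```sql
-- Query the DLT event log directly (the storage path is hypothetical)
SELECT timestamp, event_type, level, message
FROM delta.`/pipelines/my_pipeline_storage/system/events`
WHERE level = 'ERROR'                 -- failed events
   OR event_type = 'flow_progress'    -- includes data quality expectation results
ORDER BY timestamp DESC
LIMIT 100;
```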

Priyag1
Honored Contributor II

Programming language

In a team, not everyone knows the same programming languages, so the more languages a tool supports, the better: the language that most people on the team already know can be chosen. This speeds up development, since a smaller group needs to learn a new language.

  • dbt

The only language supported in dbt is SQL. Some Jinja needs to be known to create templates.

  • Delta Live Tables

The notebooks in which the tables are defined can be written in SQL or Python. A single notebook cannot mix SQL and Python, but different notebooks in the same pipeline can use different languages.

  • Comparison

Both support SQL, but only Delta Live Tables also supports Python.

Data warehouses

Not every company has every kind of data warehouse at its disposal. Setting up a new data warehouse just for a certain problem or tool, when one already exists, is quite excessive. It is worthwhile to look for a tool that works with the data warehouse you already have, unless a tool for another data warehouse offers a lot of benefits.

  • dbt

dbt officially supports various data warehouses, and more are community supported. On Databricks, for example, dbt reads its tables from the Databricks Hive metastore, and these tables can in turn point to various sources.

  • Delta Live Tables

It is directly integrated into Databricks, so any source that can be loaded into the Databricks Hive metastore can be used.

  • Comparison

Both can make use of different data sources such as a data lake, but only dbt can be used with, and run against, other data warehouses.

Priyag1
Honored Contributor II

@Prachi Sankhala These are some of the use cases.

Anonymous
Not applicable

Hi @Prachi Sankhala,

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
