Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Opinions/Thoughts: SQL Best Practices in Production .. DBT vs DLT ?

BS_THE_ANALYST
Honored Contributor III

Hey everyone, 

I'd like to hear the experiences of the community on DLT (Lakeflow declarative pipelines) vs DBT. 

1. Why would one choose one instead of the other? 
2. How does picking one of these level up your SQL strategy?

I am somebody who's well-versed in using SQL to answer business problems (more of an analyst). I'd love to know the next steps for levelling up & using best practices.

All the best,
BS

1 ACCEPTED SOLUTION

Accepted Solutions

szymon_dybczak
Esteemed Contributor III

Hi @BS_THE_ANALYST ,

That's a really good question! I don't think there's a single definitive answer, since the choice depends on several factors. Each framework has its own advantages and disadvantages.

In my opinion, the strengths of dbt are:

- Wide adoption - dbt is very popular, with a large and active community.

- SQL on steroids - you can write transformations purely in SQL, but dbt also supports Jinja templating, which makes it easier to add loops, conditionals, and other logic.

- Auto-generated documentation

- Support for multiple environments - it fits nicely into dev/test/prod setups.

- Familiarity for analysts - since it's SQL-based, a team of BAs or analysts with strong SQL skills can be productive quickly.

- Open source - community-driven and not tied to a single vendor.
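
To make the "SQL on steroids" point a bit more concrete, here is a rough sketch in plain Python of what a templated loop buys you: one aggregate column per payment method is generated instead of being written by hand. It only mimics the kind of expansion a Jinja `for` loop performs in a dbt model and is not dbt code; the `payments` table, its columns, and the `render_pivot_sql` helper are all made up for illustration.

```python
def render_pivot_sql(methods, table="payments"):
    """Build a pivot query with one SUM(CASE ...) column per payment method.

    `payments`, `order_id`, `payment_method` and `amount` are hypothetical names.
    """
    # One generated column per method: this is the part a template loop replaces.
    columns = [
        f"    sum(case when payment_method = '{m}' then amount else 0 end) as {m}_amount"
        for m in methods
    ]
    select_list = ",\n".join(["    order_id"] + columns)
    return f"select\n{select_list}\nfrom {table}\ngroup by order_id"


sql = render_pivot_sql(["card", "bank_transfer", "gift_card"])
print(sql)
```

Adding a fourth payment method then means changing one list rather than copying another `sum(case ...)` line, which is the main appeal of templating for an analyst who already knows SQL.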

That said, dbt has limitations. It focuses mainly on the "T" in ETL/ELT. It doesn't handle the extraction part, and its orchestration capabilities are more limited compared to native tools.

As for the strengths of Declarative Pipelines:

- Supports ingestion as well as transformation - full ETL process 

- SQL and Python support - Python adds flexibility for automation and complex transformations, which suits experienced teams.

- Tight integration with Databricks Unity Catalog and the broader ecosystem - since it's a native product, it often feels more seamless and coherent.

- Efficient incremental loading - I'd expect DLT to handle incremental loading better, thanks to features like the Enzyme engine and auto-optimization.

- Infrastructure management - DLT manages the underlying compute resources and integrates with Databricks Workflows, while dbt requires external orchestration tools.

- Streaming - DLT has native support for streaming data, whereas dbt can handle streaming via the dbt-databricks package.
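
For anyone newer to the idea, "incremental loading" just means processing only the rows that arrived since the last run. Below is a minimal, generic sketch of the high-water-mark version of that pattern, using Python's built-in `sqlite3` so it runs anywhere. It is not how Enzyme works internally and it is not DLT or dbt code; the table names, the `loaded_at` column, and the `incremental_load` helper are assumptions for illustration. The `where loaded_at > (select max(...))` filter is similar in spirit to what people commonly write by hand in dbt incremental models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table source_events (id integer primary key, loaded_at integer, payload text);
    create table target_events (id integer primary key, loaded_at integer, payload text);
""")


def incremental_load(conn):
    """Copy only source rows newer than the target's high-water mark."""
    cur = conn.execute("""
        insert into target_events
        select id, loaded_at, payload
        from source_events
        where loaded_at > (select coalesce(max(loaded_at), -1) from target_events)
    """)
    conn.commit()
    return cur.rowcount  # number of rows inserted by this run


conn.executemany(
    "insert into source_events values (?, ?, ?)",
    [(1, 100, "a"), (2, 200, "b")],
)
first_run = incremental_load(conn)   # target is empty, so both rows are copied

conn.execute("insert into source_events values (3, 300, 'c')")
second_run = incremental_load(conn)  # only the row past the watermark is copied
```

The second run touches one row instead of three. With a declarative framework you describe the target table and let the engine decide what needs recomputing, rather than maintaining a filter like this yourself.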

On the other hand, there are trade-offs. The entry barrier is higher, and for now, there's still an element of vendor lock-in. Databricks has announced plans to donate Declarative Pipelines to open source, but feature parity isn't there yet, and practically speaking, choosing it means committing to the Databricks ecosystem.

My personal take: If your entire platform is already built on Databricks, Declarative Pipelines are a strong choice. If you need flexibility to run your pipelines on other databases or cloud platforms, dbt might be the safer choice.


2 REPLIES

BS_THE_ANALYST
Honored Contributor III

That's a cracking write-up, @szymon_dybczak. Thanks for that.

That's certainly given me some food for thought. I think the safest option here, at least for me, is digging into both of them. I feel better informed moving forward with this. 

I'd love to hear about people's experiences of working with either DBT or DLT and how they've found it in practice: the pros and cons they've run into and, most importantly, what they'd have done differently whilst adopting DBT or DLT.

All the best,
BS
