Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Is anyone using Databricks Query Federation for ETL purposes?

dipanjannet
New Contributor II

Hello All,

We have a use case where we need to fetch data from a SQL Server that has some tables we want to consume. This is a typical OLTP setup in which the data arrives at regular intervals.

Now, as we have Unity Catalog enabled, we are interested in exploring Databricks Query Federation to ingest the data into our Storage Account. Please note that we do not just want to query the SQL Server tables.

We want to do a one-time full load and then incremental loads.

We are looking for suggestions for this use case. The Databricks documentation primarily suggests Query Federation for the following:

  • Ad hoc reporting.
  • Proof-of-concept work.
  • The exploratory phase of new ETL pipelines or reports.
  • Supporting workloads during incremental migration.

However, this is a fully fledged ETL workflow. Any suggestions on this setup?
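
To make the pattern concrete, here is a minimal sketch of how the full load and the incremental loads might be expressed as queries against a federated table. The catalog name `sqlserver_cat`, the table `dbo.orders`, and the `modified_at` watermark column are all hypothetical, and filtering on a watermark column is just one assumed way to do incremental loads; it is not something the Databricks documentation prescribes for Query Federation.

```python
from typing import Optional


def build_extract_query(source_table: str, watermark_column: str,
                        last_watermark: Optional[str]) -> str:
    """Return the SELECT used to pull rows from the federated table.

    With no stored watermark this is the one-time full load; otherwise
    only rows changed after the watermark are selected.
    """
    query = f"SELECT * FROM {source_table}"
    if last_watermark is not None:
        # Escape single quotes so the string literal stays well-formed.
        safe = last_watermark.replace("'", "''")
        query += f" WHERE {watermark_column} > '{safe}'"
    return query


# Hypothetical names: foreign catalog `sqlserver_cat`, column `modified_at`.
full_load = build_extract_query("sqlserver_cat.dbo.orders", "modified_at", None)
incremental = build_extract_query(
    "sqlserver_cat.dbo.orders", "modified_at", "2024-01-01 00:00:00"
)
print(full_load)
print(incremental)
```

On Databricks, the resulting string would be passed to `spark.sql(...)` and the result written to a Delta table; that part is left out so the sketch runs anywhere. Note that a watermark filter like this would not pick up rows that were hard-deleted in the source.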

#Query Federation 

 

2 REPLIES

nikhilj0421
Databricks Employee

Hi @dipanjannet, you can use DLT to do this.

Please check: https://docs.databricks.com/aws/en/dlt/transform

https://docs.databricks.com/aws/en/dlt/stateful-processing

Here is the step-by-step tutorial: https://docs.databricks.com/aws/en/dlt/tutorials

 

If you do not want to use DLT (and there are reasons not to), you can also check the docs for Auto Loader and MERGE notebooks.

These two do basically the same as DLT, but without the extra cost and with more control. You have to write more code, though.
For ingesting the SQL Server data I would use Data Factory, which lands the data in your bronze layer (ADLS Gen2).
Or use the Databricks Azure SQL connector, but that uses DLT and is more expensive than ADF; it is easier to use, but gives you less control and visibility.
So you see, there are many choices.
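
As a rough illustration of the merge step mentioned above, the sketch below builds a Delta `MERGE INTO` statement that upserts a batch of changed rows into a target table. The table names `main.silver.orders` and `orders_updates` and the key column `order_id` are hypothetical, and the helper `build_merge_sql` is made up for this example; it is not part of any Databricks API.

```python
from typing import Sequence


def build_merge_sql(target: str, source: str, keys: Sequence[str]) -> str:
    """Build a Delta MERGE statement that upserts `source` into `target`.

    Rows are matched on the given key columns; matched rows are updated
    and unmatched rows are inserted.
    """
    if not keys:
        raise ValueError("at least one key column is required")
    on_clause = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON {on_clause}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


# Hypothetical target table, staging view, and key column.
merge_sql = build_merge_sql("main.silver.orders", "orders_updates", ["order_id"])
print(merge_sql)
```

In a notebook the statement would be run with `spark.sql(merge_sql)` after the incoming batch has been registered under the source name. This is the "more code" side of the trade-off: you decide the keys and the update behaviour yourself.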

 

dipanjannet
New Contributor II

Hello @nikhilj0421, thank you for responding. The question is not about DLT. The question is: what is the use case for Databricks Query Federation? If we plug in Query Federation, what are the implications? What does Databricks suggest for that?
