
Ingestion Framework

fjrodriguez
New Contributor II

I would like to update my ingestion framework, which is currently orchestrated by ADF: it runs a couple of Databricks notebooks and then copies the data to the DB. I want to move everything onto Databricks, and I thought this could be the design:

Step 1. Expose target tables in Unity Catalog: create a UC external connection to Azure SQL DB and create the tables using this connection. Is this step needed?
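For reference, a minimal sketch of what Step 1 could look like with Lakehouse Federation, run from a notebook. The connection name, host, secret scope/keys, and database are all placeholders, not a definitive setup:

```python
# Sketch: expose Azure SQL DB in Unity Catalog via Lakehouse Federation.
# All names (connection, secret scope/keys, database) are hypothetical.
spark.sql("""
    CREATE CONNECTION IF NOT EXISTS azure_sql_conn TYPE sqlserver
    OPTIONS (
      host 'myserver.database.windows.net',
      port '1433',
      user secret('ingest-scope', 'sql-user'),
      password secret('ingest-scope', 'sql-password')
    )
""")

# A foreign catalog mirrors the remote database, so its tables become
# queryable and governed through UC without defining each table by hand.
spark.sql("""
    CREATE FOREIGN CATALOG IF NOT EXISTS azure_sql
    USING CONNECTION azure_sql_conn
    OPTIONS (database 'salesdb')
""")
```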

Step 2. Rewrite the stored procedure logic in UC, since SPs cannot be mounted the same way we do for tables or views; the suggestion is to rewrite them.
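If UC stored procedures fit the logic, a rough sketch of a rewritten procedure might look like this, assuming UC's SQL procedure support; the procedure name, parameter, and MERGE logic are invented placeholders:

```python
# Sketch: a T-SQL procedure rewritten as a UC SQL procedure.
# Catalog, schema, procedure, and tables are hypothetical.
spark.sql("""
    CREATE OR REPLACE PROCEDURE main.ingest.upsert_orders(run_date DATE)
    LANGUAGE SQL
    AS BEGIN
      MERGE INTO main.ingest.orders AS t
      USING (SELECT * FROM main.staging.orders WHERE load_date = run_date) AS s
      ON t.order_id = s.order_id
      WHEN MATCHED THEN UPDATE SET *
      WHEN NOT MATCHED THEN INSERT *;
    END
""")

# Invoked later from a SQL task or a notebook:
spark.sql("CALL main.ingest.upsert_orders(DATE'2025-01-01')")
```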

Step 3. Orchestrate in Databricks Workflows: create a workflow job with these tasks (see the sketch below):

- Notebook task → does validation / preprocessing if needed
- SQL task → executes CALL on the SP managed by UC
- Downstream tasks → e.g., write results to the DB
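As an illustration, the three tasks could be wired up with the Databricks Python SDK like this; the paths, warehouse ID, and task names are placeholders, and it assumes serverless job compute (otherwise add a cluster spec per task):

```python
# Sketch: create the Workflows job (notebook -> SQL -> downstream) via the SDK.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

job = w.jobs.create(
    name="ingestion-framework",
    tasks=[
        # Notebook task: validation / preprocessing
        jobs.Task(
            task_key="validate",
            notebook_task=jobs.NotebookTask(notebook_path="/Workspace/ingest/validate"),
        ),
        # SQL task: run the CALL statement from a SQL file on a SQL warehouse
        jobs.Task(
            task_key="call_sp",
            depends_on=[jobs.TaskDependency(task_key="validate")],
            sql_task=jobs.SqlTask(
                warehouse_id="<warehouse-id>",
                file=jobs.SqlTaskFile(path="/Workspace/ingest/call_sp.sql"),
            ),
        ),
        # Downstream task: write results back to Azure SQL DB
        jobs.Task(
            task_key="write_to_db",
            depends_on=[jobs.TaskDependency(task_key="call_sp")],
            notebook_task=jobs.NotebookTask(notebook_path="/Workspace/ingest/write_to_db"),
        ),
    ],
)
print(f"Created job {job.job_id}")
```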

 

The motivation is to leverage Unity Catalog and Workflows and deprecate ADF in the future, but I started this PoC as a preliminary step.

 

Is my assumption correct? Any recommendations?

ACCEPTED SOLUTION

saurabh18cs
Honored Contributor II

Hi @fjrodriguez, your assumptions are correct.

1) If you want to query or write to Azure SQL DB directly from Databricks SQL (using Unity Catalog), you need to create an External Connection in Unity Catalog and then define External Tables that point to your Azure SQL DB tables. This is not strictly required unless you want to manage these tables in UC for lineage and governance features.

2) UC supports stored procedures, but they are not the same as T-SQL procedures, and UC cannot call T-SQL procedures directly. Rewrite them as UC stored procedures or as notebook/SQL tasks in your workflow.

3) Databricks Workflows is sufficient to take over ADF pipeline duties such as orchestration, validation, etc.
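For the "write results to DB" piece, a minimal sketch of a downstream task is below. It reads through the UC foreign catalog and writes back over plain JDBC; the catalog name, JDBC URL, tables, and secret scope are all hypothetical:

```python
# Sketch: read via the UC foreign catalog, write results back to Azure SQL
# DB over JDBC. All names and the URL are placeholders.
df = spark.table("azure_sql.dbo.orders")           # read through federation

results = df.groupBy("customer_id").count()        # stand-in for real logic

(results.write
    .format("jdbc")
    .option("url", "jdbc:sqlserver://myserver.database.windows.net:1433;database=salesdb")
    .option("dbtable", "dbo.order_counts")
    .option("user", dbutils.secrets.get("ingest-scope", "sql-user"))
    .option("password", dbutils.secrets.get("ingest-scope", "sql-password"))
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .mode("overwrite")
    .save())
```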


