Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How do you pass data from ADF into a Delta table or DataFrame in Databricks?

learning_1989
New Contributor II

How do you pass data from Azure Data Factory (ADF) into a Delta table or a DataFrame in Databricks? What is the recommended way to do it?

1 ACCEPTED SOLUTION


Kaniz
Community Manager

Hi @learning_1989, to move data from Azure Data Factory (ADF) into Delta tables in Databricks, the recommended approach is the Copy activity, which is available in both Azure Data Factory and Azure Synapse pipelines.

 

Here's a breakdown of the necessary steps: 

 

1. Set up a cluster in Azure Databricks to handle the data movement.

2. Create a linked service that connects to your Azure Databricks Delta Lake.

3. Use the Copy activity to move data between your source data store and the Delta Lake table in Azure Databricks. The same activity can also copy data from the Delta Lake table back to any supported sink data store.

4. The Copy activity runs the actual data movement on your Azure Databricks cluster, so the cluster's compute does the work. (A notebook-based alternative is sketched below.)
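If ADF lands the source files in cloud storage and then calls a Databricks Notebook activity instead of (or in addition to) the Copy activity, the notebook can load those files into a DataFrame and write them to a Delta table, which also answers the "df in Databricks" part of the question. The following is only a minimal sketch; the storage path, file format, and table name are hypothetical placeholders, not values from this thread:

# Minimal sketch: load files staged by ADF and write them to a Delta table.
# Assumes this runs in a Databricks notebook (spark is provided) and that
# the cluster can already read the ADLS Gen2 path below (hypothetical path).

source_path = "abfss://staging@mystorageaccount.dfs.core.windows.net/orders/"  # hypothetical

# Read the staged files into a DataFrame (format depends on what ADF landed).
df = (spark.read
      .format("parquet")
      .load(source_path))

# Write the DataFrame out as a Delta table; append or overwrite as needed.
(df.write
   .format("delta")
   .mode("append")
   .saveAsTable("bronze.orders"))  # hypothetical target table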

 

When copying data into Delta Lake, the Copy activity has the Azure Databricks cluster read the data from Azure Storage, which is either the original source or a staging area where the service first lands the source data. Likewise, when copying data out of Delta Lake, the Copy activity has the cluster write the data to Azure Storage. In addition, custom parameters can be passed to a Delta Live Tables pipeline when it is invoked from Azure Data Factory through the Databricks REST API.
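For the simpler case of an ADF Databricks Notebook activity (rather than a Delta Live Tables pipeline), parameters defined under the activity's baseParameters arrive in the notebook as widgets. A minimal sketch, with hypothetical parameter names that would have to match the ADF activity configuration:

# Minimal sketch: read parameters passed from an ADF Notebook activity.
# The parameter names ("load_date", "target_table") are hypothetical and
# must match the baseParameters configured on the ADF activity.

load_date = dbutils.widgets.get("load_date")
target_table = dbutils.widgets.get("target_table")

print(f"Loading data for {load_date} into {target_table}")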

 

Attention: The Databricks cluster must be able to access the Azure Blob Storage or Azure Data Lake Storage Gen2 account used for the source, sink, or staging container/file system, as well as the container or file system where the Delta Lake tables are written. If you use Azure Data Lake Storage Gen2, set up a service principal in the Databricks cluster's Apache Spark configuration.
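The service principal settings mentioned above normally go into the cluster's Spark configuration; the equivalent session-level form from a notebook, with credentials pulled from a secret scope, looks roughly like this. The storage account name, secret scope, and key names below are placeholders:

# Minimal sketch: configure OAuth (service principal) access to ADLS Gen2.
# "mystorageaccount" and the "my-scope" secret scope/keys are hypothetical;
# in practice these settings are usually set in the cluster's Spark config.

storage_account = "mystorageaccount"
client_id = dbutils.secrets.get(scope="my-scope", key="sp-client-id")
client_secret = dbutils.secrets.get(scope="my-scope", key="sp-client-secret")
tenant_id = dbutils.secrets.get(scope="my-scope", key="sp-tenant-id")

spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net", client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net", client_secret)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")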


3 REPLIES


learning_1989
New Contributor II

Thanks a lot for the explanation

Kaniz
Community Manager

Hey there! Thanks a bunch for being part of our awesome community! 🎉 

 

We love having you around and appreciate all your questions. Take a moment to check out the responses – you'll find some great info. Your input is valuable, so pick the best solution for you. And remember, if you ever need more help, we're here for you!

 

Keep being awesome! 😊🚀

 

 

 
