Data Engineering

How do you pass data from ADF into a Delta table or a DataFrame in Databricks?

learning_1989
New Contributor II

How do you pass data from ADF into a Delta table or a DataFrame in Databricks?

1 ACCEPTED SOLUTION


Kaniz
Community Manager

Hi @learning_1989, to move data from Azure Data Factory (ADF) into Delta tables in Databricks, the recommended approach is the Copy activity, which is available in both Azure Data Factory and Azure Synapse pipelines.

 

Here's a breakdown of the necessary steps: 

 

1. Set up a cluster in Azure Databricks to carry out the data movement.

2. Create a linked service that connects to your Azure Databricks Delta Lake (a sketch of the definition follows this list).

3. Use the Copy activity to move data between your source data store and the Delta Lake table in Azure Databricks. The same activity can also copy data from the Delta Lake table to any supported sink data store.

4. In both directions, the Copy activity performs the actual data movement on your Azure Databricks cluster.
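For concreteness, here is a minimal sketch of the linked service and Copy activity definitions from steps 2 and 3, written as Python dicts that mirror the JSON you would author in ADF. The workspace URL, cluster ID, dataset names, and secret references are all placeholders, and the "type" values are taken from the ADF Azure Databricks Delta Lake connector; adjust everything to your environment.

```python
# Sketch of the ADF JSON for this flow, expressed as Python dicts.
# All names, IDs, and URLs below are placeholders.
import json

# Step 2: linked service pointing at the Databricks workspace and cluster.
linked_service = {
    "name": "AzureDatabricksDeltaLakeLS",
    "properties": {
        "type": "AzureDatabricksDeltaLake",
        "typeProperties": {
            "domain": "https://adb-1234567890123456.7.azuredatabricks.net",  # placeholder workspace URL
            "clusterId": "0123-456789-abcde000",  # placeholder interactive cluster
            "accessToken": {
                # Keep the access token out of the pipeline JSON via Key Vault.
                "type": "AzureKeyVaultSecret",
                "store": {"referenceName": "MyKeyVaultLS", "type": "LinkedServiceReference"},
                "secretName": "databricks-pat",
            },
        },
    },
}

# Step 3: Copy activity with a Delta Lake table as the sink.
copy_activity = {
    "name": "CopyToDeltaTable",
    "type": "Copy",
    "inputs": [{"referenceName": "SourceDataset", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "DeltaTableDataset", "type": "DatasetReference"}],
    "typeProperties": {
        "source": {"type": "DelimitedTextSource"},  # e.g. CSV files in Blob/ADLS
        "sink": {"type": "AzureDatabricksDeltaLakeSink"},
        "enableStaging": True,  # stage via storage the Databricks cluster can reach
    },
}

print(json.dumps(copy_activity, indent=2))
```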

 

When writing data to Delta Lake, the Copy activity uses the Azure Databricks cluster to read from Azure Storage, which is either the original source or a staging area where the service first lands the source data. Likewise, when reading data from Delta Lake, the Copy activity uses the cluster to write the data to Azure Storage. In addition, custom parameters can be passed to a Delta Live Tables pipeline through the REST API when it is invoked from Azure Data Factory (see the sketch below).
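As a rough sketch of that REST API pattern, the snippet below merges custom parameters into a DLT pipeline's configuration map and then starts an update; this is the same kind of call an ADF Web activity would make. The host, token, pipeline ID, and the run_date parameter are placeholders, and passing values through the configuration map assumes your pipeline code reads them back with spark.conf.get.

```python
# Sketch: pass custom parameters to a Delta Live Tables pipeline via the
# Databricks REST API, then start an update. All credentials and IDs are
# placeholders.
import requests

HOST = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder workspace URL
TOKEN = "dapi..."      # placeholder personal access token
PIPELINE_ID = "..."    # placeholder DLT pipeline id
headers = {"Authorization": f"Bearer {TOKEN}"}

# 1. Read the current pipeline spec and merge in the custom parameter.
spec = requests.get(f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}", headers=headers).json()["spec"]
spec.setdefault("configuration", {})["run_date"] = "2024-01-31"  # hypothetical parameter

# 2. Write the spec back, then start an update that will see the new value.
requests.put(f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}", headers=headers, json=spec).raise_for_status()
resp = requests.post(
    f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}/updates", headers=headers, json={"full_refresh": False}
)
resp.raise_for_status()
print("update id:", resp.json()["update_id"])
```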

 

Note: The Databricks cluster must have access to an Azure Blob Storage or Azure Data Lake Storage Gen2 account, covering both the storage container or file system used for the source, sink, or staging data and the container or file system where you plan to write the Delta Lake tables. To use Azure Data Lake Storage Gen2, set up a service principal on the Databricks cluster as part of its Apache Spark configuration, as sketched below.
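Here is a minimal sketch of that service principal setup, using the standard ABFS OAuth configuration keys. The storage account, tenant ID, and secret scope names are placeholders; in practice these keys normally go into the cluster's Spark config (with secret references) rather than being set per-notebook, but setting them from a notebook shows the shape.

```python
# Sketch: grant the cluster service-principal access to ADLS Gen2 via OAuth.
# `spark` and `dbutils` are the globals available in a Databricks notebook.
# Storage account, tenant, and secret scope/key names are placeholders.
storage_account = "mystorageaccount"                       # placeholder ADLS Gen2 account
tenant_id = "00000000-0000-0000-0000-000000000000"         # placeholder Azure AD tenant

spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth")
spark.conf.set(
    f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net",
    dbutils.secrets.get(scope="adls", key="sp-client-id"),      # placeholder scope/key
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net",
    dbutils.secrets.get(scope="adls", key="sp-client-secret"),  # placeholder scope/key
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
    f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
)
```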


3 REPLIES


learning_1989
New Contributor II

Thanks a lot for the explanation

Kaniz
Community Manager

Hey there! Thanks a bunch for being part of our awesome community! 🎉

 

We love having you around and appreciate all your questions. Take a moment to check out the responses – you'll find some great info. Your input is valuable, so pick the best solution for you. And remember, if you ever need more help, we're here for you!

 

Keep being awesome! 😊🚀

 

 

 
