Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How do you pass data from ADF to Delta tables or a DataFrame in Databricks?

learning_1989
New Contributor II

I need to pass data from Azure Data Factory (ADF) into a Delta table or a DataFrame in Databricks. How do I do it?

1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz_Fatma
Community Manager

Hi @learning_1989, to transfer data from Azure Data Factory (ADF) to Delta tables in Databricks, the approach I recommend is the Copy activity, which is available in both Azure Data Factory and Azure Synapse.

 

Here's a breakdown of the necessary steps: 

 

1. Set up a cluster in Azure Databricks to run the data movement.

2. Create a linked service that connects to your Azure Databricks Delta Lake.

3. Use the Copy activity to move data from your source data store to the Delta Lake table in Azure Databricks. The same activity can also copy data from a Delta Lake table to any supported sink data store.

4. The Copy activity uses your Databricks cluster's compute to perform the data movement.
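
To make step 3 more concrete, here is a rough sketch of what a Copy activity definition with a Delta Lake sink could look like, built as a Python dict and serialized to the JSON that ADF pipelines use. The property names (`AzureDatabricksDeltaLakeSink`, `enableStaging`, `stagingSettings`) are written from memory of the ADF connector documentation, so please verify them against the current docs. The helper `build_copy_activity` and all dataset, linked service, and path names are hypothetical placeholders.

```python
import json


def build_copy_activity(source_dataset: str, sink_dataset: str,
                        staging_linked_service: str, staging_path: str) -> dict:
    """Return a Copy activity definition that writes to a Delta Lake table.

    Every name passed in is a placeholder for a dataset or linked
    service you would have created in your own data factory.
    """
    return {
        "name": "CopyToDeltaLake",  # hypothetical activity name
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            # Assumes a CSV source; other source types are possible.
            "source": {"type": "DelimitedTextSource"},
            "sink": {"type": "AzureDatabricksDeltaLakeSink"},
            # Staged copy: the service first writes the source data to
            # Azure Storage, then the cluster loads it from there.
            "enableStaging": True,
            "stagingSettings": {
                "linkedServiceName": {
                    "referenceName": staging_linked_service,
                    "type": "LinkedServiceReference",
                },
                "path": staging_path,
            },
        },
    }


activity = build_copy_activity(
    source_dataset="SourceCsvDataset",
    sink_dataset="DeltaLakeTableDataset",
    staging_linked_service="StagingBlobStorage",
    staging_path="staging-container/adf-staging",
)
activity_json = json.dumps(activity, indent=2)
```

The resulting `activity_json` would go into the `activities` array of a pipeline definition. Staging is only needed when the source is not directly readable by the cluster.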

 

When copying data into Delta Lake, the Copy activity invokes the Azure Databricks cluster to read the data from Azure Storage. That storage is either your original source or a staging area where the service first writes the source data. Likewise, when copying data out of Delta Lake, the Copy activity invokes the Azure Databricks cluster to write the data to Azure Storage. Separately, custom parameters can be passed to a Delta Live Tables pipeline through the REST API when it is invoked from Azure Data Factory.
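
For the Delta Live Tables part, here is a rough sketch of the kind of HTTP request an ADF Web activity (or any other client) could send to start a pipeline update. It assumes the Pipelines REST API endpoint `POST /api/2.0/pipelines/{pipeline_id}/updates` with a `full_refresh` flag. The workspace URL, pipeline ID, and token are placeholders, and `build_start_update_request` is a hypothetical helper. The explanation above does not say which payload carries the custom parameters, so the sketch leaves that part out. The request is only built here, not sent.

```python
import json
import urllib.request


def build_start_update_request(workspace_url: str, pipeline_id: str,
                               token: str,
                               full_refresh: bool = False) -> urllib.request.Request:
    """Build (but do not send) a request that starts a pipeline update."""
    url = f"{workspace_url.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates"
    body = json.dumps({"full_refresh": full_refresh}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; substitute your own workspace, pipeline, and token.
request = build_start_update_request(
    workspace_url="https://adb-1234567890123456.7.azuredatabricks.net/",
    pipeline_id="my-pipeline-id",
    token="<personal-access-token>",
)
# To actually send it: urllib.request.urlopen(request)
```

In ADF, the same URL, method, headers, and body would be entered in a Web activity rather than sent from Python.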

 

Attention: The Databricks cluster must have access to an Azure Blob Storage or Azure Data Lake Storage Gen2 account. This covers both the storage container or file system used for source, sink, or staging and the container or file system where you plan to write the Delta Lake tables. To use Azure Data Lake Storage Gen2, configure a service principal on the Databricks cluster as part of its Apache Spark configuration.
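
As an illustration of the service principal setup, the sketch below generates the Spark configuration entries commonly used for OAuth access to ADLS Gen2 from a Databricks cluster. It assumes the standard ABFS driver keys with the `spark.hadoop.` prefix and a Databricks secret reference for the client secret. The helper `adls_gen2_spark_conf`, the storage account, IDs, and secret scope names are all hypothetical placeholders.

```python
def adls_gen2_spark_conf(storage_account: str, client_id: str, tenant_id: str,
                         secret_scope: str, secret_key: str) -> dict:
    """Return cluster Spark config entries for service principal (OAuth) access."""
    host = f"{storage_account}.dfs.core.windows.net"
    prefix = "spark.hadoop.fs.azure.account"
    # Reference the client secret from a secret scope instead of pasting it in.
    secret_ref = "{{secrets/" + secret_scope + "/" + secret_key + "}}"
    return {
        f"{prefix}.auth.type.{host}": "OAuth",
        f"{prefix}.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"{prefix}.oauth2.client.id.{host}": client_id,
        f"{prefix}.oauth2.client.secret.{host}": secret_ref,
        f"{prefix}.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


conf = adls_gen2_spark_conf(
    storage_account="mystorageaccount",
    client_id="<application-id>",
    tenant_id="<tenant-id>",
    secret_scope="my-scope",
    secret_key="sp-client-secret",
)
# One "key value" pair per line, the layout used by the cluster's Spark config box.
spark_config_text = "\n".join(f"{key} {value}" for key, value in conf.items())
```

The text in `spark_config_text` is meant to be pasted into the cluster's Spark config field. The service principal also needs a suitable role on the storage account, which is configured in Azure rather than here.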


3 REPLIES

Thanks a lot for the explanation

Kaniz_Fatma
Community Manager

Hey there! Thanks a bunch for being part of our awesome community! 🎉

We love having you around and appreciate all your questions. Take a moment to check out the responses; you'll find some great info. Your input is valuable, so pick the best solution for you. And remember, if you ever need more help, we're here for you!

Keep being awesome! 😊🚀

 

 

 
