Hi @learning_1989, to move data from Azure Data Factory (ADF) into Delta tables in Databricks, the recommended approach is to use the Copy activity, which is available in both Azure Data Factory and Azure Synapse Analytics.
Here's a breakdown of the necessary steps:
1. First, create a cluster in Azure Databricks that ADF can use for the data movement.
2. Next, create a linked service that connects to your Azure Databricks Delta Lake.
3. Use the Copy activity to move data from your source data store into a Delta Lake table in Azure Databricks. The same activity can also copy data from a Delta Lake table out to any supported sink data store.
4. The Copy activity performs the actual data movement on your Azure Databricks cluster, so the cluster does the heavy lifting (see the sketch below for how the linked service and pipeline fit together).
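Here is a minimal sketch, assuming you script the setup in Python against the Azure management REST API: it registers an Azure Databricks Delta Lake linked service (step 2) and a pipeline containing a Copy activity (step 3). The subscription, resource group, factory, cluster ID, dataset names, and tokens are placeholders, and the dataset/source types are assumptions you would adjust to your actual source.

```python
# Hypothetical sketch: register an Azure Databricks Delta Lake linked service and a
# Copy-activity pipeline in ADF via the Azure management REST API.
# All names, IDs, and tokens below are placeholders.
import requests
from azure.identity import DefaultAzureCredential  # pip install azure-identity

SUBSCRIPTION = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FACTORY = "<data-factory-name>"
API_VERSION = "2018-06-01"

# Acquire an ARM token for the management plane.
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

base = (
    f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
    f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.DataFactory/factories/{FACTORY}"
)

# Step 2: linked service pointing at the Databricks workspace and the cluster from step 1.
linked_service = {
    "properties": {
        "type": "AzureDatabricksDeltaLake",
        "typeProperties": {
            "domain": "https://adb-1234567890123456.7.azuredatabricks.net",
            "clusterId": "<interactive-cluster-id>",
            "accessToken": {"type": "SecureString", "value": "<databricks-pat>"},
        },
    }
}
requests.put(
    f"{base}/linkedservices/AzureDatabricksDeltaLakeLS?api-version={API_VERSION}",
    headers=headers, json=linked_service,
).raise_for_status()

# Step 3: pipeline with a Copy activity writing to a Delta Lake table dataset.
# Both datasets are assumed to exist already in the factory; depending on the source
# format and store, you may also need to enable staged copy (enableStaging).
pipeline = {
    "properties": {
        "activities": [
            {
                "name": "CopySourceToDelta",
                "type": "Copy",
                "inputs": [{"referenceName": "SourceDataset", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "DeltaLakeDataset", "type": "DatasetReference"}],
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},   # assumed delimited-text source
                    "sink": {"type": "AzureDatabricksDeltaLakeSink"},
                },
            }
        ]
    }
}
requests.put(
    f"{base}/pipelines/CopyToDeltaPipeline?api-version={API_VERSION}",
    headers=headers, json=pipeline,
).raise_for_status()
```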
When writing data to Delta Lake, the Copy activity invokes your Azure Databricks cluster to read the data from Azure Storage, which is either the original source or a staging area where the service first lands the source data. Likewise, when reading data from Delta Lake, the Copy activity invokes the cluster to write the data to Azure Storage. In addition, you can pass custom parameters to a Delta Live Tables pipeline invoked from Azure Data Factory by calling the Databricks REST API 2.0.
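If you want to trigger a Delta Live Tables pipeline through the REST API 2.0 (for example from an ADF Web activity), a rough sketch looks like this; the workspace URL, token, and pipeline ID are placeholders, and the way custom values reach your pipeline (the pipeline's configuration map) is an assumption to verify against your setup.

```python
# Rough sketch: start a Delta Live Tables pipeline update through the
# Databricks REST API 2.0 (the same call ADF can issue from a Web activity).
import requests

DATABRICKS_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
DATABRICKS_TOKEN = "<databricks-pat>"
PIPELINE_ID = "<dlt-pipeline-id>"

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/pipelines/{PIPELINE_ID}/updates",
    headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
    json={"full_refresh": False},  # set True to reprocess all tables
)
resp.raise_for_status()
print("update_id:", resp.json().get("update_id"))

# Custom parameters are typically surfaced to the DLT code through the pipeline's
# "configuration" key/value map (edited separately on the pipeline definition) and
# read inside the pipeline with spark.conf.get("<your-key>").
```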
Note: The Databricks cluster must be able to access the Azure Blob Storage or Azure Data Lake Storage Gen2 account that holds the container or file system used for the source, sink, or staging area, as well as the container or file system where you write the Delta Lake tables. If you use Azure Data Lake Storage Gen2, configure a service principal in the cluster's Apache Spark configuration.
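For the ADLS Gen2 case, the service-principal (OAuth) settings go into the cluster's Spark configuration. The following illustrative sketch creates such a cluster through the Databricks Clusters API 2.0; the runtime version, VM size, storage account, tenant ID, and secret-scope names are all placeholder assumptions.

```python
# Illustrative sketch: create a Databricks cluster whose Spark config carries the
# ADLS Gen2 service-principal (OAuth) settings described above.
import requests

DATABRICKS_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
DATABRICKS_TOKEN = "<databricks-pat>"
STORAGE_ACCOUNT = "<storage-account>"
TENANT_ID = "<tenant-id>"

spark_conf = {
    f"fs.azure.account.auth.type.{STORAGE_ACCOUNT}.dfs.core.windows.net": "OAuth",
    f"fs.azure.account.oauth.provider.type.{STORAGE_ACCOUNT}.dfs.core.windows.net":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    f"fs.azure.account.oauth2.client.id.{STORAGE_ACCOUNT}.dfs.core.windows.net":
        "<service-principal-application-id>",
    # Reference the client secret from a Databricks secret scope rather than hard-coding it.
    f"fs.azure.account.oauth2.client.secret.{STORAGE_ACCOUNT}.dfs.core.windows.net":
        "{{secrets/<scope>/<secret-name>}}",
    f"fs.azure.account.oauth2.client.endpoint.{STORAGE_ACCOUNT}.dfs.core.windows.net":
        f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/token",
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
    json={
        "cluster_name": "adf-copy-cluster",
        "spark_version": "13.3.x-scala2.12",  # placeholder runtime version
        "node_type_id": "Standard_DS3_v2",    # placeholder VM size
        "num_workers": 2,
        "autotermination_minutes": 30,
        "spark_conf": spark_conf,
    },
)
resp.raise_for_status()
print("cluster_id:", resp.json()["cluster_id"])
```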