08-17-2024 09:40 PM
Hello team,
I have a requirement to move all tables from Azure Synapse (dedicated SQL pool) to Databricks.
Data arrives frequently from the source into Azure Data Lake.
We use Azure Data Factory to load the data into Synapse (a data flow does the basic transformation).
We need to migrate the existing data (tables) from Synapse to Databricks.
New data from Azure Data Lake should then be ingested directly into Databricks, bypassing Synapse.
Can you please let me know the best approach to do this?
Your insights are valuable! Thanks
a month ago
@Krizofe You can use Azure Databricks to query and load data directly from Azure Synapse using Apache Spark. Databricks has built-in connectors that allow it to read from Synapse dedicated SQL pools (formerly Azure SQL Data Warehouse).
Load the data from Synapse and then write it to ADLS.
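As a rough illustration of that suggestion, here is a minimal PySpark sketch that reads one table through the Azure Synapse connector and saves it as a Delta table. The JDBC URL, the `tempDir` path, the table names, and the helper names are all hypothetical placeholders, and the authentication setup (storage credentials forwarded from the Spark session) is an assumption that may not match your workspace. Note that this connector stages data in the `tempDir` storage location during the transfer.

```python
def synapse_read_options(jdbc_url, temp_dir, table):
    """Build reader options for the Azure Synapse connector.

    All values passed in are placeholders supplied by the caller.
    """
    return {
        "url": jdbc_url,                                # JDBC URL of the dedicated SQL pool
        "tempDir": temp_dir,                            # staging folder used by the connector
        "forwardSparkAzureStorageCredentials": "true",  # assumes storage credentials are set in the session
        "dbTable": table,                               # source table, e.g. "dbo.orders"
    }


def copy_table_to_delta(spark, jdbc_url, temp_dir, source_table, target_table):
    """Read one Synapse table and save it as a Delta table.

    `spark` is the SparkSession that a Databricks notebook provides.
    """
    df = (
        spark.read.format("com.databricks.spark.sqldw")
        .options(**synapse_read_options(jdbc_url, temp_dir, source_table))
        .load()
    )
    # Overwrite mode is an assumption for a one-off migration of a full table.
    df.write.format("delta").mode("overwrite").saveAsTable(target_table)
```

In a notebook this could be called as `copy_table_to_delta(spark, url, "abfss://<container>@<account>.dfs.core.windows.net/tmp", "dbo.orders", "migrated.orders")`, with the placeholders replaced by your own values.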
08-17-2024 11:21 PM
Use Azure Data Factory (ADF):
Configure a pipeline in ADF to copy data from Synapse SQL to Azure Data Lake. Set up an ADF Copy Activity to handle this data transfer.
Source Dataset: Azure Synapse SQL table.
Sink Dataset: Azure Data Lake Storage or Azure Blob Storage.
Data Movement Activity: Use the Copy Data activity in ADF.
Create Delta Tables in Databricks:
Automate Data Ingestion:
Adjust ADF Pipelines:
Update your Azure Data Factory pipelines to load new data directly into Databricks instead of Synapse.
Source Dataset: Azure Data Lake Storage.
Sink Dataset: Databricks Delta table.
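For the second half of the requirement (new files landing in Azure Data Lake going straight into Databricks), one option that is not mentioned in this thread is Databricks Auto Loader, used here instead of an ADF pipeline. The sketch below is only an illustration under assumptions: the source file format (`parquet`), the paths, the table name, and the helper names are hypothetical.

```python
def autoloader_options(file_format, schema_location):
    """Build Auto Loader (cloudFiles) reader options."""
    return {
        "cloudFiles.format": file_format,              # format of the incoming files (assumed)
        "cloudFiles.schemaLocation": schema_location,  # where the inferred schema is tracked
    }


def ingest_new_files(spark, source_path, target_table, checkpoint_path, file_format="parquet"):
    """Incrementally load files that arrived in `source_path` into a Delta table.

    `spark` is the SparkSession that a Databricks notebook provides.
    """
    stream = (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_options(file_format, checkpoint_path + "/schema"))
        .load(source_path)
    )
    return (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)  # records which files were already processed
        .trigger(availableNow=True)                     # process the current backlog, then stop
        .toTable(target_table)
    )
```

Run on a schedule, a job like this picks up only the files that arrived since the previous run, because progress is stored in the checkpoint location.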
08-17-2024 11:37 PM
Hello Rishabh Pandey,
Thanks for the reply, but I have some concerns.
The data volume is huge, in the billions of rows, and the data gets deleted once it is loaded into Synapse.
Pushing all that data back from Synapse to an Azure Storage account would be time-consuming and would increase storage costs. (Note: we have 3k tables with billions of rows.)
So, is there any approach to move the data directly to Databricks without staging it in Azure Storage?
Thanks for the reply, mate.
Your insights will be valuable.
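One way to avoid an explicit export step is a plain Spark JDBC read, which pulls rows straight from the dedicated SQL pool into Databricks. The sketch below is a hedged illustration, not a recommendation from this thread: the URL, table names, partition column, bounds, and helper names are hypothetical, credentials are assumed to be part of the JDBC URL, and for tables with billions of rows a JDBC read may be considerably slower than a connector that stages data.

```python
def jdbc_read_options(jdbc_url, table, partition_column, lower_bound, upper_bound, num_partitions):
    """Build Spark JDBC options for a parallel, range-partitioned read."""
    return {
        "url": jdbc_url,
        "dbtable": table,
        "partitionColumn": partition_column,  # numeric, date, or timestamp column
        "lowerBound": str(lower_bound),       # bounds only control how ranges are split,
        "upperBound": str(upper_bound),       # they do not filter rows
        "numPartitions": str(num_partitions), # number of parallel connections
    }


def migrate_table(spark, jdbc_url, table, target_table,
                  partition_column, lower_bound, upper_bound, num_partitions=32):
    """Read one table over JDBC and save it as a Delta table.

    `spark` is the SparkSession that a Databricks notebook provides.
    """
    df = (
        spark.read.format("jdbc")
        .options(**jdbc_read_options(jdbc_url, table, partition_column,
                                     lower_bound, upper_bound, num_partitions))
        .load()
    )
    df.write.format("delta").mode("overwrite").saveAsTable(target_table)
```

With thousands of tables, a function like `migrate_table` could be driven from a list of table names and partition columns, migrating the largest tables in separate runs.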