Migrating a huge table from Synapse to Databricks
11-21-2024 07:55 AM
Hi,
We are looking for an option to copy tables of more than 50 TB from Synapse to Databricks on a weekly basis. Please suggest if there are any feasible ways to do this.
We are currently using the connector below, but it is taking too long to copy:
https://learn.microsoft.com/en-us/azure/databricks/connect/external-systems/synapse-analytics
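For reference, this is roughly the kind of read we are doing with the connector (a sketch only, with placeholder server, table and storage names, not our actual configuration):

-- Sketch of a typical Azure Synapse connector read from Databricks SQL;
-- all names and URLs below are placeholders.
CREATE TABLE synapse_source_table
USING com.databricks.spark.sqldw
OPTIONS (
  url 'jdbc:sqlserver://<server>.sql.azuresynapse.net:1433;database=<database>;encrypt=true',
  forwardSparkAzureStorageCredentials 'true',
  dbTable 'dbo.big_table',
  tempDir 'abfss://staging@<storageaccount>.dfs.core.windows.net/tmp'
);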
11-21-2024 08:05 AM
I suppose this is a Dedicated SQL pool you are talking about?
If so, I'd use Data Factory to extract the data from Synapse and land it in ADLS, e.g. as Parquet. Then create a UC/Hive table on top of that.
11-21-2024 08:11 AM
Thanks, Warners.
Are there any documents on this from the Databricks side that we can refer to?
11-22-2024 12:25 AM
There is no Databricks documentation on this, as Databricks is only involved for a very small part:
CREATE TABLE catalog.schema.table USING PARQUET LOCATION 'url_to_the_parquet_files';
All the rest is done in Azure Data Factory, or you can even use the built-in pipelines of Synapse (in Synapse Studio).
There is a standard connector for Synapse and ADLS. The pipeline itself converts the data to Parquet and writes it.
When that is done, run the CREATE TABLE on Databricks.
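Spelled out with placeholder names and an example ADLS path (just a sketch, adjust catalog, schema, table and storage account to your setup):

-- Register the Parquet files landed by the ADF pipeline as a table.
-- Catalog, schema, table and path are placeholders for illustration.
CREATE TABLE main.migration.big_table
USING PARQUET
LOCATION 'abfss://landing@<storageaccount>.dfs.core.windows.net/synapse_export/big_table/';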
Copy and transform data in Azure Synapse Analytics - Azure Data Factory & Azure Synapse | Microsoft ...

