SQL Server to Databricks Migration

Avinash_Narala
Valued Contributor II

Hi,

I want to build a Python function to migrate SQL Server tables to Databricks.

Are there any guides or best practices on how to do so?

It would be really helpful if there are.

Regards,

Avinash N

7 REPLIES

Edward357J
New Contributor II

Hello!

To migrate SQL Server tables to Databricks, ensure your SQL Server meets the version requirements and enable change tracking or CDC. Set up a dedicated database user for Databricks ingestion and use Databricks' LakeFlow Connect or other connectors to establish a connection between SQL Server and Databricks. Create an ingestion pipeline to transfer data and schedule it, monitoring its performance to ensure smooth data transfer. For detailed steps and best practices, refer to the official Databricks documentation and community forums.
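If you prefer to script the transfer yourself, here is a minimal sketch using the standard Spark JDBC reader (the hostname, database, credentials, and table names below are placeholders; in practice read the credentials from a secret scope):

# Hypothetical connection details -- replace with your own environment's values
jdbc_url = "jdbc:sqlserver://your-sql-server.example.com:1433;databaseName=your_database"
connection_props = {
    "user": "ingest_user",
    "password": "your_password",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}

# Read one SQL Server table over JDBC and write it as a Unity Catalog table
df = spark.read.jdbc(url=jdbc_url, table="dbo.your_table", properties=connection_props)
df.write.mode("overwrite").saveAsTable("catalog_name.schema_name.your_table")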

I hope this helps, Avinash!

filipniziol
Contributor III

Hi @Avinash_Narala ,

Can you share in detail what you are planning to do?
Do you want to:
1. Create only the table structure, or create the tables and migrate the data?

2. If it is a data migration, is it a one-time operation, or do you want your SQL Server to be the data source for regular data loads?

Hi @filipniziol,

I want to create the table structure and migrate the data to Databricks. It is lift and shift, so only a one-time operation.

filipniziol
Contributor III

Hi @Avinash_Narala ,

If it is lift and shift, then try this:

1. Set up Lakehouse Federation to SQL Server
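For example, the connection and foreign catalog for step 1 can be created from a notebook. A minimal sketch, assuming placeholder connection name, host, database, and credentials (in practice the password should come from a secret scope):

# Create the Unity Catalog connection to SQL Server (placeholder host/credentials)
spark.sql("""
  CREATE CONNECTION sql_server_conn TYPE sqlserver
  OPTIONS (
    host 'your-sql-server.example.com',
    port '1433',
    user 'ingest_user',
    password 'your_password'
  )
""")

# Expose the SQL Server database as a foreign catalog
spark.sql("""
  CREATE FOREIGN CATALOG sql_server_catalog_name
  USING CONNECTION sql_server_conn
  OPTIONS (database 'your_database')
""")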

2. Use CTAS statements to copy each table into Unity Catalog

 

CREATE TABLE catalog_name.schema_name.table_name
AS 
SELECT *
FROM sql_server_catalog_name.sql_server_schema_name.your_sql_server_table

 

3. Loop through a list of tables in Python for automation

 

tables_to_migrate = [
    # (source schema, source table, destination schema, destination table)
    ("sql_server_schema_name", "SQL_TABLE_A", "unity_catalog_schema", "DEST_TABLE_A"),
    ("sql_server_schema_name", "SQL_TABLE_B", "unity_catalog_schema", "DEST_TABLE_B")
]

# Run one CTAS per table from the federated SQL Server catalog into Unity Catalog
for src_schema, src_table, dest_schema, dest_table in tables_to_migrate:
    spark.sql(f"""
      CREATE TABLE my_catalog.{dest_schema}.{dest_table}
      AS
      SELECT *
      FROM sql_server_catalog_name.{src_schema}.{src_table}
    """)

 

Is there any way to first create only the metadata (table structure) and then migrate the data?

What you can do is, for example, just add a WHERE condition to the CTAS, like:

CREATE TABLE ...
AS 
SELECT *
FROM 
  ...
WHERE 1 = 0

Since no records meet the condition, only the table structure is created.
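Putting the two phases together, a minimal sketch (the catalog, schema, and table names mirror the hypothetical ones from the loop above):

# Phase 1: create only the table structure (the WHERE 1 = 0 filter returns no rows)
spark.sql("""
  CREATE TABLE my_catalog.unity_catalog_schema.DEST_TABLE_A
  AS
  SELECT *
  FROM sql_server_catalog_name.sql_server_schema_name.SQL_TABLE_A
  WHERE 1 = 0
""")

# Phase 2: migrate the data later with a plain INSERT
spark.sql("""
  INSERT INTO my_catalog.unity_catalog_schema.DEST_TABLE_A
  SELECT *
  FROM sql_server_catalog_name.sql_server_schema_name.SQL_TABLE_A
""")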

Also, this is a simplistic solution that should be considered a base for your script.
In the real world you may want to add liquid clustering, so your mapping list would include another field with the clustering columns, and the script would be modified to add a CLUSTER BY clause where needed.
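A sketch of that extension (the clustering column names are placeholders):

# Mapping extended with an optional list of clustering columns per table
tables_to_migrate = [
    ("sql_server_schema_name", "SQL_TABLE_A", "unity_catalog_schema", "DEST_TABLE_A", ["order_date"]),
    ("sql_server_schema_name", "SQL_TABLE_B", "unity_catalog_schema", "DEST_TABLE_B", []),
]

for src_schema, src_table, dest_schema, dest_table, cluster_cols in tables_to_migrate:
    # Add a CLUSTER BY clause only when clustering columns are defined for the table
    cluster_by = f"CLUSTER BY ({', '.join(cluster_cols)})" if cluster_cols else ""
    spark.sql(f"""
      CREATE TABLE my_catalog.{dest_schema}.{dest_table}
      {cluster_by}
      AS
      SELECT *
      FROM sql_server_catalog_name.{src_schema}.{src_table}
    """)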

 

It looks good, but doesn't it take more time to execute? That is, if we have a huge table and need only the table structure, won't it scan the whole table to evaluate the condition 1 = 0?
