Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Urgency: How to do a data migration task using the Databricks Lakebridge tool?

shubham007
New Contributor III

Dear community experts,

I have completed the Analyzer and Converter phases of Databricks Lakebridge but am stuck at migrating the data from source to target using Lakebridge. I have watched the BrickBites series on Lakebridge but did not find anything on how to migrate data from source to target. I need guidance on how to migrate data such as tables and views from source to target so that I can proceed to the reconciliation phase and validate schema and data. I have taken Snowflake as the source, with a few sample tables and queries, to migrate to the Databricks platform as the target.

Thank you!

2 REPLIES

bianca_unifeye
New Contributor III

Lakebridge doesn’t copy data. It covers Assessment → Conversion (Analyzer/Converter) → Reconciliation.

The fastest way to get the data across is Lakehouse Federation: create a Snowflake connection in Unity Catalog and run federated queries from Databricks. For a permanent migration, materialize the tables into Delta with CREATE TABLE AS SELECT.
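
A minimal sketch of that CTAS step run from a notebook, assuming a hypothetical foreign catalog named snowflake_fed (backed by the Snowflake connection) and a hypothetical target schema main.migrated:

    python
    # Hypothetical names: snowflake_fed is the Unity Catalog foreign catalog for
    # the Snowflake connection; main.migrated is the target catalog.schema.
    spark.sql("""
        CREATE OR REPLACE TABLE main.migrated.my_table AS
        SELECT * FROM snowflake_fed.my_schema.my_table
    """)

Repeat per table, or drive it from a list of table names if you have many.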

mark_ott
Databricks Employee

To migrate tables and views from Snowflake (source) to Databricks (target) in a Lakebridge project, you export the data from Snowflake into supported cloud storage (usually as Parquet files) and then load those files into Databricks Delta tables. Lakebridge streamlines the surrounding work, in particular code and schema conversion and reconciliation, while the physical migration of large data volumes is performed by staging to cloud storage and loading into Databricks.

Step-by-Step Data Migration Process

1. Export Data from Snowflake

  • Use the Snowflake COPY INTO command to export your tables as Parquet files to a cloud storage bucket (S3, ADLS, or GCS). Example:

    sql
    COPY INTO 's3://your-bucket/path/'
      FROM my_database.my_schema.my_table
      FILE_FORMAT = (TYPE = PARQUET COMPRESSION = SNAPPY);
  • For large tables, use partitioning for export efficiency (e.g., partition by a date column).

2. Set Up Cloud Storage Access in Databricks

  • Ensure Databricks has access to your storage location by configuring the necessary credentials and permissions.
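
As one option, here is a minimal sketch of the legacy Spark-conf route for S3, assuming the keys live in a hypothetical secret scope named "migration"; Unity Catalog storage credentials and external locations are the recommended, more governed alternative:

    python
    # Legacy pattern: set S3A credentials on the session from a secret scope.
    # The scope and key names ("migration", "aws_access_key", "aws_secret_key") are placeholders.
    access_key = dbutils.secrets.get(scope="migration", key="aws_access_key")
    secret_key = dbutils.secrets.get(scope="migration", key="aws_secret_key")
    spark.conf.set("fs.s3a.access.key", access_key)
    spark.conf.set("fs.s3a.secret.key", secret_key)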

3. Load Data into Databricks Delta Tables

  • Use Databricks Notebooks or workflows to create Delta tables and load data from Parquet files:

    python
    df = spark.read.format("parquet").load("s3://your-bucket/path/")
    df.write.format("delta").save("/mnt/delta/target_table")
  • For continuous/streaming loads, use Auto Loader or Databricks workflows for incremental or live updates.
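
A hedged Auto Loader sketch for picking up newly arriving Parquet files; the bucket paths, schema/checkpoint locations, and target table name are placeholders:

    python
    # Incrementally ingest staged Parquet files into a Delta table with Auto Loader.
    # All paths and the table name are placeholders.
    (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "parquet")
          .option("cloudFiles.schemaLocation", "s3://your-bucket/_schemas/target_table/")
          .load("s3://your-bucket/path/")
          .writeStream
          .option("checkpointLocation", "s3://your-bucket/_checkpoints/target_table/")
          .trigger(availableNow=True)
          .toTable("main.migrated.target_table"))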

4. Migrate and Create Views/SQL Logic

  • After the data migration, convert your Snowflake views and SQL queries using Lakebridge’s Converter, validate the translated SQL, and deploy the scripts in Databricks as new views.
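
For instance, deploying a single converted view from a notebook could look like the sketch below; the view body is purely illustrative, and in practice you would use the SQL emitted by the Converter:

    python
    # Illustrative only: the real view text comes from the Lakebridge Converter output.
    spark.sql("""
        CREATE OR REPLACE VIEW main.migrated.daily_orders AS
        SELECT order_date, COUNT(*) AS order_count
        FROM main.migrated.orders
        GROUP BY order_date
    """)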

5. Reconciliation Preparation

  • Once data and views are migrated, use Lakebridge’s reconciliation tools to compare row counts, aggregates, and schemas between Snowflake and Databricks to ensure fidelity.
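
Alongside the Lakebridge reconcile reports, a quick hand-rolled spot check can be as simple as comparing counts over a federated connection; the catalog and table names below are placeholders:

    python
    # Manual spot check only; not a replacement for Lakebridge reconciliation.
    src_count = spark.table("snowflake_fed.my_schema.my_table").count()  # federated Snowflake read
    tgt_count = spark.table("main.migrated.my_table").count()            # migrated Delta table
    print(f"source={src_count}, target={tgt_count}, match={src_count == tgt_count}")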

Key Reminders

  • For metadata (schema, DDLs), leverage Lakebridge Analyzer and Converter.

  • For data movement, use Parquet via cloud storage as the most compatible path.

  • Automation: for many tables, script the process or use Databricks batch jobs for efficiency (see the sketch after this list).

  • After migration, update your BI or analytics tools to point to Databricks tables/views.
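
A sketch of what that scripting could look like for the batch load step; the table list, bucket path, and target schema are placeholders:

    python
    # Drive the Parquet-to-Delta load for a list of tables; names and paths are placeholders.
    tables = ["customers", "orders", "line_items"]
    for t in tables:
        (spark.read.parquet(f"s3://your-bucket/export/{t}/")
              .write.format("delta")
              .mode("overwrite")
              .saveAsTable(f"main.migrated.{t}"))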

This approach gives you a robust, auditable pipeline whose outputs can be checked against the source, which is exactly what you need before advancing to the reconciliation phase.
