To migrate tables and views from Snowflake (source) to Databricks (target) using Lakebridge, you export the data from Snowflake to a supported cloud storage location (typically as Parquet files) and then load those files into Databricks Delta tables. Lakebridge streamlines the surrounding work, particularly code conversion, schema translation, and reconciliation, while the physical movement of large data volumes is done by staging to cloud storage and loading into Databricks.
Step-by-Step Data Migration Process
1. Export Data from Snowflake
- Use the Snowflake COPY INTO command to export your tables as Parquet files to a cloud storage bucket (S3, ADLS, or GCS). Example:
-- Unloading directly to a cloud URL requires a storage integration or credentials
COPY INTO 's3://your-bucket/path/'
FROM my_database.my_schema.my_table
FILE_FORMAT = (TYPE = PARQUET COMPRESSION = SNAPPY);
- For large tables, partition the export for efficiency (e.g., partition by a date column).
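If you prefer to drive the export from Python rather than a Snowflake worksheet, the sketch below runs a partitioned unload through the snowflake-connector-python package. The account, credentials, stage path, storage integration name, and partition column are placeholders, not values Lakebridge provides:

import snowflake.connector

# Placeholder connection details -- substitute your own account and credentials
conn = snowflake.connector.connect(
    account="your_account",
    user="migration_user",
    password="********",
    warehouse="MY_WH",
    database="MY_DATABASE",
    schema="MY_SCHEMA",
)

# Unload the table as Snappy-compressed Parquet, partitioned by a date column,
# through a pre-created storage integration (hypothetical name below)
conn.cursor().execute("""
    COPY INTO 's3://your-bucket/export/my_table/'
    FROM my_database.my_schema.my_table
    PARTITION BY ('order_date=' || TO_VARCHAR(order_date))
    STORAGE_INTEGRATION = my_s3_integration
    FILE_FORMAT = (TYPE = PARQUET COMPRESSION = SNAPPY)
    HEADER = TRUE
""")
conn.close()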
2. Set Up Cloud Storage Access in Databricks
- Ensure Databricks has access to your storage location by configuring the necessary credentials and permissions.
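As a minimal sketch, the snippet below wires S3 access keys from a Databricks secret scope into the Spark session; the scope and key names are placeholders. On newer workspaces, Unity Catalog external locations or instance profiles are the preferred way to grant this access:

# Pull storage credentials from a Databricks secret scope (placeholder names)
access_key = dbutils.secrets.get(scope="migration", key="aws-access-key")
secret_key = dbutils.secrets.get(scope="migration", key="aws-secret-key")

# Make the keys available to Spark for reading the export path
spark.conf.set("fs.s3a.access.key", access_key)
spark.conf.set("fs.s3a.secret.key", secret_key)

# Sanity check: the export path should now be listable
display(dbutils.fs.ls("s3://your-bucket/path/"))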
3. Load Data into Databricks Delta Tables
- Use Databricks notebooks or workflows to create Delta tables and load data from the Parquet files:
# Read the staged Parquet files and write them out as a Delta table
df = spark.read.format("parquet").load("s3://your-bucket/path/")
df.write.format("delta").mode("overwrite").save("/mnt/delta/target_table")
- For continuous or streaming loads, use Auto Loader or Databricks workflows for incremental or live updates.
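A minimal Auto Loader sketch for incrementally ingesting newly arriving Parquet files is shown below; the schema and checkpoint locations and the target table name are placeholders:

# Incrementally ingest new Parquet files from the export path with Auto Loader
(
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", "s3://your-bucket/_schemas/my_table/")
    .load("s3://your-bucket/path/")
    .writeStream
    .option("checkpointLocation", "s3://your-bucket/_checkpoints/my_table/")
    .trigger(availableNow=True)  # process all available files, then stop
    .toTable("main.migrated.my_table")
)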
4. Migrate and Create Views/SQL Logic
- After the data is migrated, convert your Snowflake views and SQL queries with Lakebridge’s Converter. Validate the translated SQL, then deploy the scripts in Databricks as new views.
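Once the Converter has produced Databricks-compatible SQL, deploying a view is a single statement; the view, table, and column names below are illustrative, and the actual definition should come from the converted output:

# Deploy a converted view definition (illustrative names and SQL)
spark.sql("""
    CREATE OR REPLACE VIEW main.migrated.v_daily_sales AS
    SELECT order_date, SUM(amount) AS total_amount
    FROM main.migrated.orders
    GROUP BY order_date
""")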
5. Reconciliation Preparation
- Once data and views are migrated, use Lakebridge’s reconciliation tools to compare row counts, aggregates, and schemas between Snowflake and Databricks to ensure fidelity.
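Alongside Lakebridge’s reconcile tooling, a quick spot check of row counts and a simple aggregate can be run from a notebook, reading Snowflake through the connector available on Databricks; the connection options, table, and column names below are placeholders:

# Read the source table from Snowflake (placeholder connection options)
sf_options = {
    "sfUrl": "youraccount.snowflakecomputing.com",
    "sfUser": "migration_user",
    "sfPassword": dbutils.secrets.get(scope="migration", key="snowflake-password"),
    "sfDatabase": "MY_DATABASE",
    "sfSchema": "MY_SCHEMA",
    "sfWarehouse": "MY_WH",
}
src = (spark.read.format("snowflake").options(**sf_options)
       .option("dbtable", "MY_TABLE").load())

# Read the migrated Delta table and compare basic metrics
tgt = spark.read.format("delta").load("/mnt/delta/target_table")
print("source rows:", src.count(), "target rows:", tgt.count())
print("source amount sum:", src.agg({"AMOUNT": "sum"}).collect()[0][0])
print("target amount sum:", tgt.agg({"amount": "sum"}).collect()[0][0])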
Key Reminders
- For metadata (schemas, DDL), leverage the Lakebridge Analyzer and Converter.
- For data movement, use Parquet via cloud storage as the most broadly compatible path.
- Automation: for many tables, script the process or employ Databricks batch jobs for efficiency (see the sketch after this list).
- After migration, update your BI and analytics tools to point to the Databricks tables and views.
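For the automation reminder above, here is a minimal sketch that loops over a list of tables, assuming one Parquet prefix per exported table and a simple target naming convention (both assumptions, not Lakebridge features):

# Batch-load several exported tables into Delta (placeholder names and paths)
tables = ["customers", "orders", "line_items"]

for t in tables:
    (
        spark.read.parquet(f"s3://your-bucket/export/{t}/")
        .write.format("delta")
        .mode("overwrite")
        .saveAsTable(f"main.migrated.{t}")
    )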
This approach gives you a robust, auditable pipeline that supports the validation needed for accurate migration outcomes before you advance to the reconciliation phase.