3 weeks ago
Dear community expert,
I have completed two phases of Databricks Lakebridge, Analyzer and Converter, but I am stuck on migrating the data from source to target. I have watched the BrickBites series on Lakebridge but did not find anything on how to migrate data from source to target. I need guidance on how to migrate data such as tables and views from source to target so that I can proceed to the reconciliation phase and validate schema and data. I am using Snowflake as the source, with a few sample tables and queries to migrate to the Databricks platform as the target.
Thank you !!
3 weeks ago
Lakebridge doesn't copy data. It covers Assessment → Conversion (Analyzer/Converter) → Reconciliation.
The fastest way to get the data across is Lakehouse Federation: create a Snowflake connection in Unity Catalog and run federated queries from Databricks. For a permanent migration, materialize the data into Delta with CREATE TABLE AS SELECT.
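A minimal sketch of that flow, run from a Databricks notebook (host, warehouse, credentials, and catalog/schema/table names below are placeholders, so double-check the connection options against the Lakehouse Federation docs for your workspace):

# Sketch only: set up Lakehouse Federation to Snowflake, then materialize into Delta.
# Every name and credential below is a placeholder, not a value from this thread.
spark.sql("""
    CREATE CONNECTION snowflake_conn TYPE snowflake
    OPTIONS (
      host 'myaccount.snowflakecomputing.com',
      port '443',
      sfWarehouse 'COMPUTE_WH',
      user 'migration_user',
      password '<use-a-databricks-secret>'
    )
""")
spark.sql("""
    CREATE FOREIGN CATALOG snowflake_cat
    USING CONNECTION snowflake_conn
    OPTIONS (database 'MY_DATABASE')
""")
# Permanent copy: CTAS from the federated catalog into a Unity Catalog Delta table
spark.sql("""
    CREATE TABLE main.migrated.my_table AS
    SELECT * FROM snowflake_cat.my_schema.my_table
""")

Federation also helps later, because you can query the Snowflake source and the Delta copy side by side from the same workspace during validation.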
3 weeks ago
To migrate tables and views from Snowflake (source) to Databricks (target) alongside Lakebridge, you export the data from Snowflake to supported cloud storage (usually as Parquet files) and then load those files into Databricks Delta tables. Lakebridge simplifies the overall process, especially code and schema conversion and reconciliation, but the physical movement of large data volumes is done by staging to cloud storage and loading into Databricks.
Use the Snowflake COPY INTO <location> command to export your tables as Parquet files to a cloud storage bucket (S3, ADLS, or GCS); Snowflake needs a storage integration or credentials that allow it to write to that location. Example:
COPY INTO 's3://your-bucket/path/'
FROM my_database.my_schema.my_table
FILE_FORMAT = (TYPE = PARQUET COMPRESSION = SNAPPY)
HEADER = TRUE;  -- keep the original column names in the unloaded Parquet files
For large tables, use the PARTITION BY option of COPY INTO (e.g., partition by a date column) so the export is split into smaller, parallel-friendly files.
Ensure Databricks has access to your storage location by configuring the necessary credentials and permissions.
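On the Databricks side this usually means a Unity Catalog storage credential plus an external location for the bucket. A hedged sketch, assuming a storage credential named my_cred already exists and using placeholder names:

# Assumes a storage credential "my_cred" already exists for the bucket (placeholder names).
spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS snowflake_export_loc
    URL 's3://your-bucket/path/'
    WITH (STORAGE CREDENTIAL my_cred)
""")
# Grant read access to the principal or group that runs the load job
spark.sql("GRANT READ FILES ON EXTERNAL LOCATION snowflake_export_loc TO `data_engineers`")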
Use Databricks Notebooks or workflows to create Delta tables and load data from Parquet files:
df = spark.read.format("parquet").load("s3://your-bucket/path/")
df.write.format("delta").save("/mnt/delta/target_table")
For continuous or incremental loads, use Auto Loader or Databricks Workflows to pick up new files as they land.
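If exports keep arriving in the bucket, a minimal Auto Loader sketch looks like this (paths, schema/checkpoint locations, and the table name are placeholders):

# Sketch: incrementally ingest newly arrived Parquet files with Auto Loader (placeholder names).
(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", "s3://your-bucket/_schemas/target_table")
    .load("s3://your-bucket/path/")
    .writeStream
    .option("checkpointLocation", "s3://your-bucket/_checkpoints/target_table")
    .trigger(availableNow=True)  # process everything currently in the bucket, then stop
    .toTable("main.migrated.target_table"))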
After the data is migrated, convert your Snowflake views and SQL queries using Lakebridge's Converter, validate the translated SQL, and deploy the scripts in Databricks as new views.
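Deploying a converted view then amounts to running the translated DDL on Databricks, for example (hypothetical view and table names):

# Hypothetical example of deploying one converted view definition
spark.sql("""
    CREATE OR REPLACE VIEW main.migrated.v_orders_summary AS
    SELECT customer_id, COUNT(*) AS order_count, SUM(amount) AS total_amount
    FROM main.migrated.orders
    GROUP BY customer_id
""")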
Once data and views are migrated, use Lakebridge's reconciliation tools to compare row counts, aggregates, and schemas between Snowflake and Databricks to ensure fidelity.
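Lakebridge's reconcile module automates these comparisons; for a quick manual sanity check of the same kind (not Lakebridge itself, and the federated catalog and table names are placeholders), you can do something like:

# Manual sanity check, not Lakebridge output: compare row counts between source and target
src_count = spark.table("snowflake_cat.my_schema.my_table").count()  # federated Snowflake table
tgt_count = spark.table("main.migrated.my_table").count()            # migrated Delta table
print(f"source={src_count}, target={tgt_count}, match={src_count == tgt_count}")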
For metadata (schema, DDLs), leverage Lakebridge Analyzer and Converter.
For data movement, use Parquet via cloud storage as the most compatible path.
Automation: for many tables, script the process or use Databricks jobs to run the loads in batch; see the sketch below.
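A simple driver loop is often enough; this sketch assumes one Parquet folder per table under a common prefix (all names and paths are placeholders):

# Sketch: load several exported tables in one pass (placeholder names and paths)
tables = ["customers", "orders", "line_items"]
for name in tables:
    src_path = f"s3://your-bucket/exports/{name}/"  # Parquet export folder for this table
    (spark.read.parquet(src_path)
        .write.format("delta")
        .mode("overwrite")
        .saveAsTable(f"main.migrated.{name}"))  # register as a Unity Catalog Delta table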
After migration, update your BI or analytics tools to point to Databricks tables/views.
This approach gives you a robust, auditable pipeline and sets you up to validate schema and data accurately in the reconciliation phase.