Hi @abhijit007
For a Redshift --> Databricks migration, Lakebridge is designed to automate the code and metadata side of the migration and to help you validate results on Databricks. Lakebridge does not copy data out of Redshift itself; data movement is typically handled via Databricks Lakeflow, native connectors, or cloud data-migration tools, with Lakebridge used to profile the estate up front and to validate the data on Databricks once it has landed. Lakebridge can scan your Redshift SQL and objects, estimate migration effort, and convert a large portion of SQL and DDL into Databricks SQL, surfacing what needs manual review. Its focus is the SQL and ETL logic inside pipelines; orchestrators (e.g., Airflow, Step Functions, other schedulers) are typically re-implemented on, or integrated with, Databricks Lakeflow Jobs as part of the overall migration plan.
Lakebridge supports most of the technical migration lifecycle: discovery & assessment (Profiler + Analyzer to inventory objects, classify complexity, and estimate effort), conversion (automated SQL/ETL conversion using a combination of deterministic converters and LLM-based translation for more complex patterns), and reconciliation of schemas and data (row counts, aggregates, column checks) between Redshift and Databricks to build confidence before the cut-over. It does not replace overall project management, cut-over planning, or data-movement plumbing; it plugs into those processes.
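To make the reconciliation idea concrete, here is a minimal PySpark sketch of the kind of row-count, aggregate, and column checks that step performs. The JDBC URL, table names, and column are placeholders (not Lakebridge configuration), and in practice Lakebridge's reconcile module generates and runs these comparisons for you.

```python
# Illustrative reconciliation sketch only; connection details, table names,
# and columns below are placeholders. Assumes the Redshift JDBC driver is
# available on the cluster (it ships with Databricks Runtime).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Source: the original table read from Redshift over JDBC.
redshift_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:redshift://<host>:5439/<db>?user=<user>&password=<pw>")
    .option("dbtable", "public.orders")
    .load()
)

# Target: the migrated Delta table in the Databricks lakehouse.
databricks_df = spark.table("main.migrated.orders")

# 1. Row-count check.
src_count, tgt_count = redshift_df.count(), databricks_df.count()
print(f"row counts match: {src_count == tgt_count} ({src_count} vs {tgt_count})")

# 2. Aggregate check on a numeric column.
src_sum = redshift_df.agg(F.sum("order_amount")).first()[0]
tgt_sum = databricks_df.agg(F.sum("order_amount")).first()[0]
print(f"order_amount totals match: {src_sum == tgt_sum}")

# 3. Column/schema check.
print(f"columns match: {set(redshift_df.columns) == set(databricks_df.columns)}")
```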
From a licensing and cost perspective, Lakebridge is a free Databricks migration tool. There is no license cost or consumption-based fee for using it. Parts of it (like Profiler and Reconcile) are open source. The Analyzer and converters are free but not open-sourced.
For Redshift to Databricks, we generally recommend:
- Use the Redshift to Databricks Migration Guide to structure the project into clear phases (discovery, assessment, design, migration, validation).
- Run Lakebridge to profile and assess the Redshift estate, then use its conversion capabilities to translate as much SQL/ETL as possible and highlight what needs manual refactoring.
- Use your preferred data-movement tooling to land data in the lakehouse (a simple example is sketched after this list), then use Lakebridge reconciliation (and optionally partner tools like Datafold) to verify that the migrated workloads are correct before cut-over.
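As a simple illustration of the data-movement step, the sketch below lands one Redshift table as a Delta table using the Redshift connector available in Databricks Runtime. The cluster endpoint, S3 temp directory, IAM role, and table names are placeholders; larger estates usually rely on Lakeflow connectors or bulk UNLOAD-to-S3 pipelines rather than table-by-table reads like this.

```python
# Illustrative sketch of landing one Redshift table as Delta. All connection
# details, the S3 temp directory, IAM role, and table names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

orders = (
    spark.read.format("redshift")
    .option("url", "jdbc:redshift://<cluster>.redshift.amazonaws.com:5439/<db>")
    .option("dbtable", "public.orders")
    .option("tempdir", "s3a://<bucket>/redshift-unload/")   # staging area for UNLOAD
    .option("aws_iam_role", "arn:aws:iam::<account>:role/<redshift-s3-role>")
    .load()
)

# Write into the lakehouse as a Delta table, ready for Lakebridge reconciliation.
orders.write.format("delta").mode("overwrite").saveAsTable("main.migrated.orders")
```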
Hope this helps.
If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.
Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***