Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to Use BladeBridge for Redshift to Databricks Migration?

Akshay_Petkar
Contributor III

Hi all,

I have Redshift queries that I need to migrate to Databricks using BladeBridge, but I have never used BladeBridge before and can't find any clear documentation or steps on how to use it within the Databricks environment.

If anyone has already implemented BladeBridge for Redshift (or any other warehouse) to Databricks conversion, I'd really appreciate it if you could share:

  • Your experience or approach

  • Any documentation or links that helped you get started

Thanks in advance for your support!

Akshay Petkar
2 REPLIES

lingareddy_Alva
Honored Contributor III

Hi @Akshay_Petkar 

Migrating Amazon Redshift SQL to Databricks (especially Delta Lake or Unity Catalog-backed systems) using BladeBridge is a practical yet less-documented use case. Since BladeBridge is a commercial tool with limited public documentation, here's a consolidated response based on real-world usage patterns, typical migration steps, and best practices gathered from enterprise implementations.

BladeBridge for Redshift to Databricks Migration
BladeBridge operates as a code translation framework, helping automate SQL/ETL conversions with configurable rule engines.
For Redshift to Databricks (SQL/Delta/Unity Catalog), this often means:
- Parsing Redshift SQL (DDL, DML, Views, Functions)
- Translating syntax, types, and warehouse-specific constructs to Spark SQL / Databricks SQL (a DDL sketch follows this list)
- Packaging output as Databricks notebooks, dbt models, or SQL scripts.
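To make that concrete, here is a hand-written sketch (not actual BladeBridge output) of a Redshift DDL and a plausible Databricks SQL equivalent; the table and column names are invented:

-- Redshift source DDL (hypothetical):
-- CREATE TABLE sales.orders (
--     order_id   BIGINT IDENTITY(1,1),
--     status     VARCHAR(20) ENCODE lzo,
--     payload    SUPER,
--     created_at TIMESTAMP
-- )
-- DISTKEY (order_id)
-- SORTKEY (created_at);

-- Databricks SQL equivalent: DISTKEY/SORTKEY/ENCODE have no direct
-- counterpart in Delta Lake, and SUPER needs a semi-structured type.
CREATE TABLE sales.orders (
    order_id   BIGINT GENERATED ALWAYS AS IDENTITY,
    status     STRING,
    payload    VARIANT,   -- needs a recent runtime; otherwise a STRING holding JSON
    created_at TIMESTAMP
)
USING DELTA;

-- Sort-key-style data skipping can be approximated with liquid clustering:
ALTER TABLE sales.orders CLUSTER BY (created_at);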

Step-by-Step Migration Flow:
Step 1: Set Up BladeBridge
- Get access to the BladeBridge environment (either via your enterprise license or BladeBridge-managed services).
- Work with BladeBridge support to enable Redshift as source and Databricks (Delta Lake/Spark SQL) as the target.

Step 2: Extract Redshift Code
- Use BladeBridge's metadata extractor or CLI to scan your Redshift warehouse (as a fallback, a manual inventory query is sketched after this list).
- This typically includes:
1. Stored procedures
2. UDFs
3. Views
4. Complex SQL queries
5. ETL control logic (if embedded)
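If you want a quick manual inventory before (or instead of) running the extractor, Redshift's Postgres-style catalog views can list user-defined views and their SQL bodies; a minimal example, assuming default catalog permissions:

-- List user-defined views and their definitions in Redshift.
SELECT schemaname,
       viewname,
       definition
FROM   pg_views
WHERE  schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER  BY schemaname, viewname;

Stored procedures and UDFs can be inventoried similarly from the relevant catalog views.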

Step 3: Define Mapping Rules
BladeBridge uses a rule-based translation engine. You or the BladeBridge team will:
- Map Redshift-specific functions and constructs (e.g., DISTINCT ON, ENCODE, STL_* system tables) to Databricks-compatible alternatives (a DISTINCT ON rewrite is sketched after this list).
- Handle data type conversions (SUPER, GEOMETRY, etc. → struct/JSON or compatible formats).
- Replace Redshift-specific syntax with Spark/Databricks equivalents.
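As an illustration of the kind of rule involved: Redshift's DISTINCT ON has no direct Spark SQL equivalent and is typically rewritten with a window function. A hand-written sketch with made-up table and column names:

-- Redshift: latest event per user.
-- SELECT DISTINCT ON (user_id) user_id, event_type, event_ts
-- FROM   events
-- ORDER  BY user_id, event_ts DESC;

-- Databricks SQL rewrite using ROW_NUMBER():
SELECT user_id, event_type, event_ts
FROM (
    SELECT user_id,
           event_type,
           event_ts,
           ROW_NUMBER() OVER (PARTITION BY user_id
                              ORDER BY event_ts DESC) AS rn
    FROM events
)
WHERE rn = 1;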

Step 4: Generate Target Code
- BladeBridge will generate:
1. Spark SQL / Databricks SQL scripts
2. Optional: PySpark or Scala code if procedural logic needs translation
3. Notebooks (.dbc or .ipynb)
4. dbt-compatible models (if configured; a minimal example follows this list)
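For reference, a converted query packaged as a dbt model is just a SQL file with a config header. A minimal, hypothetical example (model, source, and column names invented):

-- models/marts/daily_orders.sql (hypothetical dbt model)
{{ config(materialized='table', file_format='delta') }}

SELECT order_date,
       COUNT(*)         AS order_count,
       SUM(order_total) AS revenue
FROM   {{ source('sales', 'orders') }}
GROUP  BY order_date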

Step 5: Validation & QA
- BladeBridge offers data diffing / validation capabilities to compare Redshift and Databricks output.
- Integrate with Great Expectations or Delta Live Tables expectations if needed.
- Unit tests and volume-based data checks are essential post-conversion (a minimal symmetric-diff query is sketched below).
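One simple content check, assuming both the migrated table and a snapshot of the Redshift data (for example, loaded via an S3 unload or Lakehouse Federation) are queryable from Databricks; the table names here are hypothetical:

-- Rows present in one table but not the other; an empty result means
-- the two tables agree exactly (for comparable column sets).
(SELECT * FROM redshift_snapshot.sales.orders
 EXCEPT
 SELECT * FROM main.sales.orders)
UNION ALL
(SELECT * FROM main.sales.orders
 EXCEPT
 SELECT * FROM redshift_snapshot.sales.orders);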

Step 6: Deployment
- Load converted code into Databricks (via Workspace API, Git sync, or notebooks).
- Use Databricks Jobs or Workflows to orchestrate converted SQL pipelines.
- Set up access permissions if you're using Unity Catalog (example GRANT statements below).
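If Unity Catalog is the target, permissions for the converted objects are plain SQL GRANTs; a sketch with an invented group and schema name:

-- Let an analyst group query everything in the migrated schema.
GRANT USE CATALOG ON CATALOG main      TO `analysts`;
GRANT USE SCHEMA  ON SCHEMA main.sales TO `analysts`;
GRANT SELECT      ON SCHEMA main.sales TO `analysts`;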


Ask BladeBridge for:
- Redshift โ†’ Databricks Conversion Mapping Guide
- Rule Engine Customization Manual
- CLI/SDK Usage Docs
- Check with your Databricks TAM (if enterprise); they often co-pilot BladeBridge-based migrations.

My Suggestion:
If this is your first time, request BladeBridge to:
- Do a pilot migration of 10–20 complex queries
- Provide documentation for custom rules
- Clarify the translation logic visibility so you can tune it in-house later

LR

ddharma
New Contributor II

Dear @lingareddy_Alva ,

Thank you so much for sharing these steps & specifics. Much appreciated!

Context:

I have just started exploring BladeBridge for AWS Redshift to Databricks migration.

"BladeBridge operates as a code translation framework" and it supports and provides many other activities as part of the e2e data migration process.

The reconcile step in the official documentation shows reconciliation between source and target.

But there is NO mention of the actual data flow / data migration / Redshift UNLOAD, etc., anywhere.

Question:

Is the actual data movement or I/O from source (Redshift) to target (Databricks) NOT handled by BladeBridge out of the box (OOB)?

Thanks & Regards,

Dileep 
