Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to Use BladeBridge for Redshift to Databricks Migration?

Akshay_Petkar
Contributor III

Hi all,

I have Redshift queries that I need to migrate to Databricks using BladeBridge, but I have never used BladeBridge before and can't find any clear documentation or steps on how to use it within the Databricks environment.

If anyone has already implemented BladeBridge for Redshift (or any other warehouse) to Databricks conversion, I'd really appreciate it if you could share:

  • Your experience or approach

  • Any documentation or links that helped you get started

Thanks in advance for your support!

Akshay Petkar
2 REPLIES

lingareddy_Alva
Honored Contributor III

Hi @Akshay_Petkar 

Migrating Amazon Redshift SQL to Databricks (especially Delta Lake or Unity Catalog-backed systems) using BladeBridge is a practical yet less-documented use case. Since BladeBridge is a commercial tool with limited public documentation, here's a consolidated response based on real-world usage patterns, typical migration steps, and best practices gathered from enterprise implementations.

BladeBridge for Redshift to Databricks Migration
BladeBridge operates as a code translation framework, helping automate SQL/ETL conversions with configurable rule engines.
For Redshift to Databricks (SQL/Delta/Unity Catalog), this often means:
- Parsing Redshift SQL (DDL, DML, Views, Functions)
- Translating syntax, types, and warehouse-specific constructs to Spark SQL / Databricks SQL (a DDL sketch follows this list)
- Packaging output as Databricks notebooks, dbt models, or SQL scripts.
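To make that concrete, here is a hand-written sketch (not actual BladeBridge output) of a Redshift DDL and a plausible Databricks SQL equivalent; the table and column names are invented:

-- Redshift source DDL (hypothetical):
-- CREATE TABLE sales.orders (
--     order_id   BIGINT IDENTITY(1,1),
--     status     VARCHAR(20) ENCODE lzo,
--     payload    SUPER,
--     created_at TIMESTAMP
-- )
-- DISTKEY (order_id)
-- SORTKEY (created_at);

-- Databricks SQL equivalent: DISTKEY/SORTKEY/ENCODE have no direct
-- counterpart in Delta Lake, and SUPER needs a semi-structured type.
CREATE TABLE sales.orders (
    order_id   BIGINT GENERATED ALWAYS AS IDENTITY,
    status     STRING,
    payload    VARIANT,   -- needs a recent runtime; otherwise a STRING holding JSON
    created_at TIMESTAMP
)
USING DELTA;

-- Sort-key-style data skipping can be approximated with liquid clustering:
ALTER TABLE sales.orders CLUSTER BY (created_at);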

Step-by-Step Migration Flow:
Step 1: Set Up BladeBridge
- Get access to the BladeBridge environment (either via your enterprise license or BladeBridge-managed services).
- Work with BladeBridge support to enable Redshift as source and Databricks (Delta Lake/Spark SQL) as the target.

Step 2: Extract Redshift Code
- Use BladeBridge's metadata extractor or CLI to scan your Redshift warehouse (as a fallback, a manual inventory query is sketched after this list).
- This typically includes:
1. Stored procedures
2. UDFs
3. Views
4. Complex SQL queries
5. ETL control logic (if embedded)
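If you want a quick manual inventory before (or instead of) running the extractor, Redshift's Postgres-style catalog views can list user-defined views and their SQL bodies; a minimal example, assuming default catalog permissions:

-- List user-defined views and their definitions in Redshift.
SELECT schemaname,
       viewname,
       definition
FROM   pg_views
WHERE  schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER  BY schemaname, viewname;

Stored procedures and UDFs can be inventoried similarly from the relevant catalog views.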

Step 3: Define Mapping Rules
BladeBridge uses a rule-based translation engine. You or the BladeBridge team will:
- Map Redshift-specific functions and constructs (e.g., DISTINCT ON, ENCODE, STL_* system tables) to Databricks-compatible alternatives (a DISTINCT ON rewrite is sketched after this list).
- Handle data type conversions (SUPER, GEOMETRY, etc. → struct/JSON or compatible formats).
- Replace Redshift-specific syntax with Spark/Databricks equivalents.
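As an illustration of the kind of rule involved: Redshift's DISTINCT ON has no direct Spark SQL equivalent and is typically rewritten with a window function. A hand-written sketch with made-up table and column names:

-- Redshift: latest event per user.
-- SELECT DISTINCT ON (user_id) user_id, event_type, event_ts
-- FROM   events
-- ORDER  BY user_id, event_ts DESC;

-- Databricks SQL rewrite using ROW_NUMBER():
SELECT user_id, event_type, event_ts
FROM (
    SELECT user_id,
           event_type,
           event_ts,
           ROW_NUMBER() OVER (PARTITION BY user_id
                              ORDER BY event_ts DESC) AS rn
    FROM events
)
WHERE rn = 1;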

Step 4: Generate Target Code
- BladeBridge will generate:
1. Spark SQL / Databricks SQL scripts
2. Optional: PySpark or Scala code if procedural logic needs translation
3. Notebooks (.dbc or .ipynb)
4. dbt-compatible models (if configured; a minimal example follows this list)
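For reference, a converted query packaged as a dbt model is just a SQL file with a config header. A minimal, hypothetical example (model, source, and column names invented):

-- models/marts/daily_orders.sql (hypothetical dbt model)
{{ config(materialized='table', file_format='delta') }}

SELECT order_date,
       COUNT(*)         AS order_count,
       SUM(order_total) AS revenue
FROM   {{ source('sales', 'orders') }}
GROUP  BY order_date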

Step 5: Validation & QA
- BladeBridge offers data diffing / validation capabilities to compare Redshift and Databricks output.
- Integrate with Great Expectations or Delta Live Tables expectations if needed.
- Unit tests and volume-based data checks are essential post-conversion (a minimal symmetric-diff query is sketched below).
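One simple content check, assuming both the migrated table and a snapshot of the Redshift data (for example, loaded via an S3 unload or Lakehouse Federation) are queryable from Databricks; the table names here are hypothetical:

-- Rows present in one table but not the other; an empty result means
-- the two tables agree exactly (for comparable column sets).
(SELECT * FROM redshift_snapshot.sales.orders
 EXCEPT
 SELECT * FROM main.sales.orders)
UNION ALL
(SELECT * FROM main.sales.orders
 EXCEPT
 SELECT * FROM redshift_snapshot.sales.orders);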

Step 6: Deployment
- Load converted code into Databricks (via Workspace API, Git sync, or notebooks).
- Use Databricks Jobs or Workflows to orchestrate converted SQL pipelines.
- Set up access permissions if you're using Unity Catalog (example GRANT statements below).
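If Unity Catalog is the target, permissions for the converted objects are plain SQL GRANTs; a sketch with an invented group and schema name:

-- Let an analyst group query everything in the migrated schema.
GRANT USE CATALOG ON CATALOG main      TO `analysts`;
GRANT USE SCHEMA  ON SCHEMA main.sales TO `analysts`;
GRANT SELECT      ON SCHEMA main.sales TO `analysts`;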


Ask BladeBridge for:
- Redshift โ†’ Databricks Conversion Mapping Guide
- Rule Engine Customization Manual
- CLI/SDK Usage Docs
- Check with your Databricks TAM (if enterprise); they often co-pilot BladeBridge-based migrations.

My Suggestion:
If this is your first time, request BladeBridge to:
- Do a pilot migration of 10–20 complex queries
- Provide documentation for custom rules
- Clarify the translation logic visibility so you can tune it in-house later

LR

ddharma
New Contributor II

Dear @lingareddy_Alva ,

Thank you so much for sharing these steps & specifics. Much appreciated!

Context:

I have just started exploring BladeBridge for AWS Redshift to Databricks migration.

"BladeBridge operates as a code translation framework" and it supports and provides many other activities as part of the e2e data migration process.

The reconcile step in the official documentation shows reconciliation between source and target.

But there is NO mention of the actual data flow / data migration / Redshift UNLOAD, etc., anywhere.

Question:

Is the actual data movement or I/O from source (Redshift) to target (Databricks) NOT handled by BladeBridge out of the box (OOB)?

Thanks & Regards,

Dileep 
