Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

IBM DataStage to Databricks Migration

Hari_P
Databricks Partner

Hi All,

We are currently exploring a use case involving migration from IBM DataStage to Databricks. I noticed that LakeBridge supports automated code conversion for this process. If anyone has experience using LakeBridge, could you please share any best practices or lessons learned?


Additionally, I would appreciate any insights on the business value proposition—particularly regarding potential cost savings—associated with moving from IBM DataStage to Databricks.
Thank you in advance for your input.

11 REPLIES

SebastianRowan
Contributor

Lakebridge can automate much of the ETL conversion, speeding migration and reducing labor and runtime costs.

Hari_P
Databricks Partner

Thank you for your response. Do you know if there is any documentation on how much of the code it converts, what the limitations of the conversion are, etc.?

Echoes
New Contributor II

Hi @Hari_P, have you finished the migration? Which tool did you use?

Kevin8
New Contributor III

Hi @Echoes @Hari_P @SebastianRowan, you can use the Travinto Technologies tool; their conversion ratio is 95-100%.

thelogicplus
Contributor II

Hi @Kevin8, check the articles on thelogicplus; they have very good articles and automation help for DataStage to Databricks migration.

pradeep_singh
Contributor III

Lakebridge does a decent job, but it won’t fix all your conversion problems. It can achieve about 70–80% conversion accuracy, depending on the complexity of your jobs. The default configuration it uses is designed for the most common scenarios and patterns, so it may not handle some of the patterns found in the DataStage jobs you plan to convert.
However, it does provide a strong starting point, especially since it can successfully convert large jobs where other LLMs fail due to context limitations. You’ll still need to clean up the converted code, test it, fix issues, and apply patches for anything it missed. Be sure to allocate extra time and effort for refactoring the converted code.
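To make the "test it, fix issues" step concrete, here is a minimal sketch (my own illustration, not part of Lakebridge) of an order-insensitive reconciliation check between a legacy DataStage output extract and the migrated Databricks output. The file paths and CSV format are hypothetical; adapt to however you export your job outputs.

```python
import csv
import hashlib

def table_fingerprint(path: str) -> tuple[int, str]:
    """Return (row_count, order-insensitive digest) for a CSV extract."""
    row_count = 0
    digest = 0
    with open(path, newline="") as f:
        for row in csv.reader(f):
            row_count += 1
            # XOR the per-row hashes so the result ignores row order
            h = hashlib.sha256("|".join(row).encode()).hexdigest()
            digest ^= int(h, 16)
    return row_count, format(digest, "064x")

def reconcile(legacy_path: str, migrated_path: str) -> bool:
    """True if both extracts contain the same rows (in any order)."""
    return table_fingerprint(legacy_path) == table_fingerprint(migrated_path)
```

Running a check like this per converted job catches silent data drift early, before you sign off on the migration.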

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

pradeep_singh
Contributor III

If you want to get really nerdy and tune Lakebridge for better accuracy, you can build and supply your own custom configuration:
https://databrickslabs.github.io/lakebridge/docs/transpile/pluggable_transpilers/bladebridge/bladebr...
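Purely as an illustration of the idea (the keys below are hypothetical and not the actual BladeBridge schema; the linked docs define the real format), a custom configuration conceptually maps source-stage patterns to target code:

```json
{
  "_comment": "Hypothetical sketch only, not the real BladeBridge schema",
  "source": "datastage",
  "target": "databricks-pyspark",
  "custom_patterns": [
    {
      "stage_type": "PxSequentialFile",
      "emit": "spark.read.option('header', True).csv('{file_path}')"
    }
  ]
}
```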


Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

pradeep_singh
Contributor III

Or you can involve the Lakebridge folks and ask them to help you create a custom configuration based on your scenarios.

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

pradeep_singh
Contributor III

You can also use Claude Code pointing to LLM endpoints hosted on Databricks to clean up the converted code, build the modules Lakebridge couldn't convert, or just ask it to summarize your DataStage jobs, etc.
https://medium.com/@dbxdev/turning-databricks-into-an-ai-pair-programmer-with-claude-powered-coding-...
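As a rough sketch of the summarization idea (none of this is from the thread: the workspace URL, endpoint name, and file path are placeholders, and it assumes a Databricks token in `DATABRICKS_TOKEN` and `pip install openai`), you can call a Databricks-hosted model through its OpenAI-compatible serving API:

```python
import os

def build_prompt(job_export: str, max_chars: int = 8000) -> str:
    """Trim a large DataStage job export so it fits the model's context."""
    return (
        "Summarize this IBM DataStage job: list its stages, the data flow "
        "between them, and any transformation logic.\n\n"
        + job_export[:max_chars]
    )

def summarize_job(path: str) -> str:
    # Requires `pip install openai`; Databricks model serving exposes an
    # OpenAI-compatible API under /serving-endpoints.
    from openai import OpenAI
    client = OpenAI(
        base_url="https://<your-workspace>/serving-endpoints",  # placeholder
        api_key=os.environ["DATABRICKS_TOKEN"],
    )
    with open(path) as f:
        prompt = build_prompt(f.read())
    resp = client.chat.completions.create(
        model="databricks-meta-llama-3-3-70b-instruct",  # example endpoint name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

The same pattern works for cleanup prompts: feed the converted PySpark plus the original DataStage export and ask the model to flag mismatches.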

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

Hi @pradeep_singh, I have used Lakebridge, but it converted only 25-30% of the jobs. We struggled a lot because of that, but when we used Travinto we converted 80-95% of the jobs, which saved a lot of time.

thelogicplus
Contributor II

@pradeep_singh check the rankings on this website for the best DataStage to Databricks migration tools.