Data Engineering

Lakebridge Synapse Conversion to DBX and Custom transpiler

viralpatel
New Contributor II

I have two questions about the Lakebridge solution:

  • Synapse dedicated pool conversion
    • We were running a PoC for a Synapse-to-DBX migration using Lakebridge and observed that the conversions are not correct. I expected at least the tables to convert cleanly, but that is not the case; see the source, the converted output, and a hand-corrected sketch of what I expected below. For complex stored procedures converted to notebooks I expected manual rework, but you can correct me on that.
Synapse:
CREATE TABLE [extra1].[dimension_City2] (
    [City Key]                   INT            NOT NULL,
    [WWI City ID]                INT            NULL,
    [City]                       NVARCHAR (255) NULL,
    [Latest Recorded Population] BIGINT         NULL,
    [Valid From]                 DATETIME2 (7)  NULL,
    [Valid To]                   DATETIME2 (7)  NULL
)
WITH (CLUSTERED COLUMNSTORE INDEX, DISTRIBUTION = HASH([City Key]));


GO

Databricks (Lakebridge output):
CREATE OR REPLACE TABLE `extra1`.`dimension_City2` (
    `City Key`                   INT            NOT NULL,
    `WWI City ID`                INT,
    `City` STRING,
    `Latest Recorded Population` BIGINT,
    `Valid From`                 TIMESTAMP  ,
    `Valid To`                   TIMESTAMP  
)
WITH(CLUSTERED COLUMNSTORE INDEX, DISTRIBUTION = HASH(`City Key`));
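
The WITH (CLUSTERED COLUMNSTORE INDEX, DISTRIBUTION = HASH(...)) clause in that output is not valid Databricks SQL, so the converted DDL fails as-is. Roughly what I would expect instead is a hand-corrected sketch like the one below (not Lakebridge output; the CLUSTER BY clause is only one assumed way to map the hash distribution, a plain Delta table without it also works):

-- Hand-corrected sketch, not Lakebridge output: Databricks SQL has no
-- CLUSTERED COLUMNSTORE INDEX or DISTRIBUTION table options, so the WITH
-- clause is dropped. CLUSTER BY is an assumed stand-in for the hash
-- distribution; omitting it entirely is also valid.
CREATE OR REPLACE TABLE `extra1`.`dimension_City2` (
    `City Key`                   INT NOT NULL,
    `WWI City ID`                INT,
    `City`                       STRING,
    `Latest Recorded Population` BIGINT,
    `Valid From`                 TIMESTAMP,
    `Valid To`                   TIMESTAMP
)
CLUSTER BY (`City Key`);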

  • Note: for source_dialect we have tried both mssql and synapse. Below is one of the configs generated after following the documented steps.
catalog_name: remorph
error_file_path: ./another_try/errors.log
input_source: ./mssql/DatabaseProjectsqlpool-dwh
output_folder: ./another_try/output
schema_name: transpiler
skip_validation: true
source_dialect: mssql
transpiler_config_path: ./.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml
transpiler_options:
  overrides-file: Bladebridge
version: 3
  • Custom transpiler: I am thinking of building a custom transpiler setup. Can you share where I can find documentation on how to start building one and what I need to take into account?

Documentation referenced from the official link.

2 REPLIES

szymon_dybczak
Esteemed Contributor III

Hi @viralpatel ,

Yep, Lakebridge is not perfect so far. There are several open issues on GitHub related to Synapse and Lakebridge.
Regarding your second question, I guess you can try to build your own custom transpiler setup. The code is open source and available below:

lakebridge/src/databricks/labs/lakebridge/transpiler at main · databrickslabs/lakebridge

But before you do that, maybe take a look at customizing the Bladebridge transpiler with your own custom rules. The Bladebridge transpiler relies heavily on rules defined in configuration files provided with the converter. These configurations consist of a set of layered JSON files and code templates that drive the generation of output files and the application of conversion rules.
So you may be able to improve the quality of the translation by defining your own set of custom rules (which should be a lot easier than crafting your own transpiler from scratch). See the docs and the sketch below:

lakebridge/docs/lakebridge/docs/transpile/pluggable_transpilers/bladebridge_configuration.mdx at mai...
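
As a rough illustration (the file name below is hypothetical, and I'm assuming overrides-file takes a path to your own overrides JSON as described in the docs above), wiring custom rules into the existing transpile config could look something like this:

transpiler_config_path: ./.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml
transpiler_options:
  # Hypothetical path to a custom overrides file containing your own
  # conversion rules; see the Bladebridge configuration docs linked above
  # for the expected JSON structure.
  overrides-file: ./my_synapse_overrides.json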

Also, you can raise an issue and describe this bug here:

Issues · databrickslabs/lakebridge

yourssanjeev
New Contributor II

We are also checking on this use case, but we got confirmation from Databricks that it does not work for this use case yet. We are not sure whether it is on their roadmap.
