I've been working on data ingestion from SQL Server to Unity Catalog using Lakeflow Connect. Lakeflow Connect actually makes the work easy when everything is set up right, and I'm now trying to incorporate it into a Databricks Asset Bundle (DAB). That works fine as long as I list the schemas and tables explicitly under 'object' entries in 'ingestion_definition'.

But what if someone wants to clean the table names before ingesting them into UC? For now, 'object' only allows 'report', 'table', and 'schema'. If I create a Python file or notebook that gets the table names from SQL Server using 'source_schema' and then cleans/modifies them, I have no way to feed that list back into the pipeline dynamically. I understand that DAB doesn't allow runtime generation/modification of bundle files, but I would love to have a way to do something like the below:
pipeline_sqlserver:
  name: sqlserver-ingestion-pipeline
  ingestion_definition:
    ingestion_gateway_id: ${resources.pipelines.gateway.id}
    objects: []
    library:
      file: .yml/.json/any other format
Link to Ingestion documentation: Ingest data from SQL Server | Databricks on AWS
The idea: give 'objects' an empty list and have the pipeline pick up the table names from a YAML file, JSON file, or any other format that a preprocessing script can save them into.
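For reference, the closest workaround I've come up with so far is to treat the bundle YAML itself as the generated file: a preprocessing script fetches the table names from SQL Server, cleans them, and rewrites the 'objects' list in the pipeline resource file right before 'databricks bundle deploy' runs. Below is a minimal sketch of that idea; it is not a supported Lakeflow Connect feature, and the connection string, file paths, cleaning rule, and resource layout are all placeholder assumptions of mine.

import re
import pyodbc
import yaml

SOURCE_SCHEMA = "dbo"                             # placeholder source schema
BUNDLE_FILE = "resources/sqlserver_pipeline.yml"  # placeholder resource file

# 1. Get the table names from SQL Server (placeholder connection string).
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=my-sqlserver.example.com;DATABASE=mydb;UID=user;PWD=secret"
)
cursor = conn.cursor()
cursor.execute(
    "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES "
    "WHERE TABLE_SCHEMA = ? AND TABLE_TYPE = 'BASE TABLE'",
    SOURCE_SCHEMA,
)
raw_names = [row[0] for row in cursor.fetchall()]

# 2. Clean the names (example rule: snake_case, strip odd characters).
def clean(name: str) -> str:
    return re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_").lower()

# 3. Build the 'objects' list and splice it into the bundle YAML.
#    I'm assuming 'destination_table' is accepted here, per the table spec
#    in the managed ingestion pipeline API.
objects = [
    {
        "table": {
            "source_schema": SOURCE_SCHEMA,
            "source_table": name,
            "destination_table": clean(name),
        }
    }
    for name in raw_names
]

# Assuming the resource file nests the pipeline under resources.pipelines,
# as bundle resource files normally do.
with open(BUNDLE_FILE) as f:
    bundle = yaml.safe_load(f)
pipeline = bundle["resources"]["pipelines"]["pipeline_sqlserver"]
pipeline["ingestion_definition"]["objects"] = objects
with open(BUNDLE_FILE, "w") as f:
    yaml.safe_dump(bundle, f, sort_keys=False)

Running this in CI right before 'databricks bundle deploy' keeps the bundle static from DAB's point of view while still letting the table list come from the live SQL Server schema. Not as clean as a 'library: file:' option, but it stays within what DAB supports today.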
If anyone has already faced the same issue and has a solution to it, I would greatly appreciate it if you could share it here. Thanks!