a month ago
Anything you can do through the UI can also be done with Databricks Asset Bundles (DABs). If you prefer configuration as code for reproducibility, you can define your multi-table Lakeflow pipelines in YAML. More details for PostgreSQL ingestion here
You can manage Lakeflow Connect pipelines as code using Asset Bundles. For SQL Server, add a few files like the ones below; a similar approach works for other databases.
- Workflow file that controls the frequency of data ingestion (sqlserver.yml).
variables:
  # Common variables used in multiple places in the DAB definition.
  gateway_name:
    default: sqlserver01-gateway-pipeline
  dest_catalog:
    default: main
  dest_schema:
    default: sqlserver01

resources:
  pipelines:
    gateway:
      name: ${var.gateway_name}
      gateway_definition:
        connection_name: rebel
        gateway_storage_catalog: main
        gateway_storage_schema: sqlserver01
        gateway_storage_name: sqlserver01-gateway-pipeline
      catalog: main
      target: sqlserver01

    pipeline_sqlserver:
      name: sqlserver-ingestion-pipeline
      ingestion_definition:
        ingestion_gateway_id: ${resources.pipelines.gateway.id}
        objects:
          - schema:
              # Ingest all tables in the sqlserver01.dbo schema to main.sqlserver01.
              # Each destination table keeps the same name it has on the source.
              source_catalog: sqlserver01
              source_schema: dbo
              destination_catalog: main
              destination_schema: sqlserver01
      target: sqlserver01
      catalog: main

- Pipeline Job definition file (sqlserver_pipeline.yml).
resources:
  jobs:
    sqlserver_dab_job:
      name: sqlserver-ingestion-pipeline job
      trigger:
        periodic:
          interval: 8
          unit: HOURS
      email_notifications:
        on_failure:
          - user email
      tasks:
        - task_key: refresh_pipeline
          pipeline_task:
            pipeline_id: ${resources.pipelines.pipeline_sqlserver.id}
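For the two resource files above to be picked up, the bundle also needs a root databricks.yml that includes them. The following is only a minimal sketch, not something from the original setup: the bundle name, the resources/ folder, the dev target, and the workspace host placeholder are all assumptions you would adjust to your own project.

```yaml
# databricks.yml (bundle root). All names here are illustrative assumptions.
bundle:
  name: sqlserver_ingestion        # hypothetical bundle name

# Assumes sqlserver.yml and sqlserver_pipeline.yml are stored under resources/.
include:
  - resources/*.yml

targets:
  dev:                             # hypothetical target name
    mode: development
    default: true
    workspace:
      host: https://<your-workspace-url>   # placeholder, replace with your workspace
```

With a layout like this, running `databricks bundle validate` followed by `databricks bundle deploy -t dev` from the bundle root should create the gateway, the ingestion pipeline, and the job, and `databricks bundle run -t dev sqlserver_dab_job` should trigger the job once outside its 8-hour schedule.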