12-30-2024 08:35 AM
@jorperort thanks for your question!
To implement the Write-Audit-Publish (WAP) pattern in Databricks using Unity Catalog, workflows, and SQL notebooks, follow these steps:
- Set Up Unity Catalog: Configure Unity Catalog for unified governance across data assets.
- Create Schemas: Use separate schemas for staging and production to manage data lifecycle:
CREATE SCHEMA IF NOT EXISTS staging;
CREATE SCHEMA IF NOT EXISTS production;
- Develop SQL Notebooks: Write a SQL notebook for each stage of the pipeline:
  - Ingest data into staging.
  - Validate and transform the data.
  - Publish to production.
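- Automate with Workflows: Set up a Databricks Workflows job that runs the notebooks in sequence (ingest, validate, transform, publish), with each task depending on the previous one so that a failed validation stops the run before anything is published.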
- Follow WAP Steps:
  - Write: Load raw data into the staging schema.
  - Audit: Validate and transform data within staging.
  - Publish: Move validated data to production.
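As a rough illustration of the three WAP steps, here is a minimal sketch in Databricks SQL. The catalog `main`, the source table `landing.orders_raw`, the table name `orders`, and the columns `order_id`, `amount`, and `order_date` are all hypothetical placeholders, so swap in your own objects and audit rules. In practice each section would likely live in its own notebook and run as a separate task.

```sql
-- Hypothetical catalog name; staging and production are the schemas created above.
USE CATALOG main;

-- WRITE: load raw data into the staging schema.
-- landing.orders_raw is an assumed source table, not part of the original post.
CREATE OR REPLACE TABLE staging.orders AS
SELECT order_id, amount, order_date
FROM landing.orders_raw;

-- AUDIT: each statement raises an error if its check fails,
-- which fails the notebook task it runs in.
SELECT assert_true(count(*) > 0, 'Audit failed: staging.orders is empty')
FROM staging.orders;

SELECT assert_true(count(*) = 0, 'Audit failed: null order_id or negative amount')
FROM staging.orders
WHERE order_id IS NULL OR amount < 0;

-- PUBLISH: runs only if the audit statements above succeeded.
CREATE OR REPLACE TABLE production.orders AS
SELECT * FROM staging.orders;
```

If the audit and publish sections are split into separate workflow tasks, a failed `assert_true` fails the audit task, and with the default dependency behavior the publish task is then skipped, so production keeps its last validated data.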