VZLA
Databricks Employee

@jorperort thanks for your question!

To implement the Write-Audit-Publish (WAP) pattern in Databricks using Unity Catalog, workflows, and SQL notebooks, follow these steps:

  1. Set Up Unity Catalog: Make sure the workspace is attached to a Unity Catalog metastore, so the staging and production data are governed in one place.
  2. Create Schemas: Use separate schemas for staging and production to manage data lifecycle:
    • CREATE SCHEMA IF NOT EXISTS staging;
    • CREATE SCHEMA IF NOT EXISTS production;
  3. Develop SQL Notebooks: Write a SQL notebook for each stage of the pipeline:
    • Ingest data into staging.
    • Validate and transform the data.
    • Publish to production.
  4. Automate with Workflows: Create a Databricks Workflows job that runs the notebooks as dependent tasks in sequence (ingest, validate, transform, publish), so a failed validation task stops the publish task from running.
  5. Follow WAP Steps:
    • Write: Load raw data into the staging schema.
    • Audit: Validate and transform data within staging.
    • Publish: Move validated data to production.
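
As a rough illustration of the steps above, here is a minimal sketch of what the three stages could look like in SQL. The catalog `main`, the source table `main.landing.orders_raw`, the table name `orders`, the column `order_id`, and the two checks are all hypothetical placeholders, so adapt them to your own data. In a real job each stage would normally live in its own notebook task.

```sql
-- Hypothetical names throughout: catalog `main`, source `main.landing.orders_raw`,
-- table `orders`, column `order_id`. None of these come from the original question.
USE CATALOG main;

CREATE SCHEMA IF NOT EXISTS staging;
CREATE SCHEMA IF NOT EXISTS production;

-- Write: load the new batch into the staging schema.
CREATE OR REPLACE TABLE staging.orders AS
SELECT * FROM main.landing.orders_raw;

-- Audit: each statement raises an error if its check fails,
-- which fails the notebook task it runs in.
SELECT assert_true(count(*) > 0, 'staging.orders is empty')
FROM staging.orders;

SELECT assert_true(count_if(order_id IS NULL) = 0, 'staging.orders has NULL order_id values')
FROM staging.orders;

-- Publish: replace the production table with the audited data.
CREATE OR REPLACE TABLE production.orders AS
SELECT * FROM staging.orders;
```

If the audit statements sit in their own task and the publish statement sits in a downstream task, a failed `assert_true` keeps unvalidated data out of production. Replacing the whole production table is only one way to publish. Depending on your tables, a `MERGE INTO` or `INSERT INTO` may fit better.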
