12-30-2024 08:30 AM
Good afternoon,
I am looking for documentation on implementing the WAP (Write-Audit-Publish) pattern using Unity Catalog, workflows, SQL notebooks, and any other services needed for this pattern. Could you share documentation, a practical example, or any other patterns or best practices I should consider, such as separating out a staging schema?
Best regards, and thank you in advance for your help.
Accepted Solutions
12-30-2024 08:35 AM
@jorperort thanks for your question!
To implement the Write-Audit-Publish (WAP) pattern in Databricks using Unity Catalog, workflows, and SQL notebooks, follow these steps:
- Set Up Unity Catalog: Configure Unity Catalog for unified governance across your data assets (an example of schema-level grants is shown after this list).
- Create Schemas: Use separate schemas for staging and production to manage the data lifecycle:
CREATE SCHEMA IF NOT EXISTS staging;
CREATE SCHEMA IF NOT EXISTS production;
- Develop SQL Notebooks: Write SQL notebooks for the ingestion, validation, and transformation tasks (a minimal SQL sketch of these steps follows this list):
- Ingest data into staging.
- Validate and transform the data.
- Publish to production.
- Automate with Workflows: Set up Databricks Workflows to run the notebooks in sequence: ingest, validate, transform, and publish (an example job definition appears after this list).
- Follow WAP Steps:
- Write: Load raw data into the staging schema.
- Audit: Validate and transform data within staging.
- Publish: Move validated data to production.
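
On the governance side, Unity Catalog lets you keep staging writable only by the pipeline while exposing production read-only to consumers. A minimal sketch, assuming a catalog named `main`, a service principal `wap_pipeline_sp`, and a group `analysts` (all illustrative names):

```sql
-- Only the pipeline's service principal can write to staging
-- (grantees also need USE CATALOG on the parent catalog)
GRANT USE SCHEMA, CREATE TABLE, MODIFY, SELECT ON SCHEMA main.staging TO `wap_pipeline_sp`;

-- Consumers get read-only access to production
GRANT USE SCHEMA, SELECT ON SCHEMA main.production TO `analysts`;
```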
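For the notebook steps, here is a minimal SQL sketch of the three WAP stages. The catalog `main`, the source table `main.raw.events`, and the null check on `event_id` are illustrative assumptions; adapt them to your own tables and validation rules:

```sql
-- Write: load raw data into the staging schema
CREATE OR REPLACE TABLE main.staging.events AS
SELECT * FROM main.raw.events;

-- Audit: raise_error() makes this cell fail,
-- which stops the workflow before anything is published
SELECT
  CASE
    WHEN count_if(event_id IS NULL) > 0
      THEN raise_error('Audit failed: NULL event_id values found in staging')
    ELSE 'audit passed'
  END AS audit_result
FROM main.staging.events;

-- Publish: promote the validated data to production
CREATE OR REPLACE TABLE main.production.events AS
SELECT * FROM main.staging.events;
```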
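To wire those notebooks together with Databricks Workflows, a job along these lines runs them in sequence and halts if the audit task fails. This is a sketch in Databricks Asset Bundles YAML; the job name, task keys, and notebook paths are assumptions for illustration, and the same job can be built in the Workflows UI or via the Jobs API:

```yaml
# Sketch of a sequential WAP job (Asset Bundles resource file).
# Notebook paths and names are illustrative; compute settings omitted.
resources:
  jobs:
    wap_pipeline:
      name: wap_pipeline
      tasks:
        - task_key: write
          notebook_task:
            notebook_path: ./notebooks/01_write_to_staging.sql
        - task_key: audit
          depends_on:
            - task_key: write
          notebook_task:
            notebook_path: ./notebooks/02_audit_staging.sql
        - task_key: publish
          depends_on:
            - task_key: audit
          notebook_task:
            notebook_path: ./notebooks/03_publish_to_production.sql
```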
12-30-2024 08:44 AM - edited 12-30-2024 08:45 AM
Hi @jorperort ,
In addition to the nice step-by-step instructions that @VZLA provided, you can also take a look at a short presentation of the WAP pattern on the official Databricks YouTube channel:
https://youtu.be/4K3zAmUgViE?t=492

