<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: The Open Source DLT Meta Framework in Community Articles</title>
    <link>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/126785#M520</link>
    <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/85833"&gt;@sridharplv&lt;/a&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 29 Jul 2025 10:38:03 GMT</pubDate>
    <dc:creator>RiyazAliM</dc:creator>
    <dc:date>2025-07-29T10:38:03Z</dc:date>
    <item>
      <title>The Open Source DLT Meta Framework</title>
      <link>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/125926#M486</link>
      <description>&lt;P&gt;DLT Meta is an open-source framework developed by Databricks Labs that enables the automation of bronze and silver data pipelines through metadata configuration rather than manual code development.&lt;/P&gt;&lt;P&gt;At its core, the framework uses a &lt;STRONG&gt;Dataflowspec&lt;/STRONG&gt; - a JSON-based specification file that contains all the metadata needed to define source connections, target schemas, data quality rules, and transformation logic.&lt;/P&gt;&lt;P&gt;A high-level process flow is depicted below:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="aayrm5_1-1753154791734.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/18330i2597D307C3E865E8/image-size/large?v=v2&amp;amp;px=999" role="button" title="aayrm5_1-1753154791734.png" alt="aayrm5_1-1753154791734.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How DLT Meta Works: The framework operates through three key components:&lt;/P&gt;&lt;H4&gt;&lt;STRONG&gt;1. Onboarding JSON (Dataflowspec):&lt;/STRONG&gt;&lt;/H4&gt;&lt;P&gt;This metadata file defines the source details, source format, and the bronze, silver, and gold table details along with their storage locations (catalog &amp;amp; schema).&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;{
  "tables": [
    {
      "source_format": "cloudFiles",
      "source_details": {
        "source_path": "/path/to/source",
        "source_schema_path": "/path/to/schema"
      },
      "target_format": "delta",
      "target_details": {
        "database": "bronze_db",
        "table": "customer_data"
      }
    }
  ]
}&lt;/LI-CODE&gt;&lt;H4&gt;&lt;STRONG&gt;2. Data Quality Expectations:&lt;/STRONG&gt;&lt;/H4&gt;&lt;P class=""&gt;This is a separate JSON file that defines the quality rules to be applied to the bronze and bronze quarantine tables:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;{
   "expect_or_drop": {
      "no_rescued_data": "_rescued_data IS NULL",
      "valid_id": "id IS NOT NULL",
      "valid_operation": "operation IN ('APPEND', 'DELETE', 'UPDATE')"
   },
   "expect_or_quarantine": {
      "quarantine_rule": "_rescued_data IS NOT NULL OR id IS NULL OR operation IS NULL"
   }
}&lt;/LI-CODE&gt;&lt;H4&gt;&lt;STRONG&gt;3. Silver Transformations&lt;/STRONG&gt;&lt;/H4&gt;&lt;P class=""&gt;Business logic transformations, defined as SQL, to be applied to the bronze tables to create the silver layer:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;[
  {
    "target_table": "customers_silver",
    "select_exp": [
      "address",
      "email",
      "firstname",
      "id",
      "lastname",
      "operation_date",
      "operation",
      "_rescued_data"
    ]
  },
  {
    "target_table": "transactions_silver",
    "select_exp": [
      "id",
      "customer_id",
      "amount",
      "item_count",
      "operation_date",
      "operation",
      "_rescued_data"
    ]
  }
]&lt;/LI-CODE&gt;&lt;P&gt;Once you have created all of the JSON files, you can deploy them to create the spec tables using the onboarding dataflow spec script in the src folder. I've created an onboarding job whose parameters are passed to the notebook via dbutils widgets.&lt;/P&gt;&lt;P&gt;The notebook is shown below:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="aayrm5_3-1753155396659.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/18332i72C54A64925307B2/image-size/large?v=v2&amp;amp;px=999" role="button" title="aayrm5_3-1753155396659.png" alt="aayrm5_3-1753155396659.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The parameters passed to the onboarding job are as follows:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="aayrm5_4-1753155477157.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/18333iB901A61CA9EFCB58/image-size/medium?v=v2&amp;amp;px=400" role="button" title="aayrm5_4-1753155477157.png" alt="aayrm5_4-1753155477157.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Once the onboarding job runs successfully, you'll have bronze, silver, and gold spec tables that your DLT job takes as its configuration.&lt;/P&gt;&lt;P&gt;The typical process looks like this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="aayrm5_2-1753155019763.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/18331i665B098CAAAB93FA/image-size/large?v=v2&amp;amp;px=999" role="button" title="aayrm5_2-1753155019763.png" alt="aayrm5_2-1753155019763.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Let's proceed to create the DLT pipeline to execute the medallion flow defined in the onboarding JSON and stored in the spec tables.&lt;/P&gt;&lt;P&gt;The JSON config to create the DLT pipeline is as follows:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;{
    "pipeline_type": "WORKSPACE",
    "clusters": [
        {
            "label": "default",
            "node_type_id": "Standard_D3_v2",
            "driver_node_type_id": "Standard_D3_v2",
            "num_workers": 1
        }
    ],
    "development": true,
    "continuous": false,
    "channel": "CURRENT",
    "photon": false,
    "libraries": [
        {
            "notebook": {
                "path": "path/to/the/dlt_meta_notebook"
            }
        }
    ],
    "name": "your_dlt_pipeline_name",
    "edition": "ADVANCED",
    "catalog": "catalog_name",
    "configuration": {
        "layer": "bronze_silver_gold",
        "bronze.dataflowspecTable": "&amp;lt;bronze_spec_table_details&amp;gt;",
        "bronze.group": "&amp;lt;dataflow_group_defined_in_the_onboarding&amp;gt;",
        "silver.dataflowspecTable": "&amp;lt;silver_spec_table_details&amp;gt;",
        "silver.group": "&amp;lt;dataflow_group_defined_in_the_onboarding&amp;gt;",
        "gold.dataflowspecTable": "&amp;lt;gold_spec_table_details&amp;gt;",
        "gold.group": "&amp;lt;dataflow_group_defined_in_the_onboarding&amp;gt;"
    },
    "schema": "&amp;lt;schema_name&amp;gt;"
}&lt;/LI-CODE&gt;&lt;P&gt;Setting the layer to bronze_silver_gold triggers all the tables across the three layers defined in the spec tables.&lt;/P&gt;&lt;P&gt;The dlt_meta_notebook defined in the source code is shown below:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="aayrm5_5-1753157095037.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/18334i9A1B4A74492ADDB3/image-size/large?v=v2&amp;amp;px=999" role="button" title="aayrm5_5-1753157095037.png" alt="aayrm5_5-1753157095037.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;When you finally start the pipeline, it will request resources from the cloud provider (or from Databricks, if it's serverless) and initiate the DAG for your pipeline.&lt;/P&gt;&lt;P&gt;The DAG for my use case, a combination of streaming tables and materialized views, is shown below:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="aayrm5_6-1753157194894.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/18336i4CB224F570FE7F3E/image-size/large?v=v2&amp;amp;px=999" role="button" title="aayrm5_6-1753157194894.png" alt="aayrm5_6-1753157194894.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;If you want to try this out yourself, take a look at the Databricks Labs GitHub repository:&amp;nbsp;&lt;A href="https://github.com/databrickslabs/dlt-meta" target="_self"&gt;https://github.com/databrickslabs/dlt-meta&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Please let me know if you have any questions. Thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jul 2025 04:09:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/125926#M486</guid>
      <dc:creator>RiyazAliM</dc:creator>
      <dc:date>2025-07-22T04:09:03Z</dc:date>
    </item>
    <item>
      <title>Re: The Open Source DLT Meta Framework</title>
      <link>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/125970#M487</link>
      <description>&lt;P&gt;Great breakdown of DLT Meta’s architecture and process flow. Thanks for sharing,&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/15469"&gt;@RiyazAliM&lt;/a&gt;!&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jul 2025 09:33:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/125970#M487</guid>
      <dc:creator>Advika</dc:creator>
      <dc:date>2025-07-22T09:33:08Z</dc:date>
    </item>
    <item>
      <title>Re: The Open Source DLT Meta Framework</title>
      <link>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/126001#M491</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/152834"&gt;@Advika&lt;/a&gt;&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jul 2025 14:19:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/126001#M491</guid>
      <dc:creator>RiyazAliM</dc:creator>
      <dc:date>2025-07-22T14:19:17Z</dc:date>
    </item>
    <item>
      <title>Re: The Open Source DLT Meta Framework</title>
      <link>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/126249#M497</link>
      <description>&lt;P&gt;Great article, Riyaz. Keep sharing more knowledge!&lt;/P&gt;</description>
      <pubDate>Wed, 23 Jul 2025 18:38:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/126249#M497</guid>
      <dc:creator>sridharplv</dc:creator>
      <dc:date>2025-07-23T18:38:53Z</dc:date>
    </item>
    <item>
      <title>Re: The Open Source DLT Meta Framework</title>
      <link>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/126785#M520</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/85833"&gt;@sridharplv&lt;/a&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 29 Jul 2025 10:38:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/the-open-source-dlt-meta-framework/m-p/126785#M520</guid>
      <dc:creator>RiyazAliM</dc:creator>
      <dc:date>2025-07-29T10:38:03Z</dc:date>
    </item>
  </channel>
</rss>

