<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to Create a Metadata-Driven Data Pipeline in Databricks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-create-metadata-driven-data-pipeline-in-databricks/m-p/127488#M47984</link>
    <description>Forum topic: how to build a metadata-driven data pipeline in Databricks that reads multiple folders and files from a bronze layer, transforms them, and loads the results to Azure SQL and ADLS Gen2, with execution and schema validation controlled by a metadata table.</description>
    <pubDate>Tue, 05 Aug 2025 15:57:23 GMT</pubDate>
    <dc:creator>Pratikmsbsvm</dc:creator>
    <dc:date>2025-08-05T15:57:23Z</dc:date>
    <item>
      <title>How to Create a Metadata-Driven Data Pipeline in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-create-metadata-driven-data-pipeline-in-databricks/m-p/127488#M47984</link>
      <description>&lt;P&gt;I am creating a data pipeline as shown below.&lt;/P&gt;&lt;P&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/18742iA4A6ED179D239E39/image-size/medium?v=v2&amp;amp;px=400" alt="Pipeline diagram" /&gt;&lt;/P&gt;&lt;P&gt;1. Files from multiple input sources land in their respective folders in the bronze layer.&lt;/P&gt;&lt;P&gt;2. Databricks performs the transformations and loads the transformed data to Azure SQL, and also to the ADLS Gen2 silver layer (not shown in the figure).&lt;/P&gt;&lt;P&gt;How can I write PySpark code that reads and transforms multiple folders and multiple files, driven by a metadata table? (I sketch roughly what I have in mind at the end of this post.)&lt;/P&gt;&lt;P&gt;I want to control execution of the code through a metadata table. Is there any other way to parameterize it?&lt;/P&gt;&lt;P&gt;Also, would schema validation be possible with the metadata-table approach?&lt;/P&gt;&lt;P&gt;Please help.&lt;/P&gt;&lt;P&gt;Pardon me if it sounds unrealistic.&lt;/P&gt;&lt;P&gt;Thanks a lot.&lt;/P&gt;
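&lt;P&gt;For illustration, here is roughly the structure I have in mind. This is only a sketch; the control table name, its columns, and the paths are placeholders I made up:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical control table with one row per source feed, e.g. columns:
# source_name, bronze_path, file_format, target_table, is_active
meta = spark.table("control.pipeline_metadata").where("is_active = true")

for row in meta.collect():
    # Read each bronze folder with the format recorded in the metadata
    df = spark.read.format(row["file_format"]).load(row["bronze_path"])
    # ...per-source transformations would go here...
    df.write.mode("overwrite").saveAsTable(row["target_table"])
&lt;/CODE&gt;&lt;/PRE&gt;</description>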
      <pubDate>Tue, 05 Aug 2025 15:57:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-create-metadata-driven-data-pipeline-in-databricks/m-p/127488#M47984</guid>
      <dc:creator>Pratikmsbsvm</dc:creator>
      <dc:date>2025-08-05T15:57:23Z</dc:date>
    </item>
    <item>
      <title>Re: How to Create a Metadata-Driven Data Pipeline in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-create-metadata-driven-data-pipeline-in-databricks/m-p/127491#M47985</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/143693"&gt;@Pratikmsbsvm&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;It's a totally realistic requirement. In fact, you can find many articles that suggest approaches for designing such a control table.&lt;/P&gt;&lt;P&gt;Take, for example, the following article:&lt;/P&gt;&lt;P&gt;&lt;A href="https://medium.com/dbsql-sme-engineering/a-primer-for-metadata-driven-frameworks-with-databricks-workflows-and-sql-b2c8a738d2d5" target="_blank" rel="noopener"&gt;https://medium.com/dbsql-sme-engineering/a-primer-for-metadata-driven-frameworks-with-databricks-workflows-and-sql-b2c8a738d2d5&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Or this one:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.databricks.com/t5/technical-blog/metadata-driven-etl-framework-in-databricks-part-1/ba-p/92666" target="_blank" rel="noopener"&gt;https://community.databricks.com/t5/technical-blog/metadata-driven-etl-framework-in-databricks-part-1/ba-p/92666&lt;/A&gt;&lt;/P&gt;&lt;P&gt;There is also a metadata-driven DLT framework (dlt-meta) that you can try for free:&lt;/P&gt;&lt;P&gt;&lt;A href="https://github.com/databrickslabs/dlt-meta" target="_blank" rel="noopener"&gt;https://github.com/databrickslabs/dlt-meta&lt;/A&gt;&lt;/P&gt;
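&lt;P&gt;To make the idea concrete, below is a rough sketch of a metadata-driven loop with simple schema validation. It is not taken from the linked articles; every table name, column name, and path is hypothetical, and the Azure SQL write is shown only schematically:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical control table, one row per feed, e.g. columns:
# feed_name, bronze_path, file_format, expected_schema (DDL string),
# expected_columns (comma-separated), silver_path, jdbc_url, sql_table, is_active
for row in spark.table("etl.control_table").where("is_active = true").collect():
    # Enforce the expected schema from metadata; .schema() accepts a DDL string
    df = (spark.read.format(row["file_format"])
               .schema(row["expected_schema"])
               .load(row["bronze_path"]))

    # Simple schema validation: fail fast if expected columns are missing
    expected_cols = [c.strip() for c in row["expected_columns"].split(",")]
    missing = [c for c in expected_cols if c not in df.columns]
    if missing:
        raise ValueError(f"Feed {row['feed_name']} is missing columns: {missing}")

    # ...per-feed transformations would go here...

    # Write to the silver layer in ADLS Gen2
    df.write.mode("append").format("delta").save(row["silver_path"])

    # Load to Azure SQL via the JDBC writer (auth options omitted here)
    (df.write.format("jdbc")
       .option("url", row["jdbc_url"])
       .option("dbtable", row["sql_table"])
       .mode("append")
       .save())
&lt;/CODE&gt;&lt;/PRE&gt;</description>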
      <pubDate>Tue, 05 Aug 2025 17:14:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-create-metadata-driven-data-pipeline-in-databricks/m-p/127491#M47985</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-08-05T17:14:45Z</dc:date>
    </item>
  </channel>
</rss>

