<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: DLT - runtime parameterisation of execution in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/65028#M32727</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;Can you please provide a reference for the REST API approach? I do not see it in the docs.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;TIA&lt;/P&gt;</description>
    <pubDate>Fri, 29 Mar 2024 19:00:44 GMT</pubDate>
    <dc:creator>data-engineer-d</dc:creator>
    <dc:date>2024-03-29T19:00:44Z</dc:date>
    <item>
      <title>DLT - runtime parameterisation of execution</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/63603#M32292</link>
      <description>&lt;P&gt;I have started to use DLT in a prototype framework and I now face the challenge below, for which any help would be appreciated.&lt;/P&gt;&lt;P&gt;First, some brief context:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;I have metadata sitting in a .json file that I read as the first task and load into a log table with all the relevant attributes (including the list of tables to be processed by the DLT pipeline).&lt;/LI&gt;&lt;LI&gt;That log table holds multiple records, including those of past executions, so I have to filter it down to the current one using a timestamped ID (e.g. IngestAdventureWorks_20240314).&lt;/LI&gt;&lt;LI&gt;For that I need to pass that ID as a parameter to the DLT pipeline so it can be used in a SQL query to find the relevant records and build the list of tables to be processed.&lt;/LI&gt;&lt;LI&gt;When I hardcode it as a key-value pair at design time, I can access those values easily using the &lt;STRONG&gt;spark.conf.get("ID", None)&lt;/STRONG&gt; syntax.&lt;/LI&gt;&lt;/UL&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;My question/challenge is how to pass that parameter either from a task in a workflow (similarly to how I can reference a prior task's output and pass it to a widget in a downstream notebook task) or by executing the DLT pipeline from a notebook.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;That is really important for making the solution truly dynamic, without hardcoded parameter values.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Thanks for any help in advance.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;István&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 13 Mar 2024 19:24:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/63603#M32292</guid>
      <dc:creator>MartinIsti</dc:creator>
      <dc:date>2024-03-13T19:24:43Z</dc:date>
    </item>
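The question above describes a run ID set as a pipeline configuration key and then used to filter a log table. A minimal sketch of that filtering step follows; the log table name, the execution_id column, and the "ID" configuration key are assumptions for illustration, not the poster's actual schema:

```python
# Sketch of the pattern described above: a DLT pipeline reads a run ID
# from its configuration and uses it to select the current run's rows
# from a metadata log table. Table and column names are hypothetical.

def build_log_filter(log_table, run_id):
    """Build the SQL used to pick the current run's metadata records."""
    return (
        f"SELECT * FROM {log_table} "
        f"WHERE execution_id = '{run_id}'"  # e.g. IngestAdventureWorks_20240314
    )

# Inside the DLT notebook this would typically be driven by:
#   run_id = spark.conf.get("ID", None)   # key set in the pipeline settings
# which only works for values known at design time, hence the question.
print(build_log_filter("ops.ingestion_log", "IngestAdventureWorks_20240314"))
```

The helper is deliberately pure so it can be tested outside a Databricks workspace; in the pipeline itself the resulting query would feed the table list for the DLT definitions.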
    <item>
      <title>Re: DLT - runtime parameterisation of execution</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/63930#M32404</link>
      <description>&lt;P&gt;Thanks, Kaniz, for your response. It would have been great to use an approach similar to widgets in a normal notebook; specifying these parameters at design time does not allow the flexibility needed to run my DLT pipeline in a truly metadata-driven way.&lt;/P&gt;&lt;P&gt;I was also heading towards using the Jobs REST API from a notebook, but then I ended up tweaking my configuration tables in a way that I can utilise a hardcoded parameter in the DLT definition and still have it dynamic.&lt;/P&gt;&lt;P&gt;If the REST API call functionality could later be integrated into workflows to pass these values as with other tasks, that would be really great!&lt;/P&gt;&lt;P&gt;I accept it as a solution because your third suggestion would work. I still hope a more integrated approach will come in the future &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 17 Mar 2024 20:25:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/63930#M32404</guid>
      <dc:creator>MartinIsti</dc:creator>
      <dc:date>2024-03-17T20:25:43Z</dc:date>
    </item>
    <item>
      <title>Re: DLT - runtime parameterisation of execution</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/65028#M32727</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;Can you please provide a reference for the REST API approach? I do not see it in the docs.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;TIA&lt;/P&gt;</description>
      <pubDate>Fri, 29 Mar 2024 19:00:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/65028#M32727</guid>
      <dc:creator>data-engineer-d</dc:creator>
      <dc:date>2024-03-29T19:00:44Z</dc:date>
    </item>
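The thread never links the REST API approach, so here is a hedged sketch of how it is commonly done with the public Pipelines API: edit the pipeline's configuration map (PUT /api/2.0/pipelines/pipeline_id), then start an update (POST /api/2.0/pipelines/pipeline_id/updates). The helper names, the "ID" key, and the placeholder host/token are illustrative assumptions:

```python
import json
import urllib.request

def pipeline_edit_payload(pipeline_id, name, params):
    """Body for PUT /api/2.0/pipelines/{pipeline_id}.
    Note: the Edit endpoint replaces the whole pipeline spec, so in
    practice GET the current spec first, merge params into its
    'configuration' map, and PUT the merged spec back."""
    return {"id": pipeline_id, "name": name, "configuration": dict(params)}

def start_update_request(host, token, pipeline_id):
    """Request object for POST /api/2.0/pipelines/{pipeline_id}/updates,
    which triggers a pipeline update after the configuration edit."""
    return urllib.request.Request(
        f"{host}/api/2.0/pipelines/{pipeline_id}/updates",
        data=json.dumps({}).encode(),
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )

if __name__ == "__main__":
    # Placeholders only: a real workspace URL, PAT and pipeline id are needed.
    body = pipeline_edit_payload(
        "1234-abcd", "IngestAdventureWorks",
        {"ID": "IngestAdventureWorks_20240314"})
    print(json.dumps(body))
```

The payload builders are pure so they can be checked without a workspace; only the final urlopen call (omitted here) requires credentials.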
    <item>
      <title>Re: DLT - runtime parameterisation of execution</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/84303#M37180</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/42628"&gt;@MartinIsti&lt;/a&gt;, how did you manage to tweak the metadata to handle this dynamically? Could you please elaborate on what you described below?&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"I ended up tweaking my configuration tables in a way that I can utilise a hardcoded parameter in the DLT definition and still have it dynamic."&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 27 Aug 2024 10:10:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/84303#M37180</guid>
      <dc:creator>Vamshikrishna_r</dc:creator>
      <dc:date>2024-08-27T10:10:56Z</dc:date>
    </item>
    <item>
      <title>Re: DLT - runtime parameterisation of execution</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/85057#M37228</link>
      <description>&lt;P&gt;Sure, and for the record I'm still not fully happy with how parameters need to be set at design time.&lt;/P&gt;&lt;P&gt;As mentioned, I store the metadata in a .json file that I read using a standard notebook. I then save its content to DBFS as a Delta table, overwriting any previous version. The DLT notebook reads from that table, and I only need to specify the name of the process (e.g. IngestAdventureWorks), which matches the name of the DLT pipeline itself (or can be derived from it).&lt;/P&gt;&lt;P&gt;Once I determine which table to read from, the DLT pipeline can be driven by the metadata in that table.&lt;/P&gt;&lt;P&gt;I still find dealing with DLT inconsistent with the orchestration of standard notebook-driven data handling; it is an odd one out that mostly needs a slightly different way of handling. So far, though, I have found a workaround for each of these small inconsistencies.&lt;/P&gt;</description>
      <pubDate>Tue, 27 Aug 2024 21:50:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/85057#M37228</guid>
      <dc:creator>MartinIsti</dc:creator>
      <dc:date>2024-08-27T21:50:01Z</dc:date>
    </item>
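The metadata-driven workaround described above (a config Delta table keyed by a process name that matches, or is derived from, the pipeline name) is commonly implemented by filtering the config rows and generating DLT tables in a loop. This is a sketch under assumed table and column names, not the author's exact code:

```python
# Sketch of a metadata-driven DLT notebook (hypothetical names throughout).
# The process name selects this pipeline's rows from the config table.

def select_process_rows(rows, process_name):
    """Pure helper: keep only the metadata rows for this process."""
    return [r for r in rows if r["process"] == process_name]

# In the actual pipeline this would continue roughly as:
#
#   import dlt
#   meta = spark.read.table("ops.pipeline_config")
#   rows = select_process_rows(
#       [r.asDict() for r in meta.collect()],
#       spark.conf.get("process", "IngestAdventureWorks"))
#   for row in rows:
#       @dlt.table(name=row["target_table"])
#       def make_table(src=row["source_path"]):  # default arg pins the value
#           return spark.read.format("json").load(src)
```

Binding row values through default arguments avoids Python's late-binding closure pitfall when generating several tables in one loop.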
    <item>
      <title>Re: DLT - runtime parameterisation of execution</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/86149#M37293</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/42628"&gt;@MartinIsti&lt;/a&gt;&amp;nbsp;thanks for your detailed explanation.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Aug 2024 05:42:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-runtime-parameterisation-of-execution/m-p/86149#M37293</guid>
      <dc:creator>Vamshikrishna_r</dc:creator>
      <dc:date>2024-08-29T05:42:42Z</dc:date>
    </item>
  </channel>
</rss>

