<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Jobs and Pipeline input parameter in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/jobs-and-pipeline-input-parameter/m-p/134011#M49987</link>
    <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/188930"&gt;@Ritesh-Dhumne&lt;/a&gt;&amp;nbsp; &lt;span class="lia-unicode-emoji" title=":waving_hand:"&gt;👋&lt;/span&gt;,&lt;/P&gt;&lt;P&gt;First, have you thought about moving to the Free Edition instead of the Community Edition?&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/getting-started/free-edition" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/getting-started/free-edition&lt;/A&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;Based on your query, here are a couple of options you could consider:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;1. If you're setting up a&amp;nbsp;&lt;STRONG&gt;Job&lt;/STRONG&gt; with two &lt;STRONG&gt;Tasks&lt;/STRONG&gt;, one per notebook, you could create a &lt;EM&gt;&lt;STRONG&gt;Job Parameter&lt;/STRONG&gt;&lt;/EM&gt; or a &lt;EM&gt;&lt;STRONG&gt;Task Parameter&lt;/STRONG&gt;&lt;/EM&gt;. I'd read up on these in the documentation; the docs are great:&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/jobs/parameters" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/jobs/parameters&lt;/A&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="BS_THE_ANALYST_1-1759821809948.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/20511i1A4A253E20BC8C08/image-size/large?v=v2&amp;amp;px=999" role="button" title="BS_THE_ANALYST_1-1759821809948.png" alt="BS_THE_ANALYST_1-1759821809948.png" /&gt;&lt;/span&gt;&lt;BR /&gt;&lt;BR /&gt;2. If you don't want to create a job, you could use the&amp;nbsp;&lt;STRONG&gt;magic command&lt;/STRONG&gt;&amp;nbsp;&lt;U&gt;%run&lt;/U&gt;, which lets you run another notebook from within your current notebook. 
You could configure some&amp;nbsp;&lt;STRONG&gt;Widgets&lt;/STRONG&gt;, which are basically your parameters &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;.&lt;BR /&gt;&lt;A href="https://docs.databricks.com/aws/en/notebooks/widgets#use-databricks-widgets-with-run" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/notebooks/widgets#use-databricks-widgets-with-run&lt;/A&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="BS_THE_ANALYST_0-1759821702054.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/20510iC6D5E0B12FE506E9/image-size/large?v=v2&amp;amp;px=999" role="button" title="BS_THE_ANALYST_0-1759821702054.png" alt="BS_THE_ANALYST_0-1759821702054.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Feel free to ask any follow-up questions &lt;span class="lia-unicode-emoji" title=":flexed_biceps:"&gt;💪&lt;/span&gt;.&lt;BR /&gt;&lt;BR /&gt;Please keep us updated on the project, it sounds really cool! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; To level up the project, perhaps consider using Notebook 1 to create a table rather than outputting to a file. This way, you can leverage Delta Lake and get things like lineage and table history. Lots of cool features to explore! &lt;span class="lia-unicode-emoji" title=":grinning_face:"&gt;😀&lt;/span&gt;&lt;BR /&gt;&lt;BR /&gt;All the best,&lt;BR /&gt;BS&lt;/P&gt;</description>
    <pubDate>Tue, 07 Oct 2025 07:34:50 GMT</pubDate>
    <dc:creator>BS_THE_ANALYST</dc:creator>
    <dc:date>2025-10-07T07:34:50Z</dc:date>
    <item>
      <title>Jobs and Pipeline input parameter</title>
      <link>https://community.databricks.com/t5/data-engineering/jobs-and-pipeline-input-parameter/m-p/134006#M49986</link>
      <description>&lt;P&gt;I want to extract all the files in the volume I have uploaded in Notebook 1, and then in Notebook 2 perform basic transformations on every file, such as handling missing values and nulls. I also want to store the null and dirty records separately from a clean DataFrame, for all the files. This is in Community Edition, using Jobs and Pipelines.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Oct 2025 05:05:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jobs-and-pipeline-input-parameter/m-p/134006#M49986</guid>
      <dc:creator>Ritesh-Dhumne</dc:creator>
      <dc:date>2025-10-07T05:05:25Z</dc:date>
    </item>
    <item>
      <title>Re: Jobs and Pipeline input parameter</title>
      <link>https://community.databricks.com/t5/data-engineering/jobs-and-pipeline-input-parameter/m-p/134011#M49987</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/188930"&gt;@Ritesh-Dhumne&lt;/a&gt;&amp;nbsp; &lt;span class="lia-unicode-emoji" title=":waving_hand:"&gt;👋&lt;/span&gt;,&lt;/P&gt;&lt;P&gt;First, have you thought about moving to the Free Edition instead of the Community Edition?&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/getting-started/free-edition" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/getting-started/free-edition&lt;/A&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;Based on your query, here are a couple of options you could consider:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;1. If you're setting up a&amp;nbsp;&lt;STRONG&gt;Job&lt;/STRONG&gt; with two &lt;STRONG&gt;Tasks&lt;/STRONG&gt;, one per notebook, you could create a &lt;EM&gt;&lt;STRONG&gt;Job Parameter&lt;/STRONG&gt;&lt;/EM&gt; or a &lt;EM&gt;&lt;STRONG&gt;Task Parameter&lt;/STRONG&gt;&lt;/EM&gt;. I'd read up on these in the documentation; the docs are great:&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/jobs/parameters" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/jobs/parameters&lt;/A&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="BS_THE_ANALYST_1-1759821809948.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/20511i1A4A253E20BC8C08/image-size/large?v=v2&amp;amp;px=999" role="button" title="BS_THE_ANALYST_1-1759821809948.png" alt="BS_THE_ANALYST_1-1759821809948.png" /&gt;&lt;/span&gt;&lt;BR /&gt;&lt;BR /&gt;2. If you don't want to create a job, you could use the&amp;nbsp;&lt;STRONG&gt;magic command&lt;/STRONG&gt;&amp;nbsp;&lt;U&gt;%run&lt;/U&gt;, which lets you run another notebook from within your current notebook. 
You could configure some&amp;nbsp;&lt;STRONG&gt;Widgets&lt;/STRONG&gt;, which are basically your parameters &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;.&lt;BR /&gt;&lt;A href="https://docs.databricks.com/aws/en/notebooks/widgets#use-databricks-widgets-with-run" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/notebooks/widgets#use-databricks-widgets-with-run&lt;/A&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="BS_THE_ANALYST_0-1759821702054.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/20510iC6D5E0B12FE506E9/image-size/large?v=v2&amp;amp;px=999" role="button" title="BS_THE_ANALYST_0-1759821702054.png" alt="BS_THE_ANALYST_0-1759821702054.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Feel free to ask any follow-up questions &lt;span class="lia-unicode-emoji" title=":flexed_biceps:"&gt;💪&lt;/span&gt;.&lt;BR /&gt;&lt;BR /&gt;Please keep us updated on the project, it sounds really cool! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; To level up the project, perhaps consider using Notebook 1 to create a table rather than outputting to a file. This way, you can leverage Delta Lake and get things like lineage and table history. Lots of cool features to explore! &lt;span class="lia-unicode-emoji" title=":grinning_face:"&gt;😀&lt;/span&gt;&lt;BR /&gt;&lt;BR /&gt;All the best,&lt;BR /&gt;BS&lt;/P&gt;</description>
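As a rough sketch of the parameter idea in the reply above: a notebook can declare a widget and read its value, and a Job or Task Parameter with the same name should override the widget default when the notebook runs as a task. The parameter name `source_path`, the volume path, and the helper `get_param` are hypothetical, chosen only for illustration. The fallback branch exists so the snippet also runs outside Databricks, where `dbutils` is not defined.

```python
def get_param(name: str, default: str) -> str:
    """Return a notebook parameter, or `default` when not running on Databricks."""
    try:
        # `dbutils` is injected by the Databricks notebook runtime; it cannot be imported.
        dbutils.widgets.text(name, default)  # declare the widget with a default value
        return dbutils.widgets.get(name)     # read the current (or job-supplied) value
    except NameError:
        # No `dbutils` in scope, e.g. a local run: fall back to the default.
        return default


# Hypothetical parameter name and volume path, for illustration only.
source_path = get_param("source_path", "/Volumes/my_catalog/my_schema/my_volume/raw")
print(source_path)
```

Reading the value through one small helper keeps the notebook usable both interactively and as a job task, which is handy while experimenting.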
      <pubDate>Tue, 07 Oct 2025 07:34:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jobs-and-pipeline-input-parameter/m-p/134011#M49987</guid>
      <dc:creator>BS_THE_ANALYST</dc:creator>
      <dc:date>2025-10-07T07:34:50Z</dc:date>
    </item>
    <item>
      <title>Re: Jobs and Pipeline input parameter</title>
      <link>https://community.databricks.com/t5/data-engineering/jobs-and-pipeline-input-parameter/m-p/134029#M49991</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;A href="https://community.databricks.com/t5/user/viewprofilepage/user-id/188930" target="_blank" rel="noopener"&gt;@Ritesh-Dhumne&lt;/A&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;I'm assuming you meant Free Edition rather than Community Edition, since you're using volumes, which are not available in Community Edition.&lt;/P&gt;&lt;P&gt;I’m not sure if I’ve understood your approach correctly, but at first glance it seems incorrect: you can’t pass a DataFrame between tasks. What you can do is load all the files from the volume into a bronze table in Notebook1. You can use the special &lt;STRONG&gt;_metadata&lt;/STRONG&gt; column to record the file_path that each row originates from. Here’s an example of how to use it:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="szymon_dybczak_0-1759826052216.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/20520i2A416FCE8F8F729C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="szymon_dybczak_0-1759826052216.png" alt="szymon_dybczak_0-1759826052216.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Then, in Notebook2, you can apply your transformations to this bronze table. You can count nulls, handle dirty data, and trace each of these issues back to a particular file, since that information is added to the bronze table through the special _metadata column.&lt;/P&gt;&lt;P&gt;From what I can see, you're still learning, so I won't introduce Auto Loader, which is pretty handy for ingesting files &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
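The two-notebook flow described in the reply above could look roughly like the sketch below. It is not the code from the screenshot. It assumes the files are CSV with a header row, and the function names, the `source_file` column name, and the idea of treating "dirty" as "has a null in a required column" are all hypothetical choices made for illustration. The pyspark imports sit inside the functions so the definitions load even where pyspark is not installed.

```python
from functools import reduce


def load_bronze(spark, source_path: str, table_name: str) -> None:
    """Notebook 1: load every file under source_path into one bronze table."""
    from pyspark.sql.functions import col

    bronze = (
        spark.read.format("csv")       # assumption: the uploaded files are CSV
        .option("header", "true")
        .load(source_path)
        # _metadata is a hidden column, so it must be selected explicitly.
        .select("*", col("_metadata.file_path").alias("source_file"))
    )
    bronze.write.mode("overwrite").saveAsTable(table_name)


def split_clean_dirty(df, required_cols):
    """Notebook 2: return (clean, dirty); dirty rows have a null in any required column."""
    if not required_cols:
        raise ValueError("required_cols must name at least one column")
    from pyspark.sql.functions import col

    # OR together one isNull() check per required column.
    has_null = reduce(lambda a, b: a | b, [col(c).isNull() for c in required_cols])
    return df.filter(~has_null), df.filter(has_null)
```

In Notebook 2 this might be called as `clean, dirty = split_clean_dirty(spark.table("bronze_files"), ["id", "amount"])`, where the table and column names are placeholders. Because `source_file` travels with every row, both outputs can still be grouped by the file they came from.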
      <pubDate>Tue, 07 Oct 2025 08:36:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jobs-and-pipeline-input-parameter/m-p/134029#M49991</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-10-07T08:36:47Z</dc:date>
    </item>
  </channel>
</rss>

