<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Conditional Execution in DLT Pipeline based on the output in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/conditional-execution-in-dlt-pipeline-based-on-the-output/m-p/139237#M51121</link>
    <description>&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;In Delta Live Tables (DLT), native conditional or branch-based control flow is limited; all table/stream definitions declared in your pipeline will execute, and dependencies are handled via &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;.table or &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;.view decorators. You can’t dynamically skip or instantiate pipeline steps based purely on DataFrame content as you might in classic ETL tools—but there are best-practice patterns to approximate conditional logic for cases like threshold-based validation, especially with differing schemas.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Core Recommendation&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;You should&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;separate your validation check from your log/error handling steps&lt;/STRONG&gt;, and use filtering and table dependencies to "gate" data to each path, rather than trying to switch between DataFrames with differing schemas in the same step.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Pattern: Split Tables and Data Filters&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Validation Table:&lt;/STRONG&gt;&lt;BR /&gt;Create a DLT table or view that applies your threshold check and only outputs rows passing the check.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Failure Logs Table:&lt;/STRONG&gt;&lt;BR /&gt;Create a separate DLT table or view that selects only rows failing your check, and transforms them into your failure schema. You can derive this from the same base input (or from the validation table, applying the inverse filter).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Downstream Consumption:&lt;/STRONG&gt;&lt;BR /&gt;Downstream steps will only see non-empty tables if the corresponding condition occurs. You can avoid writing zero-row tables by adding checks or limits in your final write logic, though DLT will still instantiate both tables as part of the pipeline graph.&lt;/P&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Example:&lt;/P&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded-lg font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[calc(var(--header-height)+var(--size-xs))]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-lg text-xs font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;&lt;SPAN class="token token"&gt;import&lt;/SPAN&gt; dlt

&lt;SPAN class="token token decorator annotation punctuation"&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;table&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;def&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;raw_data&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
  &lt;SPAN class="token token"&gt;return&lt;/SPAN&gt; spark&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;read&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;table&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"source_table"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;

&lt;SPAN class="token token decorator annotation punctuation"&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;table&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;def&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;passed_data&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
  df &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; dlt&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;read&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"raw_data"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
  &lt;SPAN class="token token"&gt;return&lt;/SPAN&gt; df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;filter&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;metric &lt;SPAN class="token token operator"&gt;&amp;gt;&lt;/SPAN&gt; threshold&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;

&lt;SPAN class="token token decorator annotation punctuation"&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;table&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;def&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;failed_logs&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
  df &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; dlt&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;read&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"raw_data"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
  failed &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;filter&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;metric &lt;SPAN class="token token operator"&gt;&amp;lt;=&lt;/SPAN&gt; threshold&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
  &lt;SPAN class="token token"&gt;# Map failure schema&lt;/SPAN&gt;
  &lt;SPAN class="token token"&gt;return&lt;/SPAN&gt; failed&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;selectExpr&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"id"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;"metric"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;"logMessage"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This way, only the appropriate table receives data per run, but both destinations are defined in your graph.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Avoiding Zero-Row Tables&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Optional:&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;Use a final step to avoid writing downstream if a table has no data. DLT will materialize the step, but nothing will be written or processed downstream if the table/view is empty.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This does not prevent DLT from managing the table, but can keep your metrics cleaner.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Why Not Dynamic Branching?&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;DLT's declarative framework builds graphs at pipeline start time—you can't declare pipeline steps dynamically. However,&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;by using DataFrame filters and data-dependent table population&lt;/STRONG&gt;, you mimic conditional behavior within the limitations of the framework.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Handling Different Schemas&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Never return different schema DataFrames from the same output step&lt;/STRONG&gt;. Instead, output to separately-defined tables/views, each with a fixed schema, so DLT's schema expectations are satisfied. This keeps the pipeline reliable and clear.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Summary Table&lt;/H2&gt;
&lt;DIV class="group relative"&gt;
&lt;DIV class="w-full overflow-x-auto md:max-w-[90vw] border-subtlest ring-subtlest divide-subtlest bg-transparent"&gt;
&lt;TABLE class="border-subtler my-[1em] w-full table-auto border-separate border-spacing-0 border-l border-t"&gt;
&lt;THEAD class="bg-subtler"&gt;
&lt;TR&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Requirement&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;DLT Solution&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Notes&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Conditional logic&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Data filters&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Split via filtered tables/views&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Different schemas&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Distinct tables/views&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Never change schema mid-pipeline&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Output only logs on fail&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Logs table populated only if failed rows exist&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;DLT materializes the table regardless; avoid zero-result processing downstream&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;DIV class="bg-base border-subtler shadow-subtle pointer-coarse:opacity-100 right-xs absolute bottom-0 flex rounded-lg border opacity-0 transition-opacity group-hover:opacity-100 [&amp;amp;&amp;gt;*:not(:first-child)]:border-subtle [&amp;amp;&amp;gt;*:not(:first-child)]:border-l"&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;References&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Databricks forums and official docs recommend this pattern for validations, error handling, and conditional branches in DLT.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;If you need nested conditional logic or true "branching," you might need to orchestrate pipelines via external tools (Databricks Jobs, Airflow, etc.) or preprocess thresholds before ingesting to DLT. For most operational use cases, filtering/gating via tables as described above is the validated approach.&lt;/P&gt;</description>
    <pubDate>Sun, 16 Nov 2025 17:48:23 GMT</pubDate>
    <dc:creator>mark_ott</dc:creator>
    <dc:date>2025-11-16T17:48:23Z</dc:date>
    <item>
      <title>Conditional Execution in DLT Pipeline based on the output</title>
      <link>https://community.databricks.com/t5/data-engineering/conditional-execution-in-dlt-pipeline-based-on-the-output/m-p/84172#M37150</link>
      <description>&lt;P&gt;Hello ,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;I'm working on a Delta Live Tables (DLT) pipeline where I need to implement a conditional step that only triggers under specific conditions. Here's the challenge I'm facing:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;I have a function that checks if the data meets certain thresholds. If the data passes, the function returns the original DataFrame. If the thresholds are breached, it returns a different DataFrame containing logs or failure statistics.&lt;/LI&gt;&lt;LI&gt;The main challenge is that these two DataFrames have different schemas, which is causing difficulties in the pipeline&lt;/LI&gt;&lt;LI&gt;I only want to initiate the logs-related steps in DLT when the threshold check fails. Currently, my pipeline writes two outputs regardless: one for the logs and one for the passed data, even if the passed data count is zero. This isn't the behavior I want.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Question:&lt;/STRONG&gt;&lt;BR /&gt;Is there a way to structure the DLT pipeline so that the logs path graph or process, is only initiated when the threshold check fails? Ideally, l'm looking for a nested or conditional DLT step that only runs when the threshold validation fails.&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The fact that DLT doesn't have built-in flow control mechanism like ETL tools, is challenging.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;BR /&gt;Any guidance or best practices for achieving this would be greatly appreciated!&lt;BR /&gt;Thanks in advance for your help!&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 25 Aug 2024 21:35:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/conditional-execution-in-dlt-pipeline-based-on-the-output/m-p/84172#M37150</guid>
      <dc:creator>rpilli</dc:creator>
      <dc:date>2024-08-25T21:35:47Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Execution in DLT Pipeline based on the output</title>
      <link>https://community.databricks.com/t5/data-engineering/conditional-execution-in-dlt-pipeline-based-on-the-output/m-p/139237#M51121</link>
      <description>&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;In Delta Live Tables (DLT), native conditional or branch-based control flow is limited; all table/stream definitions declared in your pipeline will execute, and dependencies are handled via &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;.table or &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;.view decorators. You can’t dynamically skip or instantiate pipeline steps based purely on DataFrame content as you might in classic ETL tools—but there are best-practice patterns to approximate conditional logic for cases like threshold-based validation, especially with differing schemas.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Core Recommendation&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;You should&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;separate your validation check from your log/error handling steps&lt;/STRONG&gt;, and use filtering and table dependencies to "gate" data to each path, rather than trying to switch between DataFrames with differing schemas in the same step.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Pattern: Split Tables and Data Filters&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Validation Table:&lt;/STRONG&gt;&lt;BR /&gt;Create a DLT table or view that applies your threshold check and only outputs rows passing the check.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Failure Logs Table:&lt;/STRONG&gt;&lt;BR /&gt;Create a separate DLT table or view that selects only rows failing your check, and transforms them into your failure schema. You can derive this from the same base input (or from the validation table, applying the inverse filter).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Downstream Consumption:&lt;/STRONG&gt;&lt;BR /&gt;Downstream steps will only see non-empty tables if the corresponding condition occurs. You can avoid writing zero-row tables by adding checks or limits in your final write logic, though DLT will still instantiate both tables as part of the pipeline graph.&lt;/P&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Example:&lt;/P&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded-lg font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[calc(var(--header-height)+var(--size-xs))]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-lg text-xs font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;&lt;SPAN class="token token"&gt;import&lt;/SPAN&gt; dlt

&lt;SPAN class="token token decorator annotation punctuation"&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;table&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;def&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;raw_data&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
  &lt;SPAN class="token token"&gt;return&lt;/SPAN&gt; spark&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;read&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;table&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"source_table"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;

&lt;SPAN class="token token decorator annotation punctuation"&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;table&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;def&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;passed_data&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
  df &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; dlt&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;read&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"raw_data"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
  &lt;SPAN class="token token"&gt;return&lt;/SPAN&gt; df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;filter&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;metric &lt;SPAN class="token token operator"&gt;&amp;gt;&lt;/SPAN&gt; threshold&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;

&lt;SPAN class="token token decorator annotation punctuation"&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97035"&gt;@Dlt&lt;/a&gt;&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token decorator annotation punctuation"&gt;table&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;def&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;failed_logs&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
  df &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; dlt&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;read&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"raw_data"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
  failed &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;filter&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;metric &lt;SPAN class="token token operator"&gt;&amp;lt;=&lt;/SPAN&gt; threshold&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
  &lt;SPAN class="token token"&gt;# Map failure schema&lt;/SPAN&gt;
  &lt;SPAN class="token token"&gt;return&lt;/SPAN&gt; failed&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;selectExpr&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"id"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;"metric"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;"logMessage"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This way, only the appropriate table receives data per run, but both destinations are defined in your graph.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Avoiding Zero-Row Tables&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Optional:&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;Use a final step to avoid writing downstream if a table has no data. DLT will materialize the step, but nothing will be written or processed downstream if the table/view is empty.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This does not prevent DLT from managing the table, but can keep your metrics cleaner.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Why Not Dynamic Branching?&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;DLT's declarative framework builds graphs at pipeline start time—you can't declare pipeline steps dynamically. However,&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;by using DataFrame filters and data-dependent table population&lt;/STRONG&gt;, you mimic conditional behavior within the limitations of the framework.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Handling Different Schemas&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Never return different schema DataFrames from the same output step&lt;/STRONG&gt;. Instead, output to separately-defined tables/views, each with a fixed schema, so DLT's schema expectations are satisfied. This keeps the pipeline reliable and clear.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Summary Table&lt;/H2&gt;
&lt;DIV class="group relative"&gt;
&lt;DIV class="w-full overflow-x-auto md:max-w-[90vw] border-subtlest ring-subtlest divide-subtlest bg-transparent"&gt;
&lt;TABLE class="border-subtler my-[1em] w-full table-auto border-separate border-spacing-0 border-l border-t"&gt;
&lt;THEAD class="bg-subtler"&gt;
&lt;TR&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Requirement&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;DLT Solution&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Notes&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Conditional logic&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Data filters&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Split via filtered tables/views&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Different schemas&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Distinct tables/views&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Never change schema mid-pipeline&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Output only logs on fail&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Logs table populated only if failed rows exist&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;DLT materializes the table regardless; avoid zero-result processing downstream&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;DIV class="bg-base border-subtler shadow-subtle pointer-coarse:opacity-100 right-xs absolute bottom-0 flex rounded-lg border opacity-0 transition-opacity group-hover:opacity-100 [&amp;amp;&amp;gt;*:not(:first-child)]:border-subtle [&amp;amp;&amp;gt;*:not(:first-child)]:border-l"&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;References&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Databricks forums and official docs recommend this pattern for validations, error handling, and conditional branches in DLT.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;If you need nested conditional logic or true "branching," you might need to orchestrate pipelines via external tools (Databricks Jobs, Airflow, etc.) or preprocess thresholds before ingesting to DLT. For most operational use cases, filtering/gating via tables as described above is the validated approach.&lt;/P&gt;</description>
      <pubDate>Sun, 16 Nov 2025 17:48:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/conditional-execution-in-dlt-pipeline-based-on-the-output/m-p/139237#M51121</guid>
      <dc:creator>mark_ott</dc:creator>
      <dc:date>2025-11-16T17:48:23Z</dc:date>
    </item>
  </channel>
</rss>

