<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic DLT behaving differently when used with python syntax vs when used with sql syntax to read CDF in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/dlt-behaving-differently-when-used-with-python-syntax-vs-when/m-p/107744#M42913</link>
    <description>&lt;P&gt;I was trying t read CDF data of a table as a DLT materialized view.&lt;/P&gt;&lt;P&gt;It works fine with sql syntax reading all the columns of the source table along with the 3 CDF columns : _change_type,_commit_timestamp,_commit_version:&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;change_table&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_change &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; spark&lt;/SPAN&gt;&lt;SPAN&gt;.sql&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;SELECT&lt;/SPAN&gt; &lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;FROM&lt;/SPAN&gt;&amp;nbsp;table_changes(&amp;lt;source_table_name&amp;gt;,1)&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt;(df_change)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;But when I try the same with python it just reads the columns of the source table leaving out the CDF columns : _change_type,_commit_timestamp,_commit_version:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;change_table&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_change &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'readChangeFeed'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;'True'&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'startingVersion'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;(&amp;lt;source_table_name&amp;gt;&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt;(df_change)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Thu, 30 Jan 2025 09:00:53 GMT</pubDate>
    <dc:creator>Puspak</dc:creator>
    <dc:date>2025-01-30T09:00:53Z</dc:date>
    <item>
      <title>DLT behaving differently when used with python syntax vs when used with sql syntax to read CDF</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-behaving-differently-when-used-with-python-syntax-vs-when/m-p/107744#M42913</link>
      <description>&lt;P&gt;I was trying t read CDF data of a table as a DLT materialized view.&lt;/P&gt;&lt;P&gt;It works fine with sql syntax reading all the columns of the source table along with the 3 CDF columns : _change_type,_commit_timestamp,_commit_version:&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;change_table&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_change &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; spark&lt;/SPAN&gt;&lt;SPAN&gt;.sql&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;SELECT&lt;/SPAN&gt; &lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;FROM&lt;/SPAN&gt;&amp;nbsp;table_changes(&amp;lt;source_table_name&amp;gt;,1)&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt;(df_change)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;But when I try the same with python it just reads the columns of the source table leaving out the CDF columns : _change_type,_commit_timestamp,_commit_version:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;change_table&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_change &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'readChangeFeed'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;'True'&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'startingVersion'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;(&amp;lt;source_table_name&amp;gt;&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt;(df_change)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 30 Jan 2025 09:00:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-behaving-differently-when-used-with-python-syntax-vs-when/m-p/107744#M42913</guid>
      <dc:creator>Puspak</dc:creator>
      <dc:date>2025-01-30T09:00:53Z</dc:date>
    </item>
    <item>
      <title>Re: DLT behaving differently when used with python syntax vs when used with sql syntax to read CDF</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-behaving-differently-when-used-with-python-syntax-vs-when/m-p/107851#M42933</link>
      <description>&lt;P&gt;But the same python code works fine when executed outside of a DLT pipeline. When I run the following in an interactive notebook it returns the source columns + CDF columns, which is logical because I am using the readChangeFeed option while reading.&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'readChangeFeed'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;'True'&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'startingVersion'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;(&amp;lt;source_table_name&amp;gt;&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The problem I stated occurs only when it is executed within a DLT pipeline which is strange.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 30 Jan 2025 16:09:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-behaving-differently-when-used-with-python-syntax-vs-when/m-p/107851#M42933</guid>
      <dc:creator>Puspak</dc:creator>
      <dc:date>2025-01-30T16:09:52Z</dc:date>
    </item>
    <item>
      <title>Re: DLT behaving differently when used with python syntax vs when used with sql syntax to read CDF</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-behaving-differently-when-used-with-python-syntax-vs-when/m-p/139235#M51119</link>
      <description>&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;When accessing Change Data Feed (CDF) data in Delta Live Tables (DLT), the behavior between SQL and Python APIs differs notably regarding CDF metadata columns—&lt;CODE&gt;_change_type&lt;/CODE&gt;,&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;_commit_timestamp&lt;/CODE&gt;, and&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;_commit_version&lt;/CODE&gt;.&lt;/P&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;SQL Approach (using&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;table_changes&lt;/CODE&gt;)&lt;/STRONG&gt;:&lt;BR /&gt;The SQL syntax&lt;BR /&gt;&lt;CODE&gt;SELECT * FROM table_changes(&amp;lt;source_table_name&amp;gt;, 1)&lt;/CODE&gt;&lt;BR /&gt;&lt;STRONG&gt;always returns&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;all columns from the source table&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;plus&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;the three CDF columns.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Python API Approach (using&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.option('readChangeFeed','True')&lt;/CODE&gt;)&lt;/STRONG&gt;:&lt;BR /&gt;When you use&lt;BR /&gt;&lt;CODE&gt;spark.read.option('readChangeFeed','True').option('startingVersion', 1).table(&amp;lt;source_table_name&amp;gt;)&lt;/CODE&gt;&lt;BR /&gt;&lt;STRONG&gt;without explicit CDF columns referenced&lt;/STRONG&gt;, often the returned dataframe omits the three CDF columns, showing only the data columns from the source table. This is a known difference in behavior attributed to the underlying CDF implementation and how schema inference is treated in the Python API.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;How to Access CDF Columns in Python&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;To ensure the CDF columns are included when reading from a Delta table using the Python API, explicitly reference the relevant columns:&lt;/P&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded-lg font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[calc(var(--header-height)+var(--size-xs))]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-lg text-xs font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;df_change &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; spark&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;read&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;option&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;'readChangeFeed'&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;'True'&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;option&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;'startingVersion'&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;1&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;table&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token operator"&gt;&amp;lt;&lt;/SPAN&gt;source_table_name&lt;SPAN class="token token operator"&gt;&amp;gt;&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
df_change &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; df_change&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;select&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"*"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;  &lt;SPAN class="token token"&gt;# In some environments, this reveals the CDF columns&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;# Alternatively, explicitly select the columns if "*" doesn't work:&lt;/SPAN&gt;
df_change &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; df_change&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;select&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;
    &lt;SPAN class="token token"&gt;"*"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt;
    &lt;SPAN class="token token"&gt;"_change_type"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt;
    &lt;SPAN class="token token"&gt;"_commit_timestamp"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt;
    &lt;SPAN class="token token"&gt;"_commit_version"&lt;/SPAN&gt;
&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;If the columns still do not appear after using&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.select("*")&lt;/CODE&gt;, use&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.select()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;naming the columns directly as above. Ensure your environment is running on Databricks Runtime 9.0 or newer and that CDF is properly enabled for the Delta table.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Workaround and Best Practices&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Use the SQL interface (&lt;CODE&gt;table_changes&lt;/CODE&gt;) if you need a direct, complete result including CDF columns in DLT materialized views.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;For Python, verify the dataframe schema with&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;df_change.printSchema()&lt;/CODE&gt;—if CDF columns are missing, switch to explicit&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.select()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;or fallback to using a SQL query within Python (&lt;CODE&gt;spark.sql("SELECT * FROM table_changes(...)")&lt;/CODE&gt;).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This behavior is subject to Databricks and Delta Lake updates; always refer to the current documentation for feature changes.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Quick Table Comparison&lt;/H2&gt;
&lt;DIV class="group relative"&gt;
&lt;DIV class="w-full overflow-x-auto md:max-w-[90vw] border-subtlest ring-subtlest divide-subtlest bg-transparent"&gt;
&lt;TABLE class="border-subtler my-[1em] w-full table-auto border-separate border-spacing-0 border-l border-t"&gt;
&lt;THEAD class="bg-subtler"&gt;
&lt;TR&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Method&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Data Columns&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;CDF Columns&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;CODE&gt;table_changes&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;(SQL)&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Yes&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Yes&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;CODE&gt;spark.read&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;option (Python API)&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Yes&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Usually No&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;DIV class="bg-base border-subtler shadow-subtle pointer-coarse:opacity-100 right-xs absolute bottom-0 flex rounded-lg border opacity-0 transition-opacity group-hover:opacity-100 [&amp;amp;&amp;gt;*:not(:first-child)]:border-subtle [&amp;amp;&amp;gt;*:not(:first-child)]:border-l"&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;HR /&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;To reliably capture CDF columns in Python, stick to SQL-based approaches or check your DataFrame schema and select columns explicitly. This is a well-documented limitation and workaround in Databricks.&lt;/P&gt;</description>
      <pubDate>Sun, 16 Nov 2025 17:45:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-behaving-differently-when-used-with-python-syntax-vs-when/m-p/139235#M51119</guid>
      <dc:creator>mark_ott</dc:creator>
      <dc:date>2025-11-16T17:45:36Z</dc:date>
    </item>
  </channel>
</rss>

