<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Unable to configure clustering on DLT tables in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/unable-to-configure-clustering-on-dlt-tables/m-p/134574#M50154</link>
    <description>&lt;P&gt;Hi Team&lt;/P&gt;&lt;P&gt;I have a DLT pipeline with the `cluster_by` property configured for all my tables. The code looks something like this:&lt;/P&gt;&lt;PRE&gt;@dlt.table(
    name="flows",
    cluster_by=["from"]
)
def flows():
    &amp;lt;LOGIC&amp;gt;&lt;/PRE&gt;&lt;P&gt;It was all working fine, but after a couple of days the queries were taking forever, and when I checked my DLT tables I couldn't find any clustering properties configured. I tried setting `cluster_by_auto=True`; that setting was applied properly, but the cluster columns are still not taken into consideration.&lt;/P&gt;&lt;P&gt;Is this a bug in the latest release, or is there a way to solve this?&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
    <pubDate>Fri, 10 Oct 2025 19:04:57 GMT</pubDate>
    <dc:creator>Chris_N</dc:creator>
    <dc:date>2025-10-10T19:04:57Z</dc:date>
    <item>
      <title>Unable to configure clustering on DLT tables</title>
      <link>https://community.databricks.com/t5/data-engineering/unable-to-configure-clustering-on-dlt-tables/m-p/134574#M50154</link>
      <description>&lt;P&gt;Hi Team&lt;/P&gt;&lt;P&gt;I have a DLT pipeline with the `cluster_by` property configured for all my tables. The code looks something like this:&lt;/P&gt;&lt;PRE&gt;@dlt.table(
    name="flows",
    cluster_by=["from"]
)
def flows():
    &amp;lt;LOGIC&amp;gt;&lt;/PRE&gt;&lt;P&gt;It was all working fine, but after a couple of days the queries were taking forever, and when I checked my DLT tables I couldn't find any clustering properties configured. I tried setting `cluster_by_auto=True`; that setting was applied properly, but the cluster columns are still not taken into consideration.&lt;/P&gt;&lt;P&gt;Is this a bug in the latest release, or is there a way to solve this?&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
      <pubDate>Fri, 10 Oct 2025 19:04:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/unable-to-configure-clustering-on-dlt-tables/m-p/134574#M50154</guid>
      <dc:creator>Chris_N</dc:creator>
      <dc:date>2025-10-10T19:04:57Z</dc:date>
    </item>
    <item>
      <title>Re: Unable to configure clustering on DLT tables</title>
      <link>https://community.databricks.com/t5/data-engineering/unable-to-configure-clustering-on-dlt-tables/m-p/134584#M50157</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/190725"&gt;@Chris_N&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You mentioned: "I couldn't find any cluster properties configured."&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;If the clustering properties existed and were later changed, you can use the &lt;A href="https://docs.databricks.com/aws/en/sql/language-manual/delta-describe-history" target="_self"&gt;delta history&lt;/A&gt; command to check whether someone changed the &lt;A href="https://docs.databricks.com/aws/en/delta/history" target="_self"&gt;clustering&lt;/A&gt; information.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;It is possible there were changes in the data volume/configs that could have led to the change in performance.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;On your next statement, "I tried setting 'cluster_by_auto=True' and it was properly configured but the cluster columns are not taken into consideration": how did you determine that the cluster columns are not being considered?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;After &lt;A href="https://docs.databricks.com/aws/en/delta/clustering#enable-liquid-clustering" target="_self"&gt;liquid clustering is enabled&lt;/A&gt;, run&amp;nbsp;&lt;CODE&gt;OPTIMIZE&lt;/CODE&gt;&amp;nbsp;jobs as usual to incrementally cluster data. See&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/delta/clustering#optimize" target="_blank"&gt;How to trigger clustering&lt;/A&gt;.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;Liquid clustering is &lt;A href="https://docs.databricks.com/aws/en/delta/clustering#how-to-trigger-clustering" target="_self"&gt;incremental&lt;/A&gt;, meaning that data is only rewritten as necessary to accommodate data that needs to be clustered. Data files with clustering keys that do not match the data to be clustered are not rewritten.&lt;/P&gt;
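&lt;P&gt;As a rough sketch of the two checks above, the snippet below only builds the SQL statements for one table. The helper name clustering_statements and the table name main.sales.flows are placeholders, and it assumes you would pass each string to spark.sql in a Databricks notebook.&lt;/P&gt;
&lt;PRE&gt;
```python
def clustering_statements(table_name):
    """Return the SQL statements discussed above for one table.

    table_name is a placeholder for your own fully qualified table name.
    """
    return [
        # Lists past operations on the table, including clustering changes.
        f"DESCRIBE HISTORY {table_name}",
        # Incrementally clusters data once liquid clustering is enabled.
        f"OPTIMIZE {table_name}",
    ]


if __name__ == "__main__":
    for statement in clustering_statements("main.sales.flows"):
        # In a Databricks notebook (assumption): spark.sql(statement).show()
        print(statement)
```
&lt;/PRE&gt;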
&lt;P&gt;Please share any additional information so that I can suggest next steps. If you have a support subscription, you can also raise a support ticket to have this investigated.&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Nandini&lt;/P&gt;
</description>
      <pubDate>Fri, 10 Oct 2025 21:40:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/unable-to-configure-clustering-on-dlt-tables/m-p/134584#M50157</guid>
      <dc:creator>NandiniN</dc:creator>
      <dc:date>2025-10-10T21:40:45Z</dc:date>
    </item>
    <item>
      <title>Re: Unable to configure clustering on DLT tables</title>
      <link>https://community.databricks.com/t5/data-engineering/unable-to-configure-clustering-on-dlt-tables/m-p/134629#M50158</link>
      <description>&lt;P&gt;There were no changes to the clustering configuration of the DLT table (materialized view).&lt;/P&gt;&lt;P&gt;I am checking whether the clustering fields are properly configured using the SQL command `DESCRIBE EXTENDED &amp;lt;table&amp;gt;`. When I run this on a normal table with clustering configured, I can see the `clusteringColumns` property under the TableProperties section.&lt;/P&gt;&lt;P&gt;The same property does not exist for the DLT tables I created. This issue exists for all my DLT tables across different pipelines.&lt;/P&gt;&lt;P&gt;To reproduce the issue, please create a DLT table using the syntax below and check whether the clusteringColumns property is set on the table:&lt;/P&gt;&lt;PRE&gt;@dlt.table(
    name="flows",
    cluster_by=["from"]
)
def flows():
    &amp;lt;LOGIC&amp;gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 11 Oct 2025 10:20:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/unable-to-configure-clustering-on-dlt-tables/m-p/134629#M50158</guid>
      <dc:creator>Chris_N</dc:creator>
      <dc:date>2025-10-11T10:20:42Z</dc:date>
    </item>
    <item>
      <title>Re: Unable to configure clustering on DLT tables</title>
      <link>https://community.databricks.com/t5/data-engineering/unable-to-configure-clustering-on-dlt-tables/m-p/134730#M50177</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/190725"&gt;@Chris_N&lt;/a&gt;, maybe `cluster_by` for DLT tables is not persisted in the Delta table metadata properties, because it is a hint for DLT pipelines, and that is why you cannot see it.&lt;/P&gt;&lt;P&gt;As others have noted, if your tables are not managed tables and you are not using predictive optimization, you have to OPTIMIZE/Z-ORDER the tables regularly as a separate workflow/job to make your queries run faster.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; Over time, as new data is appended and files are rewritten, the clustering effect can diminish unless you regularly OPTIMIZE/Z-ORDER the table.&lt;/P&gt;&lt;P&gt;Br&lt;/P&gt;</description>
      <pubDate>Mon, 13 Oct 2025 11:36:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/unable-to-configure-clustering-on-dlt-tables/m-p/134730#M50177</guid>
      <dc:creator>saurabh18cs</dc:creator>
      <dc:date>2025-10-13T11:36:37Z</dc:date>
    </item>
  </channel>
</rss>

