<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Issue with Auto Liquid clustering in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/issue-with-auto-liquid-clustering/m-p/131841#M49257</link>
    <description>&lt;UL&gt;&lt;LI&gt;I wrote data to a table with clusterByAuto set to true.&lt;/LI&gt;&lt;LI&gt;However, no clustering keys are selected automatically when I run &lt;STRONG&gt;desc detail&lt;/STRONG&gt; on the table.&lt;BR /&gt;&lt;BR /&gt;Why are clustering columns not being selected automatically?&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Repro steps:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;STRONG&gt;Create a DataFrame.&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Write it to a Delta table with clusterByAuto set to true.&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Run desc detail on the table. No clustering columns are listed.&lt;/STRONG&gt;&lt;/LI&gt;&lt;/OL&gt;</description>
    <pubDate>Sat, 13 Sep 2025 07:58:23 GMT</pubDate>
    <dc:creator>RevanthV</dc:creator>
    <dc:date>2025-09-13T07:58:23Z</dc:date>
    <item>
      <title>Issue with Auto Liquid clustering</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-auto-liquid-clustering/m-p/131841#M49257</link>
      <description>&lt;UL&gt;&lt;LI&gt;I wrote data to a table with clusterByAuto set to true.&lt;/LI&gt;&lt;LI&gt;However, no clustering keys are selected automatically when I run &lt;STRONG&gt;desc detail&lt;/STRONG&gt; on the table.&lt;BR /&gt;&lt;BR /&gt;Why are clustering columns not being selected automatically?&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Repro steps:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;STRONG&gt;Create a DataFrame.&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Write it to a Delta table with clusterByAuto set to true.&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Run desc detail on the table. No clustering columns are listed.&lt;/STRONG&gt;&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Sat, 13 Sep 2025 07:58:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-auto-liquid-clustering/m-p/131841#M49257</guid>
      <dc:creator>RevanthV</dc:creator>
      <dc:date>2025-09-13T07:58:23Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with Auto Liquid clustering</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-auto-liquid-clustering/m-p/131845#M49261</link>
      <description>&lt;P&gt;Hello Revanth,&lt;/P&gt;&lt;P&gt;There are a few possible reasons:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The table is too small to benefit from liquid clustering, or it already has a good clustering scheme. Could you tell me the size of the table?&lt;/LI&gt;&lt;LI&gt;The table is not queried frequently on a particular column or set of columns, i.e., there are not enough scans on the table for it to benefit from clustering.&lt;/LI&gt;&lt;LI&gt;Automatic liquid clustering relies on predictive optimization (PO) running in the background. The clustering columns are selected based on internal thresholds, so we as customers do not need to manage that.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Once the table's columns are scanned frequently, you should see clustering columns being selected automatically.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;You can go through the doc for a better understanding&lt;/STRONG&gt;:&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/delta/clustering#:~:text=If%20a%20key%20was%20not%20selected%20by%20automatic%20liquid%20clustering%2C%20the%20reason%20can%20be%3A" target="_blank"&gt;https://docs.databricks.com/aws/en/delta/clustering#:~:text=If%20a%20key%20was%20not%20selected%20by%20automatic%20liquid%20clustering%2C%20the%20reason%20can%20be%3A&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Let me know if you have any further questions.&lt;/P&gt;</description>
      <pubDate>Sat, 13 Sep 2025 08:42:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-auto-liquid-clustering/m-p/131845#M49261</guid>
      <dc:creator>K_Anudeep</dc:creator>
      <dc:date>2025-09-13T08:42:55Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with Auto Liquid clustering</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-auto-liquid-clustering/m-p/131846#M49262</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/184430"&gt;@RevanthV&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;As&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/60098"&gt;@K_Anudeep&lt;/a&gt;&amp;nbsp;correctly suggested, it could be that your table is too small to benefit from liquid clustering.&lt;BR /&gt;Another possibility is that you're using a runtime lower than 15.4 LTS.&lt;/P&gt;</description>
      <pubDate>Sat, 13 Sep 2025 09:04:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-auto-liquid-clustering/m-p/131846#M49262</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-09-13T09:04:52Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with Auto Liquid clustering</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-auto-liquid-clustering/m-p/131855#M49266</link>
      <description>&lt;P&gt;Thanks a lot&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/60098"&gt;@K_Anudeep&lt;/a&gt;. My table was still small, and I guess that was the reason. I have now written around 1 million records and have been running frequent scans on one column for the last hour, and I can now see that same column selected as a clustering column.&lt;/P&gt;&lt;P&gt;Thanks a lot for your help.&lt;/P&gt;</description>
      <pubDate>Sat, 13 Sep 2025 11:51:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-auto-liquid-clustering/m-p/131855#M49266</guid>
      <dc:creator>RevanthV</dc:creator>
      <dc:date>2025-09-13T11:51:51Z</dc:date>
    </item>
  </channel>
</rss>

