<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delta table partition directories when column mapping is enabled in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31816#M23180</link>
    <description>&lt;P&gt;Is there at least an explanation of why this is happening, and does it affect performance?&lt;/P&gt;</description>
    <pubDate>Tue, 04 Apr 2023 11:51:14 GMT</pubDate>
    <dc:creator>AleksAngelova</dc:creator>
    <dc:date>2023-04-04T11:51:14Z</dc:date>
    <item>
      <title>Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31811#M23175</link>
      <description>&lt;P&gt;I recently created a table on a cluster in Azure running Databricks Runtime 11.1. The table is partitioned by a "date" column. I enabled column mapping like this:&lt;/P&gt;&lt;P&gt;ALTER TABLE {schema}.{table_name} SET TBLPROPERTIES('delta.columnMapping.mode' = 'name', 'delta.minReaderVersion' = '2', 'delta.minWriterVersion' = '5')&lt;/P&gt;&lt;P&gt;Before I enabled column mapping, the directory containing the Delta table had the expected partition directories: "date=2022-08-18", "date=2022-08-19", etc.&lt;/P&gt;&lt;P&gt;Since enabling column mapping, every MERGE into that table creates new directories with short names like "5k", "Rw", "Yd", etc. After I VACUUM the table, most of those directories are empty, but the empty directories are not removed. We merge into this table frequently, so the directory containing the Delta table ends up with a large number of empty directories.&lt;/P&gt;&lt;P&gt;I have 2 questions:&lt;/P&gt;&lt;P&gt;Is it expected that these directories will be created with names other than the expected "date=2022-08-18"?&lt;/P&gt;&lt;P&gt;Is there a way to make VACUUM remove the empty directories?&lt;/P&gt;&lt;P&gt;I could write code to walk through the Delta table directory and remove the empty directories, but I would rather not touch those directories! That's for Databricks to manage, and I don't want to get in its way.&lt;/P&gt;&lt;P&gt;Thanks in advance for any information you can provide.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Sep 2022 18:20:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31811#M23175</guid>
      <dc:creator>Gary_Irick</dc:creator>
      <dc:date>2022-09-13T18:20:21Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31813#M23177</link>
      <description>&lt;P&gt;Hi @Gary Irick,&lt;/P&gt;&lt;P&gt;Does @Debayan Mukherjee's response answer your question? If so, would you be happy to mark it as best so that other members can find the solution more quickly?&lt;/P&gt;&lt;P&gt;We'd love to hear from you.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Tue, 27 Sep 2022 12:11:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31813#M23177</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-09-27T12:11:31Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31814#M23178</link>
      <description>&lt;P&gt;The same is happening to me. Since enabling column mapping, new records are stored in folders with random names instead of in their partition folders.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Dec 2022 17:26:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31814#M23178</guid>
      <dc:creator>gongasxavi</dc:creator>
      <dc:date>2022-12-16T17:26:06Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31815#M23179</link>
      <description>&lt;P&gt;The same issue has been happening to me since enabling column mapping. Files are stored in folders with random 2-character names (0P, 3h, BB) rather than in folders named after the value of the load_date partition column (load_date=2023-01-01, load_date=2023-01-02).&lt;/P&gt;&lt;P&gt;I have tried Databricks Runtime 12.0 but get the same result when performing an append or merge operation. Has anyone been able to resolve this yet?&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jan 2023 12:00:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31815#M23179</guid>
      <dc:creator>Pete_Cotton</dc:creator>
      <dc:date>2023-01-03T12:00:18Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31816#M23180</link>
      <description>&lt;P&gt;Is there at least an explanation of why this is happening, and does it affect performance?&lt;/P&gt;</description>
      <pubDate>Tue, 04 Apr 2023 11:51:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31816#M23180</guid>
      <dc:creator>AleksAngelova</dc:creator>
      <dc:date>2023-04-04T11:51:14Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31812#M23176</link>
      <description>&lt;P&gt;Hi, for removing files or directories using VACUUM, you can refer to &lt;A href="https://docs.databricks.com/delta/delta-utility.html#remove-files-no-longer-referenced-by-a-delta-table" alt="https://docs.databricks.com/delta/delta-utility.html#remove-files-no-longer-referenced-by-a-delta-table" target="_blank"&gt;https://docs.databricks.com/delta/delta-utility.html#remove-files-no-longer-referenced-by-a-delta-table&lt;/A&gt;&lt;/P&gt;&lt;P&gt;As far as I know, the dates are the default naming syntax, which can be renamed.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 04:55:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/31812#M23176</guid>
      <dc:creator>Debayan</dc:creator>
      <dc:date>2022-09-16T04:55:52Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/37524#M26381</link>
      <description>&lt;P&gt;I have seen the same behavior and am waiting for an explanation.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jul 2023 19:37:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/37524#M26381</guid>
      <dc:creator>nan</dc:creator>
      <dc:date>2023-07-12T19:37:17Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/37527#M26383</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/63510"&gt;@Gary_Irick&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/27080"&gt;@Pete_Cotton&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;This is expected.&amp;nbsp;&lt;SPAN&gt;Enabling column mapping enables random file prefixes, which removes the ability to explore data using Hive-style partitioning.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;This is also documented here -&amp;nbsp;&lt;A href="https://docs.databricks.com/delta/delta-column-mapping.html#:~:text=Enabling%20column%20mapping%20also%20enables%20random%20file%20prefixes%2C%20which%20removes%20the%20ability%20to%20explore%20data%20using%20Hive%2Dstyle%20partitioning" target="_blank"&gt;https://docs.databricks.com/delta/delta-column-mapping.html#:~:text=Enabling%20column%20mapping%20also%20enables%20random%20file%20prefixes%2C%20which%20removes%20the%20ability%20to%20explore%20data%20using%20Hive%2Dstyle%20partitioning&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jul 2023 20:52:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/37527#M26383</guid>
      <dc:creator>Tharun-Kumar</dc:creator>
      <dc:date>2023-07-12T20:52:21Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/53147#M29697</link>
      <description>&lt;P&gt;The same is happening to me, and it is very frustrating because it irreversibly breaks our process.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2023 16:48:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/53147#M29697</guid>
      <dc:creator>vascoa</dc:creator>
      <dc:date>2023-11-20T16:48:11Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/81060#M36213</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;I have a few questions about&amp;nbsp;&lt;STRONG&gt;Directory Names with Column Mapping&lt;/STRONG&gt;. I have a Delta table on ADLS and I am trying to read it, but I am getting the error below. How can we read Delta tables that have column mapping enabled with PySpark?&lt;/P&gt;&lt;P&gt;Can you please help?&lt;/P&gt;&lt;P&gt;&lt;EM&gt;A partition path fragment should be the form like `part1=foo/part2=bar`. The partition path: {{delta table name}}&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;Edit:&lt;/P&gt;&lt;P&gt;I was able to read the tables as is. Maybe it was an issue with the Delta version.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Nikhil&lt;/P&gt;</description>
      <pubDate>Tue, 30 Jul 2024 10:13:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/81060#M36213</guid>
      <dc:creator>talenik</dc:creator>
      <dc:date>2024-07-30T10:13:52Z</dc:date>
    </item>
    <item>
      <title>Re: Delta table partition directories when column mapping is enabled</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/144460#M52325</link>
      <description>&lt;P&gt;Still the same behaviour when&amp;nbsp;&lt;STRONG&gt;column mapping&lt;/STRONG&gt; is enabled.&lt;/P&gt;</description>
      <pubDate>Mon, 19 Jan 2026 16:59:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-partition-directories-when-column-mapping-is-enabled/m-p/144460#M52325</guid>
      <dc:creator>Narsikakunuri</dc:creator>
      <dc:date>2026-01-19T16:59:05Z</dc:date>
    </item>
  </channel>
</rss>

