<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Update datatype of a column in a table in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/update-datatype-of-a-column-in-a-table/m-p/79305#M35729</link>
    <description>&lt;P&gt;I have a table in databricks with fields name: string, id: string, orgId: bigint, metadata: struct, now i want to rename one of the columns and change it type. In my case I want to update orgId to orgIds and change its type to map&amp;lt;string, string&amp;gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One approach I can think of is, add a new column to the table and write script to migrate the data from one column to the other. But there is no way to drop a column.&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I will have to create a new table altogether and do the data migration and then delete the old table.&lt;BR /&gt;Is there a better approach?&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 18 Jul 2024 20:24:11 GMT</pubDate>
    <dc:creator>mr_robot</dc:creator>
    <dc:date>2024-07-18T20:24:11Z</dc:date>
    <item>
      <title>Update datatype of a column in a table</title>
      <link>https://community.databricks.com/t5/data-engineering/update-datatype-of-a-column-in-a-table/m-p/79305#M35729</link>
      <description>&lt;P&gt;I have a table in databricks with fields name: string, id: string, orgId: bigint, metadata: struct, now i want to rename one of the columns and change it type. In my case I want to update orgId to orgIds and change its type to map&amp;lt;string, string&amp;gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One approach I can think of is, add a new column to the table and write script to migrate the data from one column to the other. But there is no way to drop a column.&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I will have to create a new table altogether and do the data migration and then delete the old table.&lt;BR /&gt;Is there a better approach?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Jul 2024 20:24:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-datatype-of-a-column-in-a-table/m-p/79305#M35729</guid>
      <dc:creator>mr_robot</dc:creator>
      <dc:date>2024-07-18T20:24:11Z</dc:date>
    </item>
    <item>
      <title>Re: Update datatype of a column in a table</title>
      <link>https://community.databricks.com/t5/data-engineering/update-datatype-of-a-column-in-a-table/m-p/79349#M35743</link>
      <description>&lt;P&gt;You can use REPLACE COLUMNS.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;ALTER TABLE your_table_name REPLACE COLUMNS (
    name STRING,
    id BIGINT,
    orgIds MAP&amp;lt;STRING, STRING&amp;gt;, 
    metadata STRUCT&amp;lt;...&amp;gt;
);&lt;/LI-CODE&gt;</description>
      <pubDate>Fri, 19 Jul 2024 07:36:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-datatype-of-a-column-in-a-table/m-p/79349#M35743</guid>
      <dc:creator>jacovangelder</dc:creator>
      <dc:date>2024-07-19T07:36:39Z</dc:date>
    </item>
    <item>
      <title>Re: Update datatype of a column in a table</title>
      <link>https://community.databricks.com/t5/data-engineering/update-datatype-of-a-column-in-a-table/m-p/79360#M35744</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/102253"&gt;@jacovangelder&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/112781"&gt;@mr_robot&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;I think &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/102253"&gt;@jacovangelder&lt;/a&gt;&amp;nbsp; approach will not work for delta table. According to documentation:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;REPLACE COLUMNS deletes all the existing columns and adds new columns specified.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;And Delta format don't support column deletion.&lt;/P&gt;&lt;P&gt;So, the only way to do this is to overwrite to the table, as is stated in documentation:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/en/delta/update-schema.html#explicitly-update-schema-to-change-column-type-or-name" target="_blank" rel="noopener"&gt;https://docs.databricks.com/en/delta/update-schema.html#explicitly-update-schema-to-change-column-type-or-name&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jul 2024 08:38:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-datatype-of-a-column-in-a-table/m-p/79360#M35744</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2024-07-19T08:38:00Z</dc:date>
    </item>
    <item>
      <title>Re: Update datatype of a column in a table</title>
      <link>https://community.databricks.com/t5/data-engineering/update-datatype-of-a-column-in-a-table/m-p/79364#M35745</link>
      <description>&lt;P&gt;You're right, it's a metadata only change, but it will evolve the schema.&amp;nbsp;&lt;BR /&gt;If you need to rewrite the files as well, then you'll need to overwrite.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jul 2024 09:01:07 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-datatype-of-a-column-in-a-table/m-p/79364#M35745</guid>
      <dc:creator>jacovangelder</dc:creator>
      <dc:date>2024-07-19T09:01:07Z</dc:date>
    </item>
  </channel>
</rss>

