<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Update DeltaTable on column type ArrayType(): add element to array in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/update-deltatable-on-column-type-arraytype-add-element-to-array/m-p/66622#M33170</link>
    <description>&lt;P&gt;Does this mean that to add an element to an array, we first have to read all the elements of the array, then add the new one, then save the new array?&lt;/P&gt;</description>
    <pubDate>Thu, 18 Apr 2024 18:21:56 GMT</pubDate>
    <dc:creator>SerhiiShyika</dc:creator>
    <dc:date>2024-04-18T18:21:56Z</dc:date>
    <item>
      <title>Update DeltaTable on column type ArrayType(): add element to array</title>
      <link>https://community.databricks.com/t5/data-engineering/update-deltatable-on-column-type-arraytype-add-element-to-array/m-p/56503#M30579</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I need to perform an update on a Delta table that adds elements to a column of type ArrayType(StringType()), which is initialized empty.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Before Update&lt;/STRONG&gt;&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;Col_1 StringType()&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;Col_2 StringType()&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;Col_3 ArrayType()&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;Val&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;Val&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;[ ]&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;STRONG&gt;After Update&lt;/STRONG&gt;&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;Col_1 StringType()&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;Col_2 StringType()&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;Col_3 ArrayType()&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;Val&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;Val&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;[ 'append value' ]&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;I'm trying the Update syntax, but I receive errors within the "&lt;EM&gt;set&lt;/EM&gt;" statement because the updated value's type (StringType) is not consistent with the target type, ArrayType(StringType()):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;schema = StructType([ 
StructField("Load_id", StringType(), True), 
StructField("Task_id", StringType(), True), 
StructField("Task_output", StringType(), True), 
StructField("Task_output_detail", ArrayType(StringType()), True), 
StructField("Execution_ts", TimestampType(), True), 
StructField("Task_status", StringType(), True)]) 

# some code to initialize and materialize the Delta table

Task_output_detail = "Invalid value" 
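# A minimal sketch, not part of the original post: one possible way to express the append.
# Assumption: string values in the `set` dict are parsed by Delta as Spark SQL expressions,
# so an append can be written as concat(existing_array, array(new_value)) and evaluated
# by the engine, without the caller reading the array first.
# `array_append_sql` and `append_expr` are hypothetical names used only for illustration.
def array_append_sql(column_name, value):
    # Escape backslashes and single quotes (assumes Spark's default string-literal escaping).
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "concat({0}, array('{1}'))".format(column_name, escaped)

append_expr = array_append_sql("Task_output_detail", "Invalid value")
# append_expr could then be tried in place of the plain string in the `set` dict below:
#   "Task_output_detail": append_expr
# Note: concat returns NULL if the existing array is NULL rather than empty.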
Log_table = DeltaTable.forPath(spark, path) 
Log_table.update( 
condition = (col("Load_id")== Load_id) &amp;amp; (col("Task_id")== Task_id), 
set = { "Task_output": lit(Task_output), "Task_output_detail": Task_output_detail, "Execution_ts": lit(Execution_ts), "Task_status": lit('Closed')})&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Does anyone know a "smart" solution or the correct syntax to achieve that? I want to avoid deleting the row and creating a new one, since I have to perform multiple updates / appends.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jan 2024 12:31:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-deltatable-on-column-type-arraytype-add-element-to-array/m-p/56503#M30579</guid>
      <dc:creator>carlosancassani</dc:creator>
      <dc:date>2024-01-05T12:31:16Z</dc:date>
    </item>
    <item>
      <title>Re: Update DeltaTable on column type ArrayType(): add element to array</title>
      <link>https://community.databricks.com/t5/data-engineering/update-deltatable-on-column-type-arraytype-add-element-to-array/m-p/66622#M33170</link>
      <description>&lt;P&gt;Does this mean that to add an element to an array, we first have to read all the elements of the array, then add the new one, then save the new array?&lt;/P&gt;</description>
      <pubDate>Thu, 18 Apr 2024 18:21:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-deltatable-on-column-type-arraytype-add-element-to-array/m-p/66622#M33170</guid>
      <dc:creator>SerhiiShyika</dc:creator>
      <dc:date>2024-04-18T18:21:56Z</dc:date>
    </item>
  </channel>
</rss>

