<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Not able to insert into identity column through spark in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102282#M41055</link>
    <description>&lt;P&gt;Hey &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/123857"&gt;@Shivaprasad&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You're welcome :). Please consider marking my last answer as the solution so others can find it easily.&lt;/P&gt;</description>
    <pubDate>Mon, 16 Dec 2024 17:42:58 GMT</pubDate>
    <dc:creator>PiotrMi</dc:creator>
    <dc:date>2024-12-16T17:42:58Z</dc:date>
    <item>
      <title>Not able to insert into identity column through spark</title>
      <link>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102113#M40969</link>
      <description>&lt;P&gt;I have a Delta table with an identity column and am not able to insert data into it using Spark. I am using 15.4 LTS. Any idea what needs to be done?&lt;/P&gt;&lt;P&gt;Table name: account&lt;/P&gt;&lt;P&gt;Column definition: account_dimension_id &lt;SPAN&gt;BIGINT&lt;/SPAN&gt; &lt;SPAN&gt;GENERATED&lt;/SPAN&gt; &lt;SPAN&gt;BY&lt;/SPAN&gt; &lt;SPAN&gt;DEFAULT&lt;/SPAN&gt; &lt;SPAN&gt;AS&lt;/SPAN&gt; &lt;SPAN&gt;IDENTITY&lt;/SPAN&gt;,&lt;/P&gt;&lt;P&gt;df = spark.read.format(&lt;SPAN&gt;"csv"&lt;/SPAN&gt;).load(&lt;SPAN&gt;"abfss://databricks-storage@sa14127e1dv0101.dfs.core.windows.net/catalogs/data-relationship-management-drm/onprem_acct_dimension.csv"&lt;/SPAN&gt;)&lt;SPAN&gt;;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;df.write.format("delta").mode("append").insertInto("account")&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I am able to insert through SQL but not through Spark.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;The file has 7 columns, but the table has 8 columns including the identity column.&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Error:&amp;nbsp;[&lt;A class="" title="https://docs.microsoft.com/azure/databricks/error-messages/error-classes#delta_insert_column_arity_mismatch" href="https://docs.microsoft.com/azure/databricks/error-messages/error-classes#delta_insert_column_arity_mismatch" target="_blank" rel="noreferrer noopener"&gt;DELTA_INSERT_COLUMN_ARITY_MISMATCH&lt;/A&gt;] Cannot write to 'data_relationship_managment_pdz.data_relationship_managment_schema.account', not enough data columns; target table has 8 column(s) but the inserted data has 7 column(s) SQLSTATE: 42802&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 13 Dec 2024 17:48:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102113#M40969</guid>
      <dc:creator>Shivaprasad</dc:creator>
      <dc:date>2024-12-13T17:48:10Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to insert into identity column through spark</title>
      <link>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102117#M40971</link>
      <description>&lt;P&gt;The error states that a column is missing from the data you are adding. If you print the df and do a select on the table, do you see that all the columns match?&lt;/P&gt;</description>
      <pubDate>Fri, 13 Dec 2024 20:31:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102117#M40971</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2024-12-13T20:31:45Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to insert into identity column through spark</title>
      <link>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102121#M40972</link>
      <description>&lt;P&gt;The first column in the table is the identity column, and I expect it to be auto-incremented, so it is not in my DataFrame. I am able to insert data through SQL and the identity column gets auto-incremented, but not when I try to insert through Spark.&lt;/P&gt;</description>
      <pubDate>Fri, 13 Dec 2024 21:08:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102121#M40972</guid>
      <dc:creator>Shivaprasad</dc:creator>
      <dc:date>2024-12-13T21:08:30Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to insert into identity column through spark</title>
      <link>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102139#M40981</link>
      <description>&lt;P&gt;Hey &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/123857"&gt;@Shivaprasad&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Could you try switching the last part from "insertInto" to "saveAsTable"?&lt;/P&gt;&lt;P&gt;df.write.mode("append").format("delta").saveAsTable(&lt;SPAN&gt;"account")&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 14 Dec 2024 17:43:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102139#M40981</guid>
      <dc:creator>PiotrMi</dc:creator>
      <dc:date>2024-12-14T17:43:17Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to insert into identity column through spark</title>
      <link>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102179#M41004</link>
      <description>&lt;P&gt;Thanks. It is still giving a data mismatch error:&lt;/P&gt;&lt;P&gt;Table: account&lt;BR /&gt;account_dimension_id BIGINT GENERATED BY DEFAULT AS IDENTITY,&lt;/P&gt;&lt;P&gt;df = spark.read.format("csv").option("header", "true").load("abfss://databricks-storage@sa14127e1dv0101.dfs.core.windows.net/catalogs/data-relationship-management-drm/onprem_acct_dimension.csv")&lt;BR /&gt;for col in df.columns:&lt;BR /&gt;df = df.withColumnRenamed(col, col.lower())&lt;BR /&gt;display(df)&lt;BR /&gt;df.write.format("delta").mode("append").option("mergeSchema", "true").saveAsTable("account")&lt;/P&gt;&lt;P&gt;Error:&lt;BR /&gt;[DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'insert_date' and 'insert_date' SQLSTATE: 22005&lt;BR /&gt;File &amp;lt;command-2201570317935975&amp;gt;, line 8&lt;BR /&gt;&amp;nbsp;df = df.withColumnRenamed(col, col.lower())&lt;/P&gt;</description>
      <pubDate>Mon, 16 Dec 2024 02:18:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102179#M41004</guid>
      <dc:creator>Shivaprasad</dc:creator>
      <dc:date>2024-12-16T02:18:18Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to insert into identity column through spark</title>
      <link>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102211#M41021</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/123857"&gt;@Shivaprasad&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I think the error is caused by different data types for the column&amp;nbsp;&lt;SPAN&gt;insert_date. It is probably read as a string from the CSV file. Please check both data types: the one in the table and the one in the DataFrame created from the file. Cast the column if needed; the error should then be fixed.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 16 Dec 2024 10:31:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102211#M41021</guid>
      <dc:creator>PiotrMi</dc:creator>
      <dc:date>2024-12-16T10:31:48Z</dc:date>
    </item>
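The advice in the reply above (compare both data types, cast where they differ, then append by column name) might look roughly like the sketch below. The helper names `plan_casts` and `append_with_casts` are made up for illustration and are not part of Spark or Delta. The sketch assumes a plain `cast` is enough for the date format in the file, which may not hold for unusual formats.

```python
def plan_casts(df_types, table_types):
    """Return {column: target_type} for DataFrame columns whose type differs from the table's.

    Both arguments map column names to Spark type strings, e.g. {"insert_date": "string"}.
    Columns that exist only in the table (such as an identity column) are ignored.
    """
    target = {name.lower(): dtype for name, dtype in table_types.items()}
    casts = {}
    for name, dtype in df_types.items():
        key = name.lower()
        if key not in target:
            raise ValueError(f"column {name!r} is not in the target table")
        if dtype != target[key]:
            casts[name] = target[key]
    return casts


def append_with_casts(spark, df, table_name):
    """Cast mismatched columns, then append to the table by column name (needs pyspark)."""
    # Imported lazily so plan_casts can be used and tested without a Spark installation.
    from pyspark.sql import functions as F

    table_types = dict(spark.table(table_name).dtypes)  # (name, type) pairs from the table
    casts = plan_casts(dict(df.dtypes), table_types)
    for name, dtype in casts.items():
        df = df.withColumn(name, F.col(name).cast(dtype))
    # saveAsTable in append mode matches columns by name, so the identity
    # column can be left out of the DataFrame and generated by the table.
    df.write.format("delta").mode("append").saveAsTable(table_name)
```

For example, assuming the table declares insert_date as a timestamp (the thread does not state the actual type), `plan_casts({"insert_date": "string"}, {"account_dimension_id": "bigint", "insert_date": "timestamp"})` returns `{"insert_date": "timestamp"}`, and `append_with_casts(spark, df, "account")` would apply that cast before writing.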
    <item>
      <title>Re: Not able to insert into identity column through spark</title>
      <link>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102281#M41054</link>
      <description>&lt;P&gt;Thanks, that resolved the issue.&lt;/P&gt;</description>
      <pubDate>Mon, 16 Dec 2024 17:13:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102281#M41054</guid>
      <dc:creator>Shivaprasad</dc:creator>
      <dc:date>2024-12-16T17:13:53Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to insert into identity column through spark</title>
      <link>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102282#M41055</link>
      <description>&lt;P&gt;Hey &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/123857"&gt;@Shivaprasad&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You're welcome :). Please consider marking my last answer as the solution so others can find it easily.&lt;/P&gt;</description>
      <pubDate>Mon, 16 Dec 2024 17:42:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/not-able-to-insert-into-identity-column-through-spark/m-p/102282#M41055</guid>
      <dc:creator>PiotrMi</dc:creator>
      <dc:date>2024-12-16T17:42:58Z</dc:date>
    </item>
  </channel>
</rss>

