<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Surrogate key using identity column. in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/surrogate-key-using-identity-column/m-p/7774#M3548</link>
    <description>&lt;P&gt;I want to create a surrogate in the delta table&amp;nbsp;And i used&amp;nbsp;the identity column id-Generated as Default&lt;/P&gt;&lt;P&gt;Can i insert rows into the delta table using only spark.sql like Insert query ? or i can also use write delta format options? If i use the df.write it says mismatched columns ? Can someone help on how to implement this ?&lt;/P&gt;</description>
    <pubDate>Tue, 14 Mar 2023 00:04:23 GMT</pubDate>
    <dc:creator>Dataengineer_mm</dc:creator>
    <dc:date>2023-03-14T00:04:23Z</dc:date>
    <item>
      <title>Surrogate key using identity column.</title>
      <link>https://community.databricks.com/t5/data-engineering/surrogate-key-using-identity-column/m-p/7774#M3548</link>
      <description>&lt;P&gt;I want to create a surrogate in the delta table&amp;nbsp;And i used&amp;nbsp;the identity column id-Generated as Default&lt;/P&gt;&lt;P&gt;Can i insert rows into the delta table using only spark.sql like Insert query ? or i can also use write delta format options? If i use the df.write it says mismatched columns ? Can someone help on how to implement this ?&lt;/P&gt;</description>
      <pubDate>Tue, 14 Mar 2023 00:04:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/surrogate-key-using-identity-column/m-p/7774#M3548</guid>
      <dc:creator>Dataengineer_mm</dc:creator>
      <dc:date>2023-03-14T00:04:23Z</dc:date>
    </item>
    <item>
      <title>Re: Surrogate key using identity column.</title>
      <link>https://community.databricks.com/t5/data-engineering/surrogate-key-using-identity-column/m-p/7775#M3549</link>
      <description>&lt;P&gt;Hello @Menaka Murugesan​&amp;nbsp;,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you are using the identity column, I believe you would have created the table as below, (starts with value 1 and step 1)&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;CREATE TABLE my_table (
  id INT IDENTITY (1, 1) PRIMARY KEY,
  value STRING
)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;You can insert values in the identity table as&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;INSERT INTO my_table (value) VALUES ('Value 1'), ('Value 2'), ('Value 3')&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The &lt;I&gt;id&lt;/I&gt; column values will be automatically generated by the identity column with values of 1, 2, and 3 respectively.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Alternatively, you can also use Spark SQL to insert data into the Delta table with an identity column. Here's an example of how to do this:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.sql("INSERT INTO my_table (value) VALUES ('Value 1'), ('Value 2'), ('Value 3')")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;To write data into a Delta table with an identity column using the df.write API in PySpark&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;data = [("Value 1",), ("Value 2",), ("Value 3",)]
columns = ["value"]
&amp;nbsp;
df = spark.createDataFrame(data, columns)
&amp;nbsp;
df.write.format("delta").option("path", "/path/to/my_table").option("overwriteSchema", "true").option("idCol", "id").save()&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The id column will be auto generated. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I think you are getting mismatched columns error because you are trying to insert the value for id? Please clarify further on the error and how are you inserting the value.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Mar 2023 12:14:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/surrogate-key-using-identity-column/m-p/7775#M3549</guid>
      <dc:creator>NandiniN</dc:creator>
      <dc:date>2023-03-14T12:14:08Z</dc:date>
    </item>
  </channel>
</rss>

