
Surrogate key using identity column.

Dataengineer_mm
New Contributor

I want to create a surrogate key in a Delta table, and I used an identity column (id GENERATED BY DEFAULT AS IDENTITY).

Can I insert rows into the Delta table only through spark.sql with an INSERT query, or can I also use the DataFrame write API with the Delta format? When I use df.write, I get a mismatched-columns error. Can someone help with how to implement this?

1 REPLY

NandiniN
Databricks Employee

Hello @Menaka Murugesan,

If you are using an identity column, I believe you would have created the table roughly as below (starting at 1 with a step of 1):

CREATE TABLE my_table (
  id BIGINT GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1),
  value STRING
)

You can insert values into the table as follows, listing only the non-identity column:

INSERT INTO my_table (value) VALUES ('Value 1'), ('Value 2'), ('Value 3')

The id values will be generated automatically by the identity column as 1, 2, and 3, respectively.

Alternatively, you can run the same INSERT against the Delta table through the spark.sql API. Here's an example:

spark.sql("INSERT INTO my_table (value) VALUES ('Value 1'), ('Value 2'), ('Value 3')")

To write data into a Delta table with an identity column using the df.write API in PySpark, append only the non-identity columns:

data = [("Value 1",), ("Value 2",), ("Value 3",)]
columns = ["value"]
 
df = spark.createDataFrame(data, columns)
 
df.write.format("delta").option("path", "/path/to/my_table").option("overwriteSchema", "true").option("idCol", "id").save()

The id column will be auto-generated.
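
Either way, a quick query confirms that the surrogate keys were generated (a minimal check, assuming the my_table name used above):

# Verify the auto-generated identity values
spark.sql("SELECT id, value FROM my_table ORDER BY id").show()
# Each row should carry a unique, increasing id alongside its value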

I think you are getting the mismatched-columns error because you are trying to insert a value for id. Could you share the exact error message and how you are inserting the values?
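
In the meantime, if the mismatch comes from the DataFrame carrying a different set of columns than the table expects, one workaround is to stage the DataFrame as a temporary view and insert through SQL with an explicit column list, so the identity column is never supplied. A rough sketch, assuming the my_table definition above (the view name staged_values is just illustrative):

data = [("Value 4",), ("Value 5",)]
df = spark.createDataFrame(data, ["value"])

# Expose the DataFrame to SQL and name only the non-identity column,
# letting Delta generate id on its own
df.createOrReplaceTempView("staged_values")
spark.sql("INSERT INTO my_table (value) SELECT value FROM staged_values")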

Hope this helps.
