
DataFrame.write to a table containing GENERATED ALWAYS / auto-generated columns fails (SQL Server + sql-spark-connector)

User16765131552
Contributor III

Writing a DataFrame to a SQL Server table that contains GENERATED ALWAYS (auto-generated) columns fails. I am using the Apache Spark Connector for SQL Server and Azure SQL. When the auto-generated columns are not included in the DataFrame, I get a "No key found" error. When the auto-generated columns are included in the DataFrame, I get a "Cannot insert an explicit value into a GENERATED ALWAYS column in table" error.

Azure Databricks runtime 7.6, Azure SQL Database, language: PySpark

Exception encountered

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 5.0 failed 4 times, most recent failure: Lost task 0.3 in stage 5.0 (TID 25, 10.139.64.4, executor 1): com.microsoft.sqlserver.jdbc.SQLServerException: Cannot insert an explicit value into a GENERATED ALWAYS column in table '<...>'. Use INSERT with a column list to exclude the GENERATED ALWAYS column, or insert a DEFAULT into GENERATED ALWAYS column.

PySpark code (df is read from a Parquet file):

df.write \
  .format("com.microsoft.sqlserver.jdbc.spark") \
  .mode("append") \
  .option("url", url) \
  .option("dbtable", "TEMPORAL_TABLE") \
  .option("user", _username) \
  .option("password", _password) \
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")\
  .option("schemaCheckEnabled", False)\
  .save()
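
For reference, this is roughly the other variant mentioned above, where the GENERATED ALWAYS period columns are excluded from the DataFrame before the write (column names as in the table definition below); with this variant the write fails with the "No key found" error instead:

# Variant without the GENERATED ALWAYS period columns (drop is a no-op if the
# columns are not present in the Parquet data).
df_no_period = df.drop("SysStartTime", "SysEndTime")

df_no_period.write \
  .format("com.microsoft.sqlserver.jdbc.spark") \
  .mode("append") \
  .option("url", url) \
  .option("dbtable", "TEMPORAL_TABLE") \
  .option("user", _username) \
  .option("password", _password) \
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
  .option("schemaCheckEnabled", False) \
  .save()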

Azure SQL Temporal Table Definition:

CREATE TABLE DBO.TEMPORAL_TABLE(
    [UUID] [varchar](255) NOT NULL,
    [SERVICE_ID] [bigint] NULL,
    [START_DATE] [datetime2](7) NULL,
    [END_DATE] [datetime2](7) NULL,
    [CHANGED_ON] [datetime2](7) NULL,
    [operation] [char](1) NULL,
    [SysStartTime] [datetime2](7) GENERATED ALWAYS AS ROW START NOT NULL,
    [SysEndTime] [datetime2](7) GENERATED ALWAYS AS ROW END NOT NULL,
 CONSTRAINT [PK_TEMPORAL] PRIMARY KEY CLUSTERED 
(
    [UUID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY],
    PERIOD FOR SYSTEM_TIME ([SysStartTime], [SysEndTime])
) ON [PRIMARY]
WITH
(
SYSTEM_VERSIONING = ON ( HISTORY_TABLE = [history].[TEMPORAL_TABLE_HISTORY] )
)
GO
 
ALTER TABLE DBO.TEMPORAL_TABLE ADD  DEFAULT (newid()) FOR [UUID]
GO
 
ALTER TABLE DBO.TEMPORAL_TABLE ADD  DEFAULT (getutcdate()) FOR [SysStartTime]
GO
 
ALTER TABLE DBO.TEMPORAL_TABLE ADD  DEFAULT (CONVERT([datetime2],'9999-12-31 23:59:59.9999999')) FOR [SysEndTime]
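
In case it helps with reproducing the issue, a minimal DataFrame covering only the non-generated columns (UUID, SERVICE_ID, START_DATE, END_DATE, CHANGED_ON, operation) exercises the same write path; this is just a sketch and assumes the Parquet schema mirrors those columns:

from datetime import datetime
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Minimal test DataFrame matching the non-generated columns of TEMPORAL_TABLE.
df = spark.createDataFrame(
    [("11111111-1111-1111-1111-111111111111", 1,
      datetime(2021, 1, 1), datetime(2021, 1, 2), datetime(2021, 1, 1), "I")],
    ["UUID", "SERVICE_ID", "START_DATE", "END_DATE", "CHANGED_ON", "operation"],
)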

sql-spark-connector: https://github.com/microsoft/sql-spark-connector
