
Column Name Case sensitivity in DLT pipeline

AmanSehgal
Honored Contributor III

I have a DLT pipeline that processes messages from Event Grid. The message schema has two columns whose names differ only by case: "employee_id" and "employee_ID".

I tried setting spark.sql.caseSensitive to true both in my DLT notebook and in the DLT pipeline configuration, but it didn't work. The setting works in a normal PySpark notebook, but the DLT pipeline still fails.

Error:

terminated with exception: [DELTA_DUPLICATE_COLUMNS_FOUND] Found duplicate column(s) in the data to save: data.message.empdetail.employee_id SQLSTATE: XXKST

 

 

1 REPLY

Renu_
Contributor III

Hi @AmanSehgal, DLT treats column names as case-insensitive, even when spark.sql.caseSensitive is set to true. That's why employee_id and employee_ID are seen as duplicates and cause the error. To fix this, you'll need to rename one of the columns so that every name in your schema is distinct regardless of case.
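
One possible way to do the rename, as a rough sketch only: if the raw message is still available as a JSON string before it is parsed into a struct, the keys can be renamed at that stage. The helper name `rename_case_collisions` and the `_2` suffix scheme below are made up for illustration and are not part of DLT or Spark. Whether this fits depends on how your pipeline ingests the Event Grid payload.

```python
import json


def rename_case_collisions(obj):
    """Recursively rename dict keys that differ only by case.

    Hypothetical helper: the first key seen keeps its name; later keys that
    collide case-insensitively get a numeric suffix (employee_ID -> employee_ID_2).
    """
    if isinstance(obj, list):
        return [rename_case_collisions(item) for item in obj]
    if not isinstance(obj, dict):
        return obj

    seen = set()  # lower-cased key names already used at this nesting level
    result = {}
    for key, value in obj.items():
        new_key = key
        counter = 1
        # Keep bumping the suffix until the name is unique ignoring case.
        while new_key.lower() in seen:
            counter += 1
            new_key = f"{key}_{counter}"
        seen.add(new_key.lower())
        result[new_key] = rename_case_collisions(value)
    return result


# Example payload shaped like the path in the error message (values are invented).
raw = '{"data": {"message": {"empdetail": {"employee_id": 1, "employee_ID": 2}}}}'
cleaned = json.dumps(rename_case_collisions(json.loads(raw)))
```

After this step, `cleaned` contains `employee_id` and `employee_ID_2`, which no longer collide when names are compared case-insensitively. In a pipeline, a function like this could be applied to the JSON string column (for example through a UDF) before the schema is applied, but that placement is an assumption about your setup.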
