Column Name Case sensitivity in DLT pipeline

AmanSehgal — Sun, 18 May 2025 15:58:43 GMT

I have a DLT pipeline that processes messages from Event Grid. The message schema has two columns whose names differ only in case: "employee_id" and "employee_ID".

I tried setting spark.sql.caseSensitive to true both in my DLT notebook and in the DLT pipeline configuration, but it didn't work. The setting works in a normal PySpark notebook, but the same code fails in DLT.

Error:

terminated with exception: [DELTA_DUPLICATE_COLUMNS_FOUND] Found duplicate column(s) in the data to save: data.message.empdetail.employee_id SQLSTATE: XXKST

Re: Column Name Case sensitivity in DLT pipeline

Renu_ — Mon, 19 May 2025 14:10:06 GMT

Hi @AmanSehgal, DLT treats column names as case-insensitive, even if spark.sql.caseSensitive is set to true. That’s why employee_id and employee_ID are seen as duplicates and cause the error. To fix this, you’ll need to rename one of the columns so your schema has distinct names regardless of case.
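
One possible way to do the rename, as a rough sketch: if the Event Grid payload reaches the pipeline as a raw JSON string (an assumption, your ingestion may differ), the colliding keys could be renamed in plain Python before the string is parsed into columns. The helper names, the numeric suffix scheme, and the sample payload below are all made up for illustration; only the nested path data.message.empdetail comes from the error message.

```python
import json
from typing import Any


def dedupe_keys_case_insensitive(value: Any) -> Any:
    """Recursively rename dict keys that collide when compared case-insensitively.

    The first key seen keeps its name; later collisions get a numeric suffix
    (_2, _3, ...). The suffix scheme is an arbitrary choice for this sketch.
    """
    if isinstance(value, list):
        return [dedupe_keys_case_insensitive(item) for item in value]
    if not isinstance(value, dict):
        return value

    seen = set()  # lower-cased names already used at this nesting level
    result = {}
    for key, child in value.items():
        new_key = key
        counter = 2
        # Keep adding a suffix until the name is unique ignoring case.
        while new_key.lower() in seen:
            new_key = f"{key}_{counter}"
            counter += 1
        seen.add(new_key.lower())
        result[new_key] = dedupe_keys_case_insensitive(child)
    return result


def dedupe_json_message(raw: str) -> str:
    """Parse a JSON string, rename colliding keys, and serialize it again."""
    return json.dumps(dedupe_keys_case_insensitive(json.loads(raw)))


# Hypothetical payload shaped like the path in the error message.
sample = '{"data": {"message": {"empdetail": {"employee_id": 1, "employee_ID": "A-1"}}}}'
cleaned = json.loads(dedupe_json_message(sample))
print(cleaned["data"]["message"]["empdetail"])
```

If this fits your setup, dedupe_json_message could be wrapped in a PySpark UDF and applied to the raw message column before the JSON is parsed, so the resulting schema no longer contains names that differ only in case. A Python UDF adds per-row overhead, so renaming the field at the source is likely the cleaner fix if you control the producer.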
