<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: spark is case sensitive? Spark is not case sensitive by default. If you have same column name in in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/spark-is-case-sensitive-spark-is-not-case-sensitive-by-default/m-p/119558#M45909</link>
    <description>&lt;P&gt;I'm facing the same issue in a DLT pipeline. Did you find a fix for this?&lt;/P&gt;</description>
    <pubDate>Sun, 18 May 2025 15:50:56 GMT</pubDate>
    <dc:creator>AmanSehgal</dc:creator>
    <dc:date>2025-05-18T15:50:56Z</dc:date>
    <item>
      <title>spark is case sensitive? Spark is not case sensitive by default. If you have same column name in different case (Name, name), if you try to select eit...</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-is-case-sensitive-spark-is-not-case-sensitive-by-default/m-p/13954#M8529</link>
      <description>&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/53516874/is-spark-sql-like-case-sensitive" alt="https://stackoverflow.com/questions/53516874/is-spark-sql-like-case-sensitive" target="_blank"&gt;Is Spark case sensitive?&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Spark is not case sensitive by default. If you have two columns whose names differ only in case (Name, name), selecting either "Name" or "name" fails with a column ambiguity error.&lt;/P&gt;&lt;P&gt;You can handle this by setting a Spark config on a SparkSession object named spark:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.conf.set('spark.sql.caseSensitive', True)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The default value is False.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jan 2023 14:30:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-is-case-sensitive-spark-is-not-case-sensitive-by-default/m-p/13954#M8529</guid>
      <dc:creator>ramravi</dc:creator>
      <dc:date>2023-01-02T14:30:29Z</dc:date>
    </item>
    <item>
      <title>Re: spark is case sensitive? Spark is not case sensitive by default. If you have same column name in</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-is-case-sensitive-spark-is-not-case-sensitive-by-default/m-p/59050#M31322</link>
      <description>&lt;P&gt;Hi, even though I set the config to true, writing to disk still threw an exception complaining about duplicate columns. Below is the error message:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;org.apache.spark.sql.AnalysisException: Found duplicate column(s) in the data to save: branchavailablity.element.salesleadtime
	at org.apache.spark.sql.delta.DeltaAnalysisException$.apply(DeltaSharedExceptions.scala:57)
	at org.apache.spark.sql.delta.schema.SchemaMergingUtils$.checkColumnNameDuplication(SchemaMergingUtils.scala:117)
	at org.apache.spark.sql.delta.schema.SchemaMergingUtils$.mergeSchemas(SchemaMergingUtils.scala:160)
	at org.apache.spark.sql.delta.schema.ImplicitMetadataOperation$.mergeSchema(ImplicitMetadataOperation.scala:161)
	at org.apache.spark.sql.delta.schema.ImplicitMetadataOperation.updateMetadata(ImplicitMetadataOperation.scala:64)
	at org.apache.spark.sql.delta.schema.ImplicitMetadataOperation.updateMetadata$(ImplicitMetadataOperation.scala:52)
	at org.apache.spark.sql.delta.commands.WriteIntoDelta.updateMetadata(WriteIntoDelta.scala:70)
	at org.apache.spark.sql.delta.commands.WriteIntoDelta.write(WriteIntoDelta.scala:137)
	at org.apache.spark.sql.delta.commands.WriteIntoDelta.$anonfun$run$1(WriteIntoDelta.scala:95)
	at org.apache.spark.sql.delta.commands.WriteIntoDelta.$anonfun$run$1$adapted(WriteIntoDelta.scala:90)
	at org.apache.spark.sql.delta.DeltaLog.withNewTransaction(DeltaLog.scala:255)
	at org.apache.spark.sql.delta.commands.WriteIntoDelta.run(WriteIntoDelta.scala:90)
	at org.apache.spark.sql.delta.sources.DeltaDataSource.createRelation(DeltaDataSource.scala:161)
	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93)
	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457)
	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:93)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:80)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:78)
	at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:115)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:848)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:382)
	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:349)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239)
	at com.myCompany.myProject.myMethod(WriteToDisk.scala:51)&lt;/LI-CODE&gt;</description>
      <pubDate>Fri, 02 Feb 2024 11:58:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-is-case-sensitive-spark-is-not-case-sensitive-by-default/m-p/59050#M31322</guid>
      <dc:creator>source2sea</dc:creator>
      <dc:date>2024-02-02T11:58:20Z</dc:date>
    </item>
    <item>
      <title>Re: spark is case sensitive? Spark is not case sensitive by default. If you have same column name in</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-is-case-sensitive-spark-is-not-case-sensitive-by-default/m-p/108985#M43193</link>
      <description>&lt;P&gt;Hi, I had a similar issue with Parquet files when querying them from Athena.&lt;/P&gt;&lt;P&gt;The fix was to inspect the Parquet file: it contained columns such as "Name" and "name", which the AWS crawler / Athena interpreted as duplicate columns, since it would see them as "name" and "name" or "Name" and "Name".&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2025 16:39:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-is-case-sensitive-spark-is-not-case-sensitive-by-default/m-p/108985#M43193</guid>
      <dc:creator>zerospeed</dc:creator>
      <dc:date>2025-02-05T16:39:15Z</dc:date>
    </item>
    <item>
      <title>Re: spark is case sensitive? Spark is not case sensitive by default. If you have same column name in</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-is-case-sensitive-spark-is-not-case-sensitive-by-default/m-p/119558#M45909</link>
      <description>&lt;P&gt;I'm facing the same issue in a DLT pipeline. Did you find a fix for this?&lt;/P&gt;</description>
      <pubDate>Sun, 18 May 2025 15:50:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-is-case-sensitive-spark-is-not-case-sensitive-by-default/m-p/119558#M45909</guid>
      <dc:creator>AmanSehgal</dc:creator>
      <dc:date>2025-05-18T15:50:56Z</dc:date>
    </item>
  </channel>
</rss>

