<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Case insensitive data in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144271#M52302</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/108745"&gt;@dpc&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;As&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/176516"&gt;@emma_s&lt;/a&gt;&amp;nbsp;mentioned, you can set the default collation at the catalog or schema level, and tables will inherit it. You can also define it explicitly for a specific column within a table using the COLLATE modifier:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="szymon_dybczak_1-1768585044149.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/23037iD444A03365F38B74/image-size/medium?v=v2&amp;amp;px=400" role="button" title="szymon_dybczak_1-1768585044149.png" alt="szymon_dybczak_1-1768585044149.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 16 Jan 2026 17:37:35 GMT</pubDate>
    <dc:creator>szymon_dybczak</dc:creator>
    <dc:date>2026-01-16T17:37:35Z</dc:date>
    <item>
      <title>Case insensitive data</title>
      <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144156#M52273</link>
      <description>&lt;P&gt;For all its positives, one of the first general issues we had with Databricks was case sensitivity.&lt;/P&gt;&lt;P&gt;We have a lot of data-specific filters in our code.&lt;/P&gt;&lt;P&gt;The problem is that we land and view data from lots of different case-insensitive source systems, e.g. SQL Server.&lt;/P&gt;&lt;P&gt;As such, we have to be very careful with our code and convert columns to UPPER when making a comparison.&lt;/P&gt;&lt;P&gt;Most of our code is written in SQL.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;About 18 months ago I asked whether there was going to be a catalog, schema or table setting for this, i.e. a way to make the object case insensitive.&lt;/P&gt;&lt;P&gt;I was told it was on its way.&lt;/P&gt;&lt;P&gt;I have not heard anything since and cannot find anything.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does anybody know whether this is in place or expected?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 15 Jan 2026 12:39:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144156#M52273</guid>
      <dc:creator>dpc</dc:creator>
      <dc:date>2026-01-15T12:39:15Z</dc:date>
    </item>
    <item>
      <title>Re: Case insensitive data</title>
      <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144159#M52274</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/108745"&gt;@dpc&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;I think you can try to use a collation for that purpose.&amp;nbsp;&lt;SPAN&gt;A collation is a set of rules that determines how string comparisons are performed. Collations are used to compare strings in a case-insensitive, accent-insensitive, or trailing space insensitive manner, or to sort strings in a specific language-aware order.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-collation" target="_blank" rel="noopener"&gt;Collation | Databricks on AWS&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 15 Jan 2026 13:13:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144159#M52274</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2026-01-15T13:13:03Z</dc:date>
    </item>
    <item>
      <title>Re: Case insensitive data</title>
      <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144263#M52296</link>
      <description>&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;Collation is table-specific though, isn't it? And you have to apply it to each column.&lt;/P&gt;&lt;P&gt;Is there a way to just say that this schema, catalog or table is case insensitive, or can you only do it by column?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jan 2026 16:25:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144263#M52296</guid>
      <dc:creator>dpc</dc:creator>
      <dc:date>2026-01-16T16:25:25Z</dc:date>
    </item>
    <item>
      <title>Re: Case insensitive data</title>
      <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144267#M52300</link>
      <description>&lt;P&gt;Hi, you can set the default collation at the catalog level or the schema level, and the tables in that catalog or schema will inherit the collation. This is supported in DBR 17.1 and above.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jan 2026 17:30:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144267#M52300</guid>
      <dc:creator>emma_s</dc:creator>
      <dc:date>2026-01-16T17:30:32Z</dc:date>
    </item>
    <item>
      <title>Re: Case insensitive data</title>
      <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144271#M52302</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/108745"&gt;@dpc&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;As&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/176516"&gt;@emma_s&lt;/a&gt;&amp;nbsp;mentioned, you can set the default collation at the catalog or schema level, and tables will inherit it. You can also define it explicitly for a specific column within a table using the COLLATE modifier:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="szymon_dybczak_1-1768585044149.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/23037iD444A03365F38B74/image-size/medium?v=v2&amp;amp;px=400" role="button" title="szymon_dybczak_1-1768585044149.png" alt="szymon_dybczak_1-1768585044149.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jan 2026 17:37:35 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144271#M52302</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2026-01-16T17:37:35Z</dc:date>
    </item>
    <item>
      <title>Re: Case insensitive data</title>
      <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144517#M52331</link>
      <description>&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;I'll test collation at catalog, schema and table level using 17.1.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Jan 2026 08:59:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144517#M52331</guid>
      <dc:creator>dpc</dc:creator>
      <dc:date>2026-01-20T08:59:28Z</dc:date>
    </item>
    <item>
      <title>Re: Case insensitive data</title>
      <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144891#M52404</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/176516"&gt;@emma_s&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;&lt;/P&gt;&lt;P&gt;Looks positive.&lt;/P&gt;&lt;P&gt;There are some datatype issues moving to 17.1, but that's a separate issue.&lt;/P&gt;&lt;P&gt;I've run some simple tests and it works well, so I will trial it on our full data set.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2026 15:06:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144891#M52404</guid>
      <dc:creator>dpc</dc:creator>
      <dc:date>2026-01-22T15:06:43Z</dc:date>
    </item>
    <item>
      <title>Re: Case insensitive data</title>
      <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144894#M52407</link>
      <description>&lt;P&gt;Great&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/108745"&gt;@dpc&lt;/a&gt;&amp;nbsp;, good to hear that &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2026 15:27:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144894#M52407</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2026-01-22T15:27:01Z</dc:date>
    </item>
    <item>
      <title>Re: Case insensitive data</title>
      <link>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144979#M52430</link>
      <description>&lt;P&gt;It works, but there's a scenario that causes an issue.&lt;/P&gt;&lt;P&gt;If I create a schema with&amp;nbsp;&lt;STRONG&gt;default collation&lt;/STRONG&gt; UTF8_LCASE and then create a table, it marks all the string columns as UTF8_LCASE, which is fine and works.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, if I create the table in the newly created UTF8_LCASE schema from an existing table using:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;CREATE TABLE &amp;lt;destination&amp;gt; AS&lt;/DIV&gt;&lt;DIV&gt;SELECT *&lt;/DIV&gt;&lt;DIV&gt;FROM &amp;lt;source&amp;gt;&lt;/DIV&gt;&lt;DIV&gt;WHERE 1 = 2;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;The table is UTF8_LCASE, but none of the columns are.&lt;/DIV&gt;&lt;DIV&gt;So when I use it, it remains case sensitive.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Any thoughts?&lt;/DIV&gt;&lt;DIV&gt;Am I doing something wrong here? I really want to create tables that are structurally the same, with the exception of case.&lt;/DIV&gt;&lt;DIV&gt;I am doing this because I land the data, and the initial table is derived from the landed structure before moving it.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Thanks&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 23 Jan 2026 09:45:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/case-insensitive-data/m-p/144979#M52430</guid>
      <dc:creator>dpc</dc:creator>
      <dc:date>2026-01-23T09:45:26Z</dc:date>
    </item>
  </channel>
</rss>

