<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Unexpected Schema ID Folder Creation in Unity Catalog External Location in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/unexpected-schema-id-folder-creation-in-unity-catalog-external/m-p/132963#M49688</link>
    <description></description>
    <pubDate>Wed, 24 Sep 2025 17:01:39 GMT</pubDate>
    <dc:creator>Louis_Frolio</dc:creator>
    <dc:date>2025-09-24T17:01:39Z</dc:date>
    <item>
      <title>Unexpected Schema ID Folder Creation in Unity Catalog External Location</title>
      <link>https://community.databricks.com/t5/data-engineering/unexpected-schema-id-folder-creation-in-unity-catalog-external/m-p/124295#M47154</link>
      <description>&lt;P&gt;I've set up Unity Catalog with an external location pointing to a storage account. For each schema, I’ve configured a dedicated container path. For example:&lt;/P&gt;&lt;PRE&gt;abfss://schemas@&amp;lt;storage_account&amp;gt;.dfs.core.windows.net/_unityStorage/schemas/&amp;lt;schema_id&amp;gt;&lt;/PRE&gt;&lt;P&gt;When I create a schema, a schema_id is generated. I expect this schema_id to be reflected as a folder under the schema container path, like:&lt;/P&gt;&lt;PRE&gt;/_unityStorage/schemas/&amp;lt;schema_id&amp;gt;&lt;/PRE&gt;&lt;P&gt;However, I’ve noticed that this folder doesn’t appear immediately, presumably because no objects (like tables) exist yet.&lt;/P&gt;&lt;P&gt;Here’s what I’ve observed:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;When I create a Delta table within the schema, I expect the table data to be stored under the schema’s storage path.&lt;/LI&gt;&lt;LI&gt;Similarly, when I create a DLT pipeline targeting the same schema, I expect the tables to be stored under the same schema path.&lt;/LI&gt;&lt;LI&gt;But instead, a &lt;STRONG&gt;new schema ID folder&lt;/STRONG&gt; gets created in the storage account under the schema container, even though the schema name is the same.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;My question is:&lt;/STRONG&gt; Under what conditions does Unity Catalog generate a new schema_id folder in the storage account, even when the schema name hasn’t changed?&lt;/P&gt;&lt;P&gt;Any insights or documentation references would be greatly appreciated!&lt;/P&gt;</description>
      <pubDate>Mon, 07 Jul 2025 10:06:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/unexpected-schema-id-folder-creation-in-unity-catalog-external/m-p/124295#M47154</guid>
      <dc:creator>Sunil_Poluri</dc:creator>
      <dc:date>2025-07-07T10:06:41Z</dc:date>
    </item>
    <item>
      <title>Re: Unexpected Schema ID Folder Creation in Unity Catalog External Location</title>
      <link>https://community.databricks.com/t5/data-engineering/unexpected-schema-id-folder-creation-in-unity-catalog-external/m-p/132963#M49688</link>
      <description>&lt;P&gt;Hey &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/133242"&gt;@Sunil_Poluri&lt;/a&gt;, I did some research (and learned a few things along the way), and here is what I found.&lt;/P&gt;
&lt;P&gt;Unity Catalog manages cloud storage mapping for schemas using internal IDs (schema_id) to ensure data isolation, governance, and uniqueness within a metastore—even if schema names are the same across catalogs or across time. Here is a summary of the key factors that influence when new schema_id folders are created under an external location, even if the schema name hasn’t changed:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;1. Schema Drop and Re-create&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Behavior:&lt;/STRONG&gt; Unity Catalog assigns a unique internal identifier (schema_id) to every schema when it is created.&lt;BR /&gt;If a schema is dropped and re-created, even under an identical name, a new schema_id (and thus a new folder) is generated. Data from the old schema can remain in the previous folder for some time after the drop, but new managed tables are written to the new schema_id directory.&lt;BR /&gt;&lt;STRONG&gt;Implication:&lt;/STRONG&gt; This is the most common reason for seeing multiple schema_id folders for a single schema name.&lt;/P&gt;
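&lt;P&gt;To check whether this is what happened, it can help to see which schema_id folders actually hold data. Here is a minimal Python sketch, assuming the file paths follow the /_unityStorage/schemas/&amp;lt;schema_id&amp;gt; layout from the question. The function name, the storage account, and the sample IDs are made up for illustration.&lt;/P&gt;
&lt;PRE&gt;
```python
import re
from collections import defaultdict

# Assumed layout (taken from the question): .../_unityStorage/schemas/SCHEMA_ID/...
SCHEMA_DIR = re.compile(r"/_unityStorage/schemas/([^/]+)")


def group_by_schema_id(paths):
    """Map each schema_id folder to the paths found under it."""
    groups = defaultdict(list)
    for path in paths:
        match = SCHEMA_DIR.search(path)
        if match:
            groups[match.group(1)].append(path)
    return dict(groups)


# Hypothetical listing; the account name, IDs, and table folders are placeholders.
sample = [
    "abfss://schemas@acct.dfs.core.windows.net/_unityStorage/schemas/1111/tables/a",
    "abfss://schemas@acct.dfs.core.windows.net/_unityStorage/schemas/1111/tables/b",
    "abfss://schemas@acct.dfs.core.windows.net/_unityStorage/schemas/2222/tables/c",
]
for schema_id, found in sorted(group_by_schema_id(sample).items()):
    print(schema_id, len(found))
```
&lt;/PRE&gt;
&lt;P&gt;If more than one schema_id shows up for what you think of as a single schema, that points toward a drop and re-create at some point.&lt;/P&gt;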
&lt;P&gt;&lt;STRONG&gt;2. Publishing Tables via DLT or Pipelines&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;With Databricks Delta Live Tables (DLT) pipelines, managed tables are written under the folder of the target schema’s current schema_id. If a pipeline (or notebook) triggers creation of a schema that doesn’t yet exist (for example, by referencing it as a target), Unity Catalog creates a new schema and assigns it a new schema_id.&lt;BR /&gt;If the schema was dropped and re-created without your knowledge (for example, by automation that runs at unpredictable times), the schema_id changes even though the schema name stays the same.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;3. Direct Versus Indirect Schema Creation Channels&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Databricks workflows, DLT, Databricks Asset Bundles, and manual UI actions all use the same underlying APIs, but automation (for example, CI/CD-driven schema creation in Asset Bundles or infrastructure-as-code) can unintentionally drop and re-create schemas under the hood, causing new IDs to be assigned.&lt;BR /&gt;Schema creation logic that drops and re-creates the schema, rather than relying on an “IF NOT EXISTS” check, replaces the schema and (re)generates its schema_id folder. Note that a plain CREATE SCHEMA against an existing schema fails with an error; it does not replace the schema by itself.&lt;/P&gt;
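&lt;P&gt;As a rough illustration of the “IF NOT EXISTS” approach, here is a small Python sketch that builds an idempotent CREATE SCHEMA statement. The helper name and the catalog and schema names are hypothetical, and it assumes identifiers that contain no backticks.&lt;/P&gt;
&lt;PRE&gt;
```python
def create_schema_sql(catalog, schema, managed_location=None):
    """Build an idempotent CREATE SCHEMA statement (no drop, so no new schema_id)."""
    # Backticks quote the identifiers; this sketch assumes names without backticks.
    sql = f"CREATE SCHEMA IF NOT EXISTS `{catalog}`.`{schema}`"
    if managed_location is not None:
        sql += f" MANAGED LOCATION '{managed_location}'"
    return sql


# Hypothetical names; in a notebook the string could be passed to spark.sql(...).
print(create_schema_sql("main", "sales"))
```
&lt;/PRE&gt;
&lt;P&gt;Because the statement never drops anything, re-running it from CI/CD leaves an existing schema, and therefore its schema_id folder, untouched.&lt;/P&gt;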
&lt;P&gt;&lt;STRONG&gt;4. Backing Storage or Location Changes&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Changing the storage root location of a schema, or re-registering the schema, may also cause a new schema_id to be minted, since in practice this usually involves re-creating the schema. However, most documentation and troubleshooting guidance emphasize schema drops and re-creations (planned or accidental) as the primary drivers.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;5. Multiple Metastores or Region/Workspace Boundaries&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;If running with multiple metastores or cross-region/catalog patterns, schemas with the same name in different metastores are always mapped to distinct internal IDs and thus distinct folders.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;6. No Object, No Folder Until First Table&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;As noted, the schema_id folder is not created in the underlying storage until a managed object (such as a Delta table) is created within the schema. This lazy provisioning is expected behavior for storage efficiency.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Important Additional Notes&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The internal IDs are not exposed in user-facing controls; only the folder names in storage and some low-level APIs reveal them. Table creation alone does not change the schema_id unless the schema itself is new (i.e., it did not exist at the time of table creation).&lt;BR /&gt;If you see unexpected new schema_id folders, check audit logs, schema version histories, and CI/CD system activity for clues (look for drop/create activity).&lt;/P&gt;
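&lt;P&gt;For the audit-log check, something along these lines may help. It is only a sketch: it assumes you have already exported audit records as dictionaries with event_time and action_name fields, and that schema drops and creates appear under the action names deleteSchema and createSchema. Please verify both assumptions against your own logs.&lt;/P&gt;
&lt;PRE&gt;
```python
# Assumed audit action names for schema drop/create; confirm them in your logs.
SCHEMA_ACTIONS = {"createSchema", "deleteSchema"}


def schema_lifecycle_events(events):
    """Return schema create/delete events, oldest first."""
    hits = [e for e in events if e.get("action_name") in SCHEMA_ACTIONS]
    return sorted(hits, key=lambda e: e["event_time"])


# Hypothetical, already-exported audit records (ISO timestamps sort as text).
sample_events = [
    {"event_time": "2025-07-02T09:00:00Z", "action_name": "createSchema"},
    {"event_time": "2025-07-01T08:00:00Z", "action_name": "deleteSchema"},
    {"event_time": "2025-07-01T08:30:00Z", "action_name": "getSchema"},
]
for event in schema_lifecycle_events(sample_events):
    print(event["event_time"], event["action_name"])
```
&lt;/PRE&gt;
&lt;P&gt;A delete followed by a create for the same schema name is the pattern that would explain a new schema_id folder.&lt;/P&gt;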
&lt;P&gt;Hope this helps with your understanding.&lt;/P&gt;
&lt;P&gt;Cheers, Louis.&lt;/P&gt;
</description>
      <pubDate>Wed, 24 Sep 2025 17:01:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/unexpected-schema-id-folder-creation-in-unity-catalog-external/m-p/132963#M49688</guid>
      <dc:creator>Louis_Frolio</dc:creator>
      <dc:date>2025-09-24T17:01:39Z</dc:date>
    </item>
  </channel>
</rss>

