Hey @Sunil_Poluri, I did some research (learned a few things) and here is what I found.
Unity Catalog manages cloud storage mapping for schemas using internal IDs (schema_id) to ensure data isolation, governance, and uniqueness within a metastore, even if schema names are the same across catalogs or across time. Here is a summary of the key factors that influence when new schema_id folders are created under an external location, even if the schema name hasn't changed:
1. Schema Drop and Re-create
Behavior: Unity Catalog assigns a unique internal identifier (schema_id) to every schema when it is created.
If a schema is dropped and re-created, even under an identical name, a new schema_id (and thus a new folder) is generated. Old object data persists in the previous folder, but new objects (managed tables) will write to the new schema_id directory.
Implication: This is the most common reason for seeing multiple schema_id folders for a schema name.
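To see how many schema_id folders actually sit under the external location, you can group a storage listing by folder. Here is a minimal Python sketch of that idea. It assumes you have already listed the object paths with your cloud tooling and that each path contains a `schemas/<schema_id>/` segment; that segment name, the sample paths, and the function name are my own illustrative assumptions, so adjust them to whatever layout you see in your storage.

```python
from collections import defaultdict


def group_by_schema_folder(paths, marker="schemas"):
    """Map each schema_id folder name to the object paths found beneath it.

    `marker` is the path segment assumed to sit directly above the
    schema_id folder (an assumption, not a documented guarantee).
    """
    groups = defaultdict(list)
    for path in paths:
        parts = [p for p in path.split("/") if p]
        if marker in parts:
            idx = parts.index(marker)
            if idx + 1 < len(parts):
                groups[parts[idx + 1]].append(path)
    return dict(groups)


# Hypothetical listing: two schema_id folders under one external location.
sample_paths = [
    "ext-location/schemas/aaaa-1111/tables/t1/part-0.parquet",
    "ext-location/schemas/aaaa-1111/tables/t2/part-0.parquet",
    "ext-location/schemas/bbbb-2222/tables/t1/part-0.parquet",
    "ext-location/other/readme.txt",
]

folders = group_by_schema_folder(sample_paths)
for schema_id, objects in sorted(folders.items()):
    print(schema_id, len(objects))
```

More than one folder for what you think of as a single schema is a strong hint that the schema was dropped and re-created at some point.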
2. Publishing Tables via DLT or Pipelines
When using Databricks Delta Live Tables (DLT) pipelines, table storage always adheres to the current mapping of the schema's internal ID. If a pipeline (or notebook) triggers creation of a schema that doesn't yet exist (for example, by referencing it as a target), Unity Catalog creates a new schema and assigns a new schema_id.
If there was a schema deletion and subsequent re-creation outside your awareness (or automation runs at unpredictable times), this could result in the schema_id shifting even if the schema name appears constant.
3. Direct Versus Indirect Schema Creation Channels
Databricks workflows, DLT, Databricks Asset Bundles, and manual UI actions all use the same underlying APIs, but automation (for example, CI/CD-driven schema creation in Asset Bundles or infrastructure-as-code) can lead to unintentional dropping and re-creating of schemas under the hood, causing new IDs to be assigned.
Schema creation logic that is not guarded by "IF NOT EXISTS" checks, for example a drop-and-create step that runs on every deployment, may inadvertently replace schemas and (re)generate schema_id folders.
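One way to keep automation from re-creating a schema is to make the creation step idempotent. Below is a small Python sketch that builds a `CREATE SCHEMA IF NOT EXISTS` statement and hands it to whatever SQL runner you use. The helper names and the `main` / `sales` identifiers are hypothetical; in a notebook you would typically pass `spark.sql` as the runner, but here a plain list collects the statement so the sketch runs anywhere.

```python
def quote_identifier(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def create_schema_statement(catalog: str, schema: str) -> str:
    """Build an idempotent statement: a no-op if the schema already exists."""
    return (
        "CREATE SCHEMA IF NOT EXISTS "
        f"{quote_identifier(catalog)}.{quote_identifier(schema)}"
    )


def ensure_schema(run_sql, catalog: str, schema: str) -> str:
    """Run the idempotent statement through a caller-supplied SQL runner."""
    statement = create_schema_statement(catalog, schema)
    run_sql(statement)
    return statement


# Stand-in runner for illustration; swap in spark.sql inside Databricks.
executed = []
ensure_schema(executed.append, "main", "sales")
print(executed[0])
```

The point of the design is that the deployment never issues a DROP, so the existing schema (and its schema_id folder) is left alone on repeat runs.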
4. Backing Storage or Location Changes
Changing the storage root location property on the schema or re-registering it can also be a scenario where a new schema_id is minted. However, most documentation and troubleshooting guidance emphasize schema drops and re-creations (planned or accidental) as primary drivers.
5. Multiple Metastores or Region/Workspace Boundaries
If running with multiple metastores or cross-region/catalog patterns, schemas with the same name in different metastores are always mapped to distinct internal IDs and thus distinct folders.
6. No Object, No Folder Until First Table
As noted, the schema_id folder is not created in the underlying storage until a managed object (such as a Delta table) is created within the schema. This lazy provisioning is expected behavior for storage efficiency.
Important Additional Notes
The internal IDs are not exposed in user-facing controls; only the folder names in storage and some low-level APIs reveal them. Schema_id changes are not triggered by table creation alone unless the schema itself is new (i.e., it did not exist at the time of table creation).
If you see unexpected new schema_id folders, check the audit logs, schema version histories, and CI/CD system activity for clues (look for drop/create activity).
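If you export audit events, a quick filter can surface the drop/create pattern. This is only a sketch in Python: the record shape (`event_time`, `action_name`, `schema`) and the action names `createSchema` / `deleteSchema` are assumptions on my part, so map them onto whatever fields your audit log export actually contains.

```python
# Assumed action names for schema lifecycle events; verify against your logs.
SCHEMA_LIFECYCLE_ACTIONS = {"createSchema", "deleteSchema"}


def schema_timeline(events, schema_full_name):
    """Return (event_time, action_name) pairs for one schema, oldest first."""
    matches = [
        (e["event_time"], e["action_name"])
        for e in events
        if e.get("action_name") in SCHEMA_LIFECYCLE_ACTIONS
        and e.get("schema") == schema_full_name
    ]
    # ISO-8601 UTC timestamps sort correctly as plain strings.
    return sorted(matches)


def count_recreations(timeline):
    """Count deleteSchema events that are directly followed by createSchema."""
    actions = [action for _, action in timeline]
    return sum(
        1
        for prev, curr in zip(actions, actions[1:])
        if prev == "deleteSchema" and curr == "createSchema"
    )


# Hypothetical, simplified audit records.
sample_events = [
    {"event_time": "2024-05-03T09:00:05Z", "action_name": "createSchema", "schema": "main.sales"},
    {"event_time": "2024-05-01T10:00:00Z", "action_name": "createSchema", "schema": "main.sales"},
    {"event_time": "2024-05-02T12:00:00Z", "action_name": "createTable", "schema": "main.sales"},
    {"event_time": "2024-05-03T09:00:00Z", "action_name": "deleteSchema", "schema": "main.sales"},
    {"event_time": "2024-05-03T11:00:00Z", "action_name": "createSchema", "schema": "main.other"},
]

timeline = schema_timeline(sample_events, "main.sales")
recreations = count_recreations(timeline)
```

Each counted re-creation should line up with one extra schema_id folder in storage, which makes it easier to tie a folder back to a specific job or deployment.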
Hope this helps with your understanding.
Cheers, Louis.