Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Non-existent schema on redeployment of DAB with external volumes.

toast_2001
New Contributor

Hi all,

DAB issue.

My setup:

  • Running CLI v0.294 on Python 3.12.11.
  • Deployment mode is direct, using standard serverless compute.
  • External locations in an ADLS storage container (one container per external location).

I'm attempting to deploy a bundle with the following config.

config_file.png
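The screenshot isn't reproduced here; for readers, a bundle of roughly this shape (all names hypothetical, reconstructed from the resource blocks discussed in this thread) would look like:

```yaml
# databricks.yml -- minimal sketch only, not the actual config from the screenshot
bundle:
  name: demo_bundle

experimental:
  # Flag mentioned later in the post; exact placement of this key
  # may differ by CLI version.
  skip_name_prefix_for_schema: true

resources:
  schemas:
    landing:
      catalog_name: demo
      name: landing
  volumes:
    landing_ingest:
      catalog_name: demo
      schema_name: ${resources.schemas.landing.name}
      name: ingest
      volume_type: EXTERNAL
      # Hypothetical external location; one ADLS container per external location.
      storage_location: abfss://landing@<storage-account>.dfs.core.windows.net/
```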

The first bundle deployment succeeds with no validation errors. On redeployment (even with no changes), the deployment fails with the following error (shown in the second image):

Error: cannot skip resources.schemas.landing: failed to read remote state: failed to refresh remote state id=demo.landing: Schema 'demo.landing' does not exist. (404 SCHEMA_DOES_NOT_EXIST)

Results from my debugging:

  • Removing the resources.volumes block fixes the issue and I can redeploy every time.
  • On redeployment, UC seems to "destroy" the schema rather than skip it as unchanged, even though the error suggests a skip was attempted.
  • Hardcoding the schema name in the volumes block returns the same error.
  • Removing the experimental skip_name_prefix_for_schema config did not resolve the issue.

example_error.png

The external volumes represent an ingestion location for Databricks to attach to, so it's crucial they deploy with the DAB on every update, especially once this is sitting in production.

Open to any workarounds too.

Thanks for your time.

1 ACCEPTED SOLUTION

Accepted Solutions

Louis_Frolio
Databricks Employee

Hi @toast_2001,

I did some digging and have a few tips to assist your troubleshooting. Let me walk through what's likely happening and what to do about it.

The error tells you that on the second deployment, DAB is trying to look up the existing demo.landing schema (because it thinks it can skip it as unchanged), but Unity Catalog is returning a 404 — the schema isn't there when DAB goes to check it. Something is dropping it between runs, or DAB is looking in the wrong place.

Here's where I'd start:

  1. Confirm the schema actually persists between runs. Right after a successful deploy and again just before the next one, run:
DESCRIBE SCHEMA demo.landing;

If it's gone before the second deploy, something outside the bundle — a notebook, a job, a manual step — is dropping it. That's your real problem.
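If you'd rather script that check from the CLI, something like this can diff the schema list across a deploy (the `databricks schemas list` call and the JSON parsing are assumptions about your CLI version and output shape; the offline branch merely illustrates the diff step):

```shell
#!/bin/sh
# Compare the schema list in catalog `demo` before and after a redeploy.
if command -v databricks >/dev/null 2>&1; then
  databricks schemas list demo --output json | jq -r '.[].name' | sort > before.txt
  databricks bundle deploy
  databricks schemas list demo --output json | jq -r '.[].name' | sort > after.txt
else
  # Offline stand-in so the diff step below is still illustrated:
  printf 'default\nlanding\n' > before.txt
  printf 'default\n' > after.txt
fi
# Schemas present before the deploy but missing after it:
comm -23 before.txt after.txt
```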

  2. Stop managing the schema in the bundle (easiest fix if the schema is long-lived). If demo.landing is basically a stable container for external ingestion, you don't need DAB to own its lifecycle. Instead:

    • Create demo.landing once, manually or via Terraform.
    • Remove the resources.schemas.landing block from your bundle.
    • In your volume definitions, reference the literal names (demo / landing) instead of the schema resource.

    DAB will then manage only the volumes and assume the catalog + schema already exist. That sidesteps the "skip schema, schema not found" path entirely.
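Concretely, with the schema pre-provisioned, the bundle would carry only the volume, referencing literal names (all values hypothetical):

```yaml
# No resources.schemas block at all; the catalog and schema already exist.
resources:
  volumes:
    landing_ingest:
      catalog_name: demo
      schema_name: landing   # literal name, not ${resources.schemas.landing.name}
      name: ingest
      volume_type: EXTERNAL
      storage_location: abfss://landing@<storage-account>.dfs.core.windows.net/
```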

  3. Verify you're targeting the same workspace and metastore both times. SCHEMA_DOES_NOT_EXIST can also surface when the bundle is pointed at a different workspace or metastore on the second run: different profile, different target, different URL. Worth double-checking.

  4. Keep skip_name_prefix_for_schema off. You already tried removing it, which is the right call. That flag is experimental and can affect how resource IDs are computed. Don't bring it back in anything resembling production until you have a stable pattern.

  5. If DAB really needs to own the schema. If demo.landing has to be created and destroyed by this bundle (ephemeral environments, etc.), this may be hitting a current limitation in how DAB refreshes schema state in direct mode when external volumes are involved. If that's the case, open a Databricks Support ticket with:

    • Your bundle YAML (redacted as needed)
    • Workspace URL and metastore name
    • Timestamps of the first successful deploy and the failing redeploy
    • Output from SHOW SCHEMAS IN demo before and after

The most reliable near-term path is option 2 — treat the schema as pre-provisioned infrastructure and let the bundle manage only what lives inside it.

Hope that helps narrow it down.

Cheers, Lou


2 REPLIES


toast_2001
New Contributor

The schema does not persist between deployments: it exists after the first deployment and is dropped on redeployment. The catalog itself either persists or is recreated. I also isolated the environment to have no jobs or pipelines, yet the issue remains.

Logging the deployment didn't help either. 

I've tested redeployment with no changes down to a 5-second interval between runs, and there are no other target workspaces.

I suspect point 5 and will probably pursue a ticket, but wanted to hear the community's feedback. At least there's always manual intervention for now.

You have been a huge help, thanks.