a month ago
I have a job that is triggered from changes in a table.
Table is located in the home catalog.
I have multiple environments that are not predefined (they are created on the fly).
Some process writes into a table, and then a job starts processing this table.
I have a generic target:

targets:
  client-workspace:
    workspace:
      root_path: ${var.bundle_path}
    mode: production

I don't specify host, because it is not pre-determined.

trigger:
  pause_status: UNPAUSED
  table_update:
    table_names:
      - {WORKSPACE_NAME}.schema.table

And here I hit a roadblock: there is no way to figure out WORKSPACE_NAME unless it is configured, which means hardcoding each dynamically created workspace.
I understand there is a workaround: using the CLI and providing the parameter on the command line.
Is there a better solution that would enable deployment from UI?
2 weeks ago
Thanks for the detailed question. This is a common challenge when working with Databricks Asset Bundles (DABs) across dynamically provisioned workspaces where the catalog name varies per environment.
You are right that there is currently no built-in bundle substitution like ${workspace.default_catalog} or ${workspace.catalog} that would automatically resolve to the workspace's home catalog at deploy time. The available built-in substitutions are limited to things like ${workspace.host}, ${workspace.current_user.userName}, ${bundle.name}, ${bundle.target}, etc.
Here are your options, starting with the ones that work best for dynamic environments:
OPTION 1: VARIABLE-OVERRIDES FILE (BEST FOR UI DEPLOYMENT)
Since you want to deploy from the UI (the bundle deployment dialog), the variable-overrides.json approach may be your best bet. You can define a variable in your databricks.yml:
variables:
  catalog_name:
    description: "The home catalog for this workspace"

targets:
  client-workspace:
    workspace:
      root_path: ${var.bundle_path}
    mode: production
    resources:
      jobs:
        my_job:
          trigger:
            pause_status: UNPAUSED
            table_update:
              table_names:
                - ${var.catalog_name}.schema.table
Then create the file .databricks/bundle/client-workspace/variable-overrides.json in your bundle project with:
{
  "catalog_name": "my_workspace_catalog"
}
This file can be generated as part of your workspace provisioning process. When the workspace is dynamically created, the provisioning script can also write this overrides file. This way the YAML stays generic and the environment-specific value is in a separate file.
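To make this concrete, here is a minimal sketch of what that provisioning step could look like. This is not a Databricks-provided script; the catalog name, the `client-workspace` target name, and the idea of running this from the provisioning automation are all assumptions based on the setup described above.

```shell
#!/usr/bin/env sh
# Hypothetical provisioning step: after the workspace is created, write the
# per-workspace catalog name into the bundle's variable-overrides file so the
# bundle YAML itself stays generic.
CATALOG_NAME="my_workspace_catalog"   # assumed example value
TARGET="client-workspace"             # must match the target name in databricks.yml

mkdir -p ".databricks/bundle/${TARGET}"
cat > ".databricks/bundle/${TARGET}/variable-overrides.json" <<EOF
{
  "catalog_name": "${CATALOG_NAME}"
}
EOF

# Echo the generated file so the provisioning log records what was written.
cat ".databricks/bundle/${TARGET}/variable-overrides.json"
```

After this file exists, a subsequent deploy of the `client-workspace` target resolves ${var.catalog_name} from it without any command-line flags.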
Docs: https://docs.databricks.com/en/dev-tools/bundles/variables.html
OPTION 2: ENVIRONMENT VARIABLES
If you have any automation that runs before deployment (even from the UI), you can set:
export BUNDLE_VAR_catalog_name=my_workspace_catalog
Environment variables take priority over defaults and variable-overrides.json. This works well if your deployment environment (e.g., a CI runner or a notebook that triggers deployment) can set env vars.
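The naming convention matters here: the CLI only picks up variables from environment variables named BUNDLE_VAR_ followed by the exact variable name. A minimal sketch (the deploy command is shown as a comment only, and "my_workspace_catalog" is an assumed example value):

```shell
#!/usr/bin/env sh
# The Databricks CLI resolves ${var.catalog_name} from an env var named
# BUNDLE_VAR_catalog_name (BUNDLE_VAR_ + the variable's name in databricks.yml).
export BUNDLE_VAR_catalog_name="my_workspace_catalog"  # assumed example value

# The deploy would then pick the value up from the environment:
#   databricks bundle deploy --target client-workspace
echo "BUNDLE_VAR_catalog_name=${BUNDLE_VAR_catalog_name}"
```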
Docs: https://docs.databricks.com/en/dev-tools/bundles/variables.html
OPTION 3: TARGET-SPECIFIC VARIABLES
If you know the catalog names at the time you define targets, you can hardcode them per target:
variables:
  catalog_name:
    description: "The home catalog for this workspace"

targets:
  workspace-alpha:
    variables:
      catalog_name: alpha_catalog
    workspace:
      host: https://alpha.cloud.databricks.com
  workspace-beta:
    variables:
      catalog_name: beta_catalog
    workspace:
      host: https://beta.cloud.databricks.com
But as you noted, this requires predefined targets, which does not work for dynamically created workspaces.
OPTION 4: CLI VARIABLE OVERRIDE (WHAT YOU ALREADY KNOW)
For completeness, the CLI approach:
databricks bundle deploy --target client-workspace --var="catalog_name=my_workspace_catalog"
This is the most straightforward for CI/CD pipelines but does not work for pure UI-based deployment.
REGARDING THE CORE LIMITATION
Unfortunately, there is no built-in substitution that automatically resolves the workspace's default catalog. The bundle substitution system (as of CLI v0.283+) provides workspace.host, workspace.root_path, workspace.current_user.*, bundle.name, bundle.target, and bundle.git.*, but nothing for the catalog.
This is a known gap. If you need this capability natively, I would encourage you to file a feature request on the Databricks CLI GitHub repository:
https://github.com/databricks/cli/issues
A related issue exists for table triggers requiring tables to already exist at deploy time (GitHub issue #4437), which compounds this problem for CI/CD scenarios.
RECOMMENDED APPROACH FOR YOUR USE CASE
Given your dynamic workspace provisioning scenario, I would recommend:
1. Add a variable for the catalog name in your databricks.yml (no default value, so deployment fails loudly if it is not set).
2. As part of your workspace provisioning automation, generate the variable-overrides.json file with the correct catalog name for that workspace.
3. Use ${var.catalog_name}.schema.table in your trigger configuration.
This keeps your bundle YAML fully generic and portable across any number of dynamically created workspaces.
REFERENCES
- Bundle variables and substitutions: https://docs.databricks.com/en/dev-tools/bundles/variables.html
- Bundle configuration settings reference: https://docs.databricks.com/en/dev-tools/bundles/settings.html
- Bundle deployment modes: https://docs.databricks.com/en/dev-tools/bundles/deployment-modes.html
- Related GitHub issue (table trigger validation): https://github.com/databricks/cli/issues/4437
Hope this helps. Let me know if you have follow-up questions about the variable-overrides approach or need help with the provisioning automation piece.
* This reply used an agent system I built to research and draft this response based on the wide set of documentation I have available and previous memory. I personally review the draft for any obvious issues and for monitoring system reliability and update it when I detect any drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand new features.
a month ago
Hi @Dimitry .
With a CI/CD pipeline you can override a default parameter in databricks.yml by passing an argument during deployment. For example, to override a service principal that differs between environments, you can do it this way:
- name: Deploy asset bundle to dev
  working-directory: ./asset_bundle_directory
  run: databricks bundle deploy --auto-approve --target dev --var="service_principal_id=${{ secrets.DATABRICKS_CLIENT_ID }}"
trigger:
  pause_status: UNPAUSED
  table_update:
    table_names:
      - ${var.catalog}.${var.schema}.table
This process replaces the parameters in databricks.yml with the right values and deploys the result. If you want to deploy from the UI, you need to change the databricks.yml parameters manually; there is no way to do this dynamically. Usually there are also as many targets as there are developers, so each developer can use their own parameters, but for the dev and prod environments there should be a single, consistent way to deploy and set the parameters.
Let me know if this is clear and whether you still need support.
a month ago
Thanks for the details, but I mentioned that pathway:
>>I understand there is a workaround: using the CLI and providing the parameter on the command line.
My question is: "Is there a better solution that would enable deployment from the UI?" Perhaps I was not clear, so I have attached a screenshot.
a month ago
To my knowledge, there isn't another solution to deploy the Asset Bundle from the UI.
yesterday
Thanks for the comprehensive answer. Very well explained.