07-15-2024 07:23 PM - edited 07-15-2024 08:45 PM
Hi,
We've got an app that we deploy to multiple customers' workspaces.
We're looking to transition to asset bundles. We would like to structure our resources like:
- src/
- resources/
  |-- customer_1/
      |-- job_1
      |-- job_2
  |-- customer_2/
      |-- job_3
      |-- job_4
  |-- customer_3/
      |-- job_5
      |-- job_6
- databricks.yml
so each customer only gets their specific workflows.
In `databricks.yml` would love to do:
bundle:
  name: my_app
include:
  - resources/${bundle.target}/*.yml
but unfortunately it seems you can't pass any variables into the `include` block.
Any ideas?
Thanks!
07-15-2024 08:01 PM
Just to add to the above, we've got about 30-40 jobs per customer, so we don't want to define these jobs in the `databricks.yml`, which seems to be the only way to navigate around this issue
07-16-2024 01:41 AM
Interesting use case! Ideally, a separate bundle for each customer seems like the cleanest solution. But if you don't want that, you can include all the YAML files in databricks.yml with:
include:
  - resources/*/*.yml
Then, inside those YAML files, declare each customer's resources under a different target. `targets` is a top-level node that is merged across the included files, so you can have 'customer_1', 'customer_2', etc. as different targets, each pointing at its own workspace.
This is not the intended use of targets but I guess this will solve the ask.
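A minimal sketch of that approach, assuming targets named customer_1 and customer_2 are also defined in `databricks.yml`. The file name, job name, and notebook path below are hypothetical. Each customer's resource file nests its jobs under its own target, so they are only part of the bundle when that target is selected:

```yaml
# resources/customer_1/job_1.yml (hypothetical file, matched by the include glob)
targets:
  customer_1:
    resources:
      jobs:
        job_1:
          name: job_1
          tasks:
            - task_key: main
              notebook_task:
                # Hypothetical path, relative to this file
                notebook_path: ../../src/job_1.ipynb
```

With this layout, `databricks bundle deploy -t customer_1` should deploy only the jobs declared under the customer_1 target.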
10-21-2024 11:04 AM
If the jobs you're defining per customer are completely different and don't share anything (e.g. some base configuration), then using targets for this purpose is not a great fit.
If you're looking to share files between these bundles (maybe some notebooks, maybe a wheel build, maybe some Python files, etc), then you could look into this example: https://github.com/databricks/bundle-examples/tree/main/knowledge_base/share_files_across_bundles
This enables you to define a separate, isolated bundle per customer, each with its own deployment schedule, while still sharing the same set of files between these deployments.
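As a rough sketch of the idea behind that example, assuming a CLI version that supports the `sync.paths` setting: the bundle name, the `../shared` folder, and the include pattern are hypothetical, and the linked repository is the authoritative reference.

```yaml
# customer_1/databricks.yml (hypothetical per-customer bundle)
bundle:
  name: my_app_customer_1

sync:
  paths:
    - ../shared   # hypothetical folder shared by every customer bundle
    - .           # this bundle's own files

include:
  - resources/*.yml
```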
10-23-2024 03:25 PM
I have a similar use case. We have two different Databricks hosts, EU and NA. In some cases we need to deploy a similar job to both hosts. Here is how I handled it:
- In the jobs folder I created separate job files, one per host. In addition, I created an empty job file named job.yml:
- src/
- jobs/
  |--- job_EU.yml (job configuration for EU dbx)
  |--- job_NA.yml (job configuration for NA dbx)
  |--- job.yml (empty file)
- In databricks.yml I include only the empty file:
include:
  - job.yml
- Finally, in the workflow file, before any Databricks command runs, I overwrite the empty file with the content of the job file for the host I want to deploy to and run on. This step must be repeated in every workflow job that calls a databricks bundle command; if the commands are in the same workflow job, it is only needed once:
deploy:
  .
  .
  .
  steps:
    .
    .
    .
    - run: cat jobs/job_EU.yml > jobs/job.yml
    - run: databricks bundle deploy
10-23-2024 11:38 PM
Hi Breno,
If the jobs you're deploying are similar, it sounds like a good fit for target overrides. You'd define the core of the job just once, and specialize it on a per-target basis. For example, you could define an "eu" and an "na" target, each pointing to its own workspace, and specialize the job in the target overrides section. You can find an example of how to do this for job clusters here: https://docs.databricks.com/en/dev-tools/bundles/cluster-override.html#example-2-conflicting-new-job... . All job properties can be overridden in a similar way.
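A minimal sketch of what that could look like. The job name, cluster settings, notebook path, and workspace hosts are all placeholders, not values from this thread. The job is defined once at the top level, and the "eu" target overrides only the worker count of the job cluster identified by the same `job_cluster_key`:

```yaml
# databricks.yml (all names, sizes, and hosts below are placeholders)
bundle:
  name: my_app

resources:
  jobs:
    my_job:
      name: my_job
      job_clusters:
        - job_cluster_key: main
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
            num_workers: 2
      tasks:
        - task_key: main
          job_cluster_key: main
          notebook_task:
            notebook_path: ./src/main.ipynb

targets:
  eu:
    workspace:
      host: https://eu-workspace.example.com
    resources:
      jobs:
        my_job:
          job_clusters:
            # Merged with the base definition that has the same key
            - job_cluster_key: main
              new_cluster:
                num_workers: 4
  na:
    workspace:
      host: https://na-workspace.example.com
```

Deploying with `databricks bundle deploy -t eu` or `-t na` would then pick the matching workspace and overrides, without copying files around in the workflow.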