Administration & Architecture

Different settings per target with Asset bundles

Mathias
New Contributor II

When generating the standard setup with `databricks bundle init` we get a databricks.yml that references resources/*. The targets are set in databricks.yml and the resources (pipelines and jobs) are defined in separate files.
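
For reference, the generated databricks.yml has roughly this shape (the bundle name and workspace hosts below are placeholders, not our actual values):

    # databricks.yml as produced by `databricks bundle init` (sketch)
    bundle:
      name: my_project

    include:
      - resources/*.yml

    targets:
      dev:
        mode: development
        default: true
        workspace:
          host: https://adb-1111111111111111.11.azuredatabricks.net
      prod:
        mode: production
        workspace:
          host: https://adb-2222222222222222.22.azuredatabricks.net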

I have DLT pipelines that I want to run continuously in the production workspace but, to save cost, run triggered in the dev workspace. Our pipelines in Azure DevOps deploy the code first to dev and then to prod using `databricks bundle deploy -t xxx`.

Is there a best practice for how to implement the differences?

I tried adding an if statement, but it doesn't seem to work:

      ${{ if eq(${bundle.environment}, 'dev') }}:
        continuous: false
      ${{ elseif eq(${bundle.environment}, 'prod' ) }}:
        continuous: true
 
5 REPLIES

Kaniz
Community Manager

Hi @Mathias,

To configure the continuous flag for your Delta Live Tables pipelines differently for dev and prod workspaces, it's best to use separate YAML files for each environment. Here's the approach:

  1. Create Separate Pipeline YAML Files: Generate distinct YAML files (e.g., dlt_pipeline_dev.yml and dlt_pipeline_prod.yml) for your Delta Live Tables pipelines in each environment. Set the continuous flag accordingly in these files.

    dlt_pipeline_dev.yml:

    apiVersion: 1.3
    name: my_dlt_pipeline_dev
    pipelines:
      - name: my_pipeline_dev
        continuous: false
        source_path: /mnt/dev/source/path/
        destination_path: /mnt/dev/destination/path/
        schema_metaline: "col1 int, col2 string"
    
    dlt_pipeline_prod.yml:

    apiVersion: 1.3
    name: my_dlt_pipeline_prod
    pipelines:
      - name: my_pipeline_prod
        continuous: true
        source_path: /mnt/prod/source/path/
        destination_path: /mnt/prod/destination/path/
        schema_metaline: "col1 int, col2 string"
    

  2. Deploy the Appropriate Pipeline File: When deploying your Delta Live Tables pipelines to each environment, select the relevant pipeline file using the -f option. For example:

    For Dev:

    databricks bundle deploy -t dev -f dlt_pipeline_dev.yml

    For Prod:

    databricks bundle deploy -t prod -f dlt_pipeline_prod.yml
By using separate YAML files for each environment, you can ensure that the continuous flag is correctly set for each pipeline, tailored to the specific requirements of that environment.

pietern
New Contributor II

@Kaniz The YAML snippets you included aren't valid bundle configurations. The `-f` flag also doesn't exist.

DABs are always deployed in their entirety, not selectively.

The configuration syntax reference can be found here: https://docs.databricks.com/en/dev-tools/bundles/settings.html
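
The deploy command only selects a target; there is no flag for picking individual files:

    databricks bundle deploy -t dev
    databricks bundle deploy -t prod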

Mathias
New Contributor II

In our case we are using build pipelines in Azure DevOps to deploy the solution, which consists of multiple jobs and DLT pipelines. We are using Databricks bundles as created by `databricks bundle init`, which creates one databricks.yml and a resources folder where the individual pipeline specifications reside.

To give context, our build pipeline looks like this:

- stage: Release_To_Dev
  variables:
    - group: databricks-dev-env
  jobs:
  - template: deploy-steps.yml
    parameters:
      environment: dev

- stage: Release_To_Production
  dependsOn: Release_To_Dev
  variables:
    - group: databricks-prod-env
  condition: and(succeeded('Release_To_Dev'), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
  jobs:
  - template: deploy-steps.yml
    parameters:
      environment: prod

The deploy-steps.yml looks like this:

parameters:
- name: environment

jobs:
- job: BuildAndRun
  steps:
  - script: |
      # assumed install step (the documented Databricks CLI installer)
      curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
    displayName: 'Install Databricks CLI'

  - script: |
      /usr/local/bin/databricks bundle deploy --target ${{ parameters.environment }}
    env:
      DATABRICKS_HOST: $(databricks_host)
      ARM_TENANT_ID: $(arm_tenant_id)
      ARM_CLIENT_ID: $(arm_client_id)
      ARM_CLIENT_SECRET: $(arm_client_secret)
    displayName: 'Deploy the pipelines, jobs and code files defined in the bundle'

Any recommendations on how to specify different settings per environment in this case?

pietern
New Contributor II

It is possible to use target overrides to customize resources based on the target you're deploying to.

Documentation can be found here: https://docs.databricks.com/en/dev-tools/bundles/settings.html#targets

An example of this pattern can be found here: https://github.com/databricks/cli/blob/main/bundle/tests/override_pipeline_cluster/databricks.yml

Instead of the `clusters` or `name` field, you would include the `continuous` field.
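
For example, a minimal sketch of that pattern (the resource key, bundle name, and hosts are illustrative placeholders):

    bundle:
      name: my_bundle

    resources:
      pipelines:
        my_pipeline:
          name: my_pipeline
          continuous: false  # shared default: run triggered (dev)

    targets:
      dev:
        workspace:
          host: https://adb-1111111111111111.11.azuredatabricks.net
      prod:
        workspace:
          host: https://adb-2222222222222222.22.azuredatabricks.net
        resources:
          pipelines:
            my_pipeline:
              continuous: true  # prod-only override: run continuously

The top-level definition carries the shared settings; each target mapping overrides only the fields that differ.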

137292
New Contributor II

`-f` is an unknown shorthand flag for `databricks bundle deploy`. Is there a workaround for deploying different jobs with different targets?
