Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks Asset Bundles: using loops

Daan
New Contributor III

Hey,

I am using DABs to deploy the job below.
This code works, but I would like to use it for other suppliers as well.
Is there a way to loop over a list of suppliers (['nike', 'adidas',...]) and fill in those variables, so that

config_nike_gsheet_to_databricks is generated dynamically as config_{supplier}_gsheet_to_databricks?

bundle:
  name: gsheet-config-jobs
  
resources:
  jobs:
    config_nike_gsheet_to_databricks:
      name: config_nike_gsheet_to_databricks
      tasks:
        - task_key: test_config_nike_gsheet_to_databricks
          spark_python_task:
            python_file: src/mapping-update/main.py
            parameters:
              - ACC
              - NIKE
            source: GIT
          environment_key: Default
      git_source:
        git_url: https://github.com/Transfo-Energy/transfo-engine.git
        git_provider: gitHub
        git_branch: Acceptance
      queue:
        enabled: true
      environments:
        - environment_key: Default
          spec:
            client: "1"
            dependencies:
              - gspread
              - oauth2client
      performance_target: PERFORMANCE_OPTIMIZED

 
Thanks a lot for the help!


6 REPLIES

szymon_dybczak
Esteemed Contributor III
Last week saw the General Availability of dynamic functionality for Databricks Workflows, in the form of the parameterized ForEach activity, but what does that mean? And why should we care? For a long time we've been using external orchestration tools whenever things had to be flexible, metadata ...

Daan
New Contributor III

Hey szymon,

Your answer is to use a ForEach task inside a job/workflow.
What I would like to achieve is to create multiple jobs using one Databricks Asset Bundle by iterating over a list of suppliers.

 

szymon_dybczak
Esteemed Contributor III

Oh, now I get it. I misunderstood your question initially. So in your case you need to build your DAB definition dynamically. You can use Python for Databricks Asset Bundles to dynamically create jobs or pipelines from metadata:

Bundle configuration in Python | Databricks Documentation

szymon_dybczak
Esteemed Contributor III

And I think this is exactly what you're looking for, @Daan:

[Screenshot from the Databricks documentation: Bundle configuration in Python]

 

Bundle configuration in Python | Databricks Documentation
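
To make that concrete, here is a minimal sketch of what the Python-based definition could look like for this case. It assumes the databricks-bundles package ("Bundle configuration in Python"), a hypothetical resources.py module registered in databricks.yml under experimental.python.resources as "resources:load_resources", and an assumed supplier list; the core job fields simply mirror the YAML from the question.

```python
# resources.py -- hypothetical module, registered in databricks.yml under
# experimental.python.resources as "resources:load_resources"
from databricks.bundles.core import Bundle, Resources
from databricks.bundles.jobs import Job

SUPPLIERS = ["nike", "adidas"]  # assumed supplier list


def load_resources(bundle: Bundle) -> Resources:
    resources = Resources()

    for supplier in SUPPLIERS:
        job = Job.from_dict(
            {
                "name": f"config_{supplier}_gsheet_to_databricks",
                "tasks": [
                    {
                        "task_key": f"test_config_{supplier}_gsheet_to_databricks",
                        "spark_python_task": {
                            "python_file": "src/mapping-update/main.py",
                            # the original YAML passed the supplier name in upper case
                            "parameters": ["ACC", supplier.upper()],
                            "source": "GIT",
                        },
                        "environment_key": "Default",
                    }
                ],
                "git_source": {
                    "git_url": "https://github.com/Transfo-Energy/transfo-engine.git",
                    "git_provider": "gitHub",
                    "git_branch": "Acceptance",
                },
                "environments": [
                    {
                        "environment_key": "Default",
                        "spec": {
                            "client": "1",
                            "dependencies": ["gspread", "oauth2client"],
                        },
                    }
                ],
                # queue and performance_target from the original YAML can be added here as well
            }
        )
        # One job resource per supplier, named like the YAML job key.
        resources.add_job(f"config_{supplier}_gsheet_to_databricks", job)

    return resources
```

On databricks bundle deploy this would yield one job per supplier (config_nike_gsheet_to_databricks, config_adidas_gsheet_to_databricks, ...) without duplicating the YAML.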

MujtabaNoori
New Contributor III

Hi @Daan,
Your requirement is to create jobs dynamically by iterating through a list of suppliers. This is definitely achievable using the Databricks SDK.
I'd recommend providing the job parameters and definitions in a JSON format, as it's more reliable for parsing. You can structure the job name with an identifier, for example:
```config_{{source}}_gsheet_to_databricks```
Then, within your loop, you can safely replace {{source}} with each supplier value from the iterator and create the jobs dynamically.
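
A minimal sketch of that idea, assuming the databricks-sdk package (WorkspaceClient) and a hypothetical supplier list; the JSON template mirrors the original YAML, and {{source}} is the placeholder that gets substituted per supplier:

```python
import json

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import JobSettings

# Job definition template; {{source}} is the placeholder substituted per supplier.
TEMPLATE = """
{
  "name": "config_{{source}}_gsheet_to_databricks",
  "tasks": [
    {
      "task_key": "test_config_{{source}}_gsheet_to_databricks",
      "spark_python_task": {
        "python_file": "src/mapping-update/main.py",
        "parameters": ["ACC", "{{source}}"],
        "source": "GIT"
      },
      "environment_key": "Default"
    }
  ],
  "git_source": {
    "git_url": "https://github.com/Transfo-Energy/transfo-engine.git",
    "git_provider": "gitHub",
    "git_branch": "Acceptance"
  },
  "environments": [
    {
      "environment_key": "Default",
      "spec": {"client": "1", "dependencies": ["gspread", "oauth2client"]}
    }
  ]
}
"""

w = WorkspaceClient()  # picks up auth from env vars or ~/.databrickscfg

for supplier in ["nike", "adidas"]:  # assumed supplier list
    spec = json.loads(TEMPLATE.replace("{{source}}", supplier))
    settings = JobSettings.from_dict(spec)
    # Create one job per supplier from the rendered template.
    job = w.jobs.create(
        name=settings.name,
        tasks=settings.tasks,
        git_source=settings.git_source,
        environments=settings.environments,
    )
    print(f"Created job {job.job_id} for {supplier}")
```

Worth noting: jobs created directly through the SDK live outside the bundle, so databricks bundle deploy/destroy will not track or clean them up.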

MujtabaNoori
New Contributor III

@Daan ,
You can maintain a default template that holds the common configuration. While creating a job-specific configuration, you can safely merge your job-specific dictionary with the base template using the | operator.
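
For illustration, a small sketch of that merge pattern (plain dicts, Python 3.9+ for the | operator, keys mirroring the original YAML). Keep in mind that | does a shallow merge: on duplicate keys the right-hand side wins, and nested sections are replaced rather than deep-merged.

```python
# Common settings shared by every supplier job.
base_template = {
    "git_source": {
        "git_url": "https://github.com/Transfo-Energy/transfo-engine.git",
        "git_provider": "gitHub",
        "git_branch": "Acceptance",
    },
    "queue": {"enabled": True},
    "performance_target": "PERFORMANCE_OPTIMIZED",
}


def job_spec(supplier: str) -> dict:
    # Supplier-specific keys; on duplicate keys the right-hand side of | wins.
    specific = {
        "name": f"config_{supplier}_gsheet_to_databricks",
        "tasks": [
            {
                "task_key": f"test_config_{supplier}_gsheet_to_databricks",
                "spark_python_task": {
                    "python_file": "src/mapping-update/main.py",
                    "parameters": ["ACC", supplier.upper()],
                    "source": "GIT",
                },
            }
        ],
    }
    return base_template | specific


print(job_spec("nike")["name"])  # config_nike_gsheet_to_databricks
```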
