Dynamic Task Mapping
Databricks Workflows offers a similar concept to Airflow's dynamic task mapping through the "For each" task type. This allows you to run a task in a loop, passing different parameters to each iteration. Here's how you can replicate the functionality of Airflow's .expand() function:
- Create a "For each" task in your Databricks Workflow.
- Define the iterable items (similar to what you'd pass to .expand() in Airflow).
- Specify a nested task that will be executed for each item in the iterable.
For example, if you have a list of dates to process, you could set up a "For each" task that iterates over these dates and runs a notebook or Python wheel for each one.
Reference: https://docs.databricks.com/en/jobs/for-each.html
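As a rough illustration, here is a minimal sketch of the nested notebook that the "For each" task could run for each date. The parameter name processing_date and the table name are hypothetical; it assumes you map the iteration's current value to that notebook parameter when configuring the nested task (for example, via a dynamic value reference):

```python
# Notebook run once per item by the "For each" task.
# Assumes the nested task maps the current iteration value to a
# notebook parameter named "processing_date" (hypothetical name).

dbutils.widgets.text("processing_date", "")            # declare the parameter
processing_date = dbutils.widgets.get("processing_date")

print(f"Processing data for {processing_date}")

# Per-iteration work, e.g. filtering one partition of a (hypothetical) table.
df = (
    spark.read.table("main.examples.events")           # hypothetical table
    .where(f"event_date = DATE'{processing_date}'")
)
print(f"{df.count()} rows for {processing_date}")
```

Where Airflow would call something like process.expand(processing_date=[...]), in Databricks the list of dates instead becomes the input of the "For each" task, and each iteration runs the nested notebook with one value.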
In Databricks Workflows, there isn't a direct equivalent to Airflow's get_current_context() function. However, you can access similar information through different means:
- Job Parameters: You can define job-level parameters that are accessible to all tasks within the workflow.
- Task Values: Databricks Workflows supports "Task Values," which allow you to set and retrieve small values from tasks. This can be used to pass information between tasks in a workflow.
- Dynamic Values: Databricks Workflows supports dynamic value references, which allow you to access certain runtime information. For example:
  - {{job.run_id}} gives you the current job run ID
  - {{job.start_time}} provides the job start time
- Notebook Parameters: If you're using notebook tasks, you can pass parameters to the notebook, which can include runtime information (see the sketch after the references below).
Reference: https://docs.databricks.com/en/jobs/job-parameters.html and https://docs.databricks.com/en/jobs/task-parameters.html
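Taken together, a notebook task might combine these mechanisms as sketched below. The parameter name run_id_param and the upstream task key "prepare" are hypothetical; it assumes the task parameter run_id_param is set to {{job.run_id}} in the job configuration:

```python
# Sketch of how a notebook task could use parameters, dynamic values,
# and task values together. Names ("run_id_param", "prepare",
# "rows_processed") are hypothetical; adjust to your job configuration.

# 1. Notebook parameter populated with a dynamic value reference
#    (configure run_id_param = {{job.run_id}} on the task).
dbutils.widgets.text("run_id_param", "")
current_run_id = dbutils.widgets.get("run_id_param")
print(f"Current job run ID: {current_run_id}")

# 2. Task values: publish a small value for downstream tasks...
dbutils.jobs.taskValues.set(key="rows_processed", value=42)

# ...and, in a downstream task, read a value set by an upstream task
# named "prepare" (debugValue is used when running interactively).
rows = dbutils.jobs.taskValues.get(
    taskKey="prepare", key="rows_processed", default=0, debugValue=0
)
print(f"Upstream task reported {rows} rows")
```

This gives you most of what Airflow's get_current_context() provides (run ID, timestamps, inter-task data), just split across job parameters, dynamic value references, and task values rather than a single context object.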