parallel run in job pipeline
12-13-2022 05:30 PM
I am trying to build a pipeline that deploys an ML model, and I want to build it in Workflows/Jobs.
In the prediction task, I have hundreds of groups of input features. Currently I use a for loop that takes one group of input features at a time and runs prediction on it. The groups are independent, and the order in which they run doesn't matter. I want to set a threshold, say 10, and kick off several parallel runs, each of which does prediction for up to 10 groups of input features. (If there are 100 groups, that is 10 parallel runs; if there are 175 groups, that is 18 runs, with the last run handling the remaining 5 groups.)
Is there a way to make one task of a pipeline kick off several runs with different parameters, where the number of runs is determined by the size of the input data?
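To make the batching arithmetic concrete, here is a minimal sketch in plain Python of what I have in mind. It only uses the standard library: `split_into_batches`, `predict_batch`, and `run_in_parallel` are hypothetical names I made up for illustration, and `predict_batch` is a placeholder for the real model call. It is not a Databricks Jobs feature, just the logic I would like a task to perform.

```python
import math
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 10  # the threshold described above


def num_runs(n_groups, batch_size=BATCH_SIZE):
    """Number of parallel runs needed: ceil(n_groups / batch_size)."""
    return math.ceil(n_groups / batch_size)


def split_into_batches(groups, batch_size=BATCH_SIZE):
    """Split the list of groups into consecutive batches of at most batch_size."""
    return [groups[i:i + batch_size] for i in range(0, len(groups), batch_size)]


def predict_batch(batch):
    """Hypothetical placeholder: replace with the real prediction for each group."""
    return [f"prediction for {group}" for group in batch]


def run_in_parallel(groups, batch_size=BATCH_SIZE, max_workers=8):
    """Run predict_batch on every batch concurrently; results keep batch order."""
    batches = split_into_batches(groups, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(predict_batch, batches))


if __name__ == "__main__":
    groups = [f"group_{i}" for i in range(175)]
    results = run_in_parallel(groups)
    print(num_runs(len(groups)), len(results), len(results[-1]))
```

Threads inside a single task are only a stand-in here. What I am really asking is whether the job itself can fan out into that many runs, each receiving its own batch as a parameter.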
Labels: DatabricksJobs, Job, Parallel Runs