<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic parallel run in job pipeline in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/parallel-run-in-job-pipeline/m-p/16967#M907</link>
    <description>&lt;P&gt;I am trying to build a pipeline that deploys an ML model, and I want to build it in Workflows/Jobs.&lt;/P&gt;&lt;P&gt;In the model's prediction task, I have hundreds of groups of input features. I use a for loop to take one group at a time and run prediction on it. The groups are independent, and the order in which they run doesn't matter. I want to set a threshold such as 10 and kick off several parallel runs, each of which predicts on up to 10 groups. (With 100 groups there would be 10 parallel runs; with 175 groups, 18 runs.)&lt;/P&gt;&lt;P&gt;Is there a way to make one task of a pipeline kick off several runs with different parameters, where the number of runs is determined by the size of the input data?&lt;/P&gt;</description>
    <pubDate>Wed, 14 Dec 2022 01:30:13 GMT</pubDate>
    <dc:creator>Geeya</dc:creator>
    <dc:date>2022-12-14T01:30:13Z</dc:date>
    <item>
      <title>parallel run in job pipeline</title>
      <link>https://community.databricks.com/t5/machine-learning/parallel-run-in-job-pipeline/m-p/16967#M907</link>
      <description>&lt;P&gt;I am trying to build a pipeline that deploys an ML model, and I want to build it in Workflows/Jobs.&lt;/P&gt;&lt;P&gt;In the model's prediction task, I have hundreds of groups of input features. I use a for loop to take one group at a time and run prediction on it. The groups are independent, and the order in which they run doesn't matter. I want to set a threshold such as 10 and kick off several parallel runs, each of which predicts on up to 10 groups. (With 100 groups there would be 10 parallel runs; with 175 groups, 18 runs.)&lt;/P&gt;&lt;P&gt;Is there a way to make one task of a pipeline kick off several runs with different parameters, where the number of runs is determined by the size of the input data?&lt;/P&gt;</description>
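      <content:encoded><![CDATA[
To make the batching arithmetic in the question concrete, here is a minimal sketch in plain Python. It is not a Databricks Jobs feature and not a confirmed solution: `make_batches`, `predict_batch`, `run_all`, and `max_workers` are hypothetical names introduced only for illustration, and a local thread pool stands in for whatever mechanism would actually launch the parallel runs.

```python
import math
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 10  # the threshold mentioned in the question


def make_batches(groups, batch_size=BATCH_SIZE):
    """Split the list of groups into consecutive batches of at most batch_size."""
    return [groups[i:i + batch_size] for i in range(0, len(groups), batch_size)]


def predict_batch(batch):
    # Hypothetical placeholder: a real task would load the model and score each group.
    return [f"prediction for {group}" for group in batch]


def run_all(groups, batch_size=BATCH_SIZE, max_workers=4):
    """Run predict_batch over every batch using a local thread pool (an assumption)."""
    batches = make_batches(groups, batch_size)
    # The number of runs follows from the input size: ceil(len(groups) / batch_size).
    assert len(batches) == math.ceil(len(groups) / batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map returns results in batch order, although the batches are independent.
        return list(pool.map(predict_batch, batches))
```

With 175 groups, `make_batches` yields 18 batches, the last of which holds the remaining 5 groups; with 100 groups it yields exactly 10.
      ]]></content:encoded>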
      <pubDate>Wed, 14 Dec 2022 01:30:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/parallel-run-in-job-pipeline/m-p/16967#M907</guid>
      <dc:creator>Geeya</dc:creator>
      <dc:date>2022-12-14T01:30:13Z</dc:date>
    </item>
  </channel>
</rss>

