<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Proper mlflow run logging with SparkTrials and Hyperopt in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/proper-mlflow-run-logging-with-sparktrials-and-hyperopt/m-p/109656#M3960</link>
    <description>&lt;P&gt;Hello!&lt;/P&gt;&lt;P&gt;I'm attempting to run a hyperparameter search using hyperopt and SparkTrials(), and log the resulting runs to an existing experiment (experiment A). I can see on &lt;A href="https://docs.databricks.com/en/machine-learning/automl-hyperparam-tuning/hyperopt-concepts.html" target="_self"&gt;this page&lt;/A&gt;&amp;nbsp;that databricks suggests wrapping the `fmin()` call within a `&lt;SPAN&gt;mlflow.start_run()` statement, so I wrote this code in a notebook, which by default is associated with its own experiment (experiment B):&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;with mlflow.start_run(experiment_id=experiment_A):
    trials = SparkTrials()
    fmin(objective_fn,
        trials=trials,
        **other_params)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, the result I get is a single parent run logged in experiment_A with all its child runs logged in experiment_B. This is obviously not desirable. Is there any way I can get both the parent and child runs from the trials logged in experiment A (i.e., not the experiment associated with the notebook being run)?&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 10 Feb 2025 16:49:06 GMT</pubDate>
    <dc:creator>rtreves</dc:creator>
    <dc:date>2025-02-10T16:49:06Z</dc:date>
    <item>
      <title>Proper mlflow run logging with SparkTrials and Hyperopt</title>
      <link>https://community.databricks.com/t5/machine-learning/proper-mlflow-run-logging-with-sparktrials-and-hyperopt/m-p/109656#M3960</link>
      <description>&lt;P&gt;Hello!&lt;/P&gt;&lt;P&gt;I'm attempting to run a hyperparameter search using hyperopt and SparkTrials(), and log the resulting runs to an existing experiment (experiment A). I can see on &lt;A href="https://docs.databricks.com/en/machine-learning/automl-hyperparam-tuning/hyperopt-concepts.html" target="_self"&gt;this page&lt;/A&gt;&amp;nbsp;that databricks suggests wrapping the `fmin()` call within a `&lt;SPAN&gt;mlflow.start_run()` statement, so I wrote this code in a notebook, which by default is associated with its own experiment (experiment B):&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;with mlflow.start_run(experiment_id=experiment_A):
    trials = SparkTrials()
    fmin(objective_fn,
        trials=trials,
        **other_params)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, the result I get is a single parent run logged in experiment_A with all its child runs logged in experiment_B. This is obviously not desirable. Is there any way I can get both the parent and child runs from the trials logged in experiment A (i.e., not the experiment associated with the notebook being run)?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Feb 2025 16:49:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/proper-mlflow-run-logging-with-sparktrials-and-hyperopt/m-p/109656#M3960</guid>
      <dc:creator>rtreves</dc:creator>
      <dc:date>2025-02-10T16:49:06Z</dc:date>
    </item>
    <item>
      <title>Re: Proper mlflow run logging with SparkTrials and Hyperopt</title>
      <link>https://community.databricks.com/t5/machine-learning/proper-mlflow-run-logging-with-sparktrials-and-hyperopt/m-p/137949#M4412</link>
      <description>&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Both the parent and child runs of a Hyperopt sweep in Databricks are, by default, influenced by the experiment associated with the notebook context rather than the explicit experiment passed to&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mlflow.start_run()&lt;/CODE&gt;. As you noticed, child runs remain in the notebook’s experiment (experiment B), even when a parent run is created in your chosen experiment (experiment A) .&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Why This Happens&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;experiment_id&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;you provide to&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mlflow.start_run()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;sets the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;parent run&lt;/STRONG&gt;'s experiment location.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;However, with&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;SparkTrials()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;(which parallelizes runs across separate Spark workers), each child run launched by workers uses their local notebook context, defaulting to that experiment's setting—typically&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;experiment B&lt;/STRONG&gt;.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Workarounds and Solutions&lt;/H2&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;1. Set the Experiment Globally Before Running&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;You&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;must call&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mlflow.set_experiment()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;(or&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mlflow.set_experiment_by_name()&lt;/CODE&gt;) BEFORE invoking&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;fmin()&lt;/CODE&gt;. This changes the default experiment for the entire context, ensuring both parent and child runs go to the same experiment:&lt;/P&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[100px]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-[3px] font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;mlflow&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;set_experiment&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"/Users/your-user/experiment_A_name"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;with&lt;/SPAN&gt; mlflow&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;start_run&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
    trials &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; SparkTrials&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
    fmin&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;objective_fn&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; trials&lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt;trials&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token operator"&gt;**&lt;/SPAN&gt;other_params&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Do&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;not&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;use the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;experiment_id&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;argument in&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;start_run()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;if also using&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;set_experiment()&lt;/CODE&gt;; set_experiment takes precedence for all subsequent runs until you change it again .&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;2. If Using Experiment ID Explicitly&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;If you want to use IDs:&lt;/P&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[100px]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-[3px] font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;mlflow&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;set_experiment&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;experiment_id&lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt;experiment_A_id&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;with&lt;/SPAN&gt; mlflow&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;start_run&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
    &lt;SPAN class="token token"&gt;# as above&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The key is to invoke&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;set_experiment()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;before your&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;fmin()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;call—not just within the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;start_run()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;context—so the worker processes pick up the correct experiment.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;3. Manual Consolidation (Not Preferred)&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;If the above isn’t possible due to distributed nuances or codebase constraints, you can:&lt;/P&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Use MLflow APIs to move/copy runs between experiments after the fact. This is more work and not recommended unless necessary .&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Important Tips&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Always set the experiment at the notebook (or job) level&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;before&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;starting your trials.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The parent run’s context does not “trickle down” to the child runs automatically with SparkTrials.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;If using Databricks jobs, ensure the MLflow experiment context is set at the job level, too.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Summary:&lt;/STRONG&gt;&lt;BR /&gt;To log all runs to experiment A, call&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mlflow.set_experiment()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;for experiment A at the start of your notebook or script, before any runs are started or before calling&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;fmin()&lt;/CODE&gt;. Do&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;not&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;rely on the parent run context alone to set experiment affiliation for children when using SparkTrials .&lt;/P&gt;</description>
      <pubDate>Thu, 06 Nov 2025 11:54:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/proper-mlflow-run-logging-with-sparktrials-and-hyperopt/m-p/137949#M4412</guid>
      <dc:creator>mark_ott</dc:creator>
      <dc:date>2025-11-06T11:54:54Z</dc:date>
    </item>
  </channel>
</rss>

