<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: When doing hyperparameter tuning with Hyperopt, when should I use SparkTrials?  Does it work with both single-machine ML (like sklearn) and distributed ML (like Apache Spark ML)? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/when-doing-hyperparameter-tuning-with-hyperopt-when-should-i-use/m-p/25352#M17623</link>
    <description>&lt;P&gt;The right question to ask is indeed: Is the algorithm you want to tune single-machine or distributed?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If it's a single-machine algorithm like any from scikit-learn, then you can use SparkTrials with Hyperopt to distribute hyperparameter tuning.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If it's a distributed algorithm like any from Spark ML, then you should not use SparkTrials.  You can run Hyperopt without a `trials` parameter (i.e., use the regular `Trials` type).  That will run tuning on the cluster driver, leaving the full cluster available for each trial of your distributed algorithm.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You can find more info on these in the docs (&lt;A href="https://docs.databricks.com/applications/machine-learning/automl-hyperparam-tuning/index.html#hyperparameter-tuning-with-hyperopt" alt="https://docs.databricks.com/applications/machine-learning/automl-hyperparam-tuning/index.html#hyperparameter-tuning-with-hyperopt" target="_blank"&gt;AWS&lt;/A&gt;, &lt;A href="https://docs.microsoft.com/en-us/azure/databricks/applications/machine-learning/automl-hyperparam-tuning/#--hyperparameter-tuning-with-hyperopt" alt="https://docs.microsoft.com/en-us/azure/databricks/applications/machine-learning/automl-hyperparam-tuning/#--hyperparameter-tuning-with-hyperopt" target="_blank"&gt;Azure&lt;/A&gt;, &lt;A href="https://docs.gcp.databricks.com/applications/machine-learning/automl-hyperparam-tuning/index.html#hyperparameter-tuning-with-hyperopt" alt="https://docs.gcp.databricks.com/applications/machine-learning/automl-hyperparam-tuning/index.html#hyperparameter-tuning-with-hyperopt" target="_blank"&gt;GCP&lt;/A&gt;).&lt;/P&gt;</description>
    <pubDate>Thu, 10 Jun 2021 00:56:20 GMT</pubDate>
    <dc:creator>Joseph_B</dc:creator>
    <dc:date>2021-06-10T00:56:20Z</dc:date>
    <item>
      <title>When doing hyperparameter tuning with Hyperopt, when should I use SparkTrials?  Does it work with both single-machine ML (like sklearn) and distributed ML (like Apache Spark ML)?</title>
      <link>https://community.databricks.com/t5/data-engineering/when-doing-hyperparameter-tuning-with-hyperopt-when-should-i-use/m-p/25351#M17622</link>
      <description>&lt;P&gt;I want to know how to use Hyperopt in different situations:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Tuning a single-machine algorithm from scikit-learn or single-node TensorFlow&lt;/LI&gt;&lt;LI&gt;Tuning a distributed algorithm from Spark ML or distributed TensorFlow / Horovod&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Thu, 10 Jun 2021 00:51:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/when-doing-hyperparameter-tuning-with-hyperopt-when-should-i-use/m-p/25351#M17622</guid>
      <dc:creator>Joseph_B</dc:creator>
      <dc:date>2021-06-10T00:51:24Z</dc:date>
    </item>
    <item>
      <title>Re: When doing hyperparameter tuning with Hyperopt, when should I use SparkTrials?  Does it work with both single-machine ML (like sklearn) and distributed ML (like Apache Spark ML)?</title>
      <link>https://community.databricks.com/t5/data-engineering/when-doing-hyperparameter-tuning-with-hyperopt-when-should-i-use/m-p/25352#M17623</link>
      <description>&lt;P&gt;The right question to ask is indeed: Is the algorithm you want to tune single-machine or distributed?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If it's a single-machine algorithm like any from scikit-learn, then you can use SparkTrials with Hyperopt to distribute hyperparameter tuning.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If it's a distributed algorithm like any from Spark ML, then you should not use SparkTrials.  You can run Hyperopt without a `trials` parameter (i.e., use the regular `Trials` type).  That will run tuning on the cluster driver, leaving the full cluster available for each trial of your distributed algorithm.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You can find more info on these in the docs (&lt;A href="https://docs.databricks.com/applications/machine-learning/automl-hyperparam-tuning/index.html#hyperparameter-tuning-with-hyperopt" alt="https://docs.databricks.com/applications/machine-learning/automl-hyperparam-tuning/index.html#hyperparameter-tuning-with-hyperopt" target="_blank"&gt;AWS&lt;/A&gt;, &lt;A href="https://docs.microsoft.com/en-us/azure/databricks/applications/machine-learning/automl-hyperparam-tuning/#--hyperparameter-tuning-with-hyperopt" alt="https://docs.microsoft.com/en-us/azure/databricks/applications/machine-learning/automl-hyperparam-tuning/#--hyperparameter-tuning-with-hyperopt" target="_blank"&gt;Azure&lt;/A&gt;, &lt;A href="https://docs.gcp.databricks.com/applications/machine-learning/automl-hyperparam-tuning/index.html#hyperparameter-tuning-with-hyperopt" alt="https://docs.gcp.databricks.com/applications/machine-learning/automl-hyperparam-tuning/index.html#hyperparameter-tuning-with-hyperopt" target="_blank"&gt;GCP&lt;/A&gt;).&lt;/P&gt;</description>
      <pubDate>Thu, 10 Jun 2021 00:56:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/when-doing-hyperparameter-tuning-with-hyperopt-when-should-i-use/m-p/25352#M17623</guid>
      <dc:creator>Joseph_B</dc:creator>
      <dc:date>2021-06-10T00:56:20Z</dc:date>
    </item>
  </channel>
</rss>