<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How can I scale my neural network with spark? I'm building a fully connected tensorflow.keras model. in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-can-i-scale-my-neural-network-with-spark-i-m-building-a/m-p/25264#M17553</link>
    <description>&lt;P&gt;With Spark, there are a few ways you can scale your model:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Training&lt;/LI&gt;&lt;LI&gt;Hyperparameter tuning&lt;/LI&gt;&lt;LI&gt;Inference&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;If you're looking to train one model across multiple workers, you can leverage Horovod. It's an open source project designed to simplify distributed neural network training, and supports Keras/TF/PyTorch/MXNet. See the docs for &lt;A href="https://docs.databricks.com/applications/machine-learning/train-model/distributed-training/horovod-runner.html" alt="https://docs.databricks.com/applications/machine-learning/train-model/distributed-training/horovod-runner.html" target="_blank"&gt;HorovodRunner&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;If you're looking to train many candidate models in parallel, you can use &lt;A href="http://hyperopt.github.io/hyperopt/scaleout/spark/" alt="http://hyperopt.github.io/hyperopt/scaleout/spark/" target="_blank"&gt;HyperOpt with SparkTrials&lt;/A&gt;. Check out this fantastic &lt;A href="https://databricks.com/blog/2021/04/15/how-not-to-tune-your-model-with-hyperopt.html" alt="https://databricks.com/blog/2021/04/15/how-not-to-tune-your-model-with-hyperopt.html" target="_blank"&gt;blog&lt;/A&gt; on best practices and tips on setting parallelism for SparkTrials.&lt;/P&gt;&lt;P&gt;For inference, you can always create a Spark UDF (super easy if you use MLflow, e.g. &lt;A href="https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#mlflow.pyfunc.spark_udf" alt="https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#mlflow.pyfunc.spark_udf" target="_blank"&gt;mlflow.pyfunc.spark_udf&lt;/A&gt;) to trivially do inference in parallel for batch/streaming use cases.&lt;/P&gt;</description>
    <pubDate>Thu, 10 Jun 2021 18:35:04 GMT</pubDate>
    <dc:creator>User16788317454</dc:creator>
    <dc:date>2021-06-10T18:35:04Z</dc:date>
    <item>
      <title>How can I scale my neural network with spark? I'm building a fully connected tensorflow.keras model.</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-scale-my-neural-network-with-spark-i-m-building-a/m-p/25263#M17552</link>
      <description />
      <pubDate>Thu, 10 Jun 2021 17:59:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-scale-my-neural-network-with-spark-i-m-building-a/m-p/25263#M17552</guid>
      <dc:creator>j_weaver</dc:creator>
      <dc:date>2021-06-10T17:59:37Z</dc:date>
    </item>
    <item>
      <title>Re: How can I scale my neural network with spark? I'm building a fully connected tensorflow.keras model.</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-scale-my-neural-network-with-spark-i-m-building-a/m-p/25264#M17553</link>
      <description>&lt;P&gt;With Spark, there are a few ways you can scale your model:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Training&lt;/LI&gt;&lt;LI&gt;Hyperparameter tuning&lt;/LI&gt;&lt;LI&gt;Inference&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;If you're looking to train one model across multiple workers, you can leverage Horovod. It's an open source project designed to simplify distributed neural network training, and supports Keras/TF/PyTorch/MXNet. See the docs for &lt;A href="https://docs.databricks.com/applications/machine-learning/train-model/distributed-training/horovod-runner.html" alt="https://docs.databricks.com/applications/machine-learning/train-model/distributed-training/horovod-runner.html" target="_blank"&gt;HorovodRunner&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;If you're looking to train many candidate models in parallel, you can use &lt;A href="http://hyperopt.github.io/hyperopt/scaleout/spark/" alt="http://hyperopt.github.io/hyperopt/scaleout/spark/" target="_blank"&gt;HyperOpt with SparkTrials&lt;/A&gt;. Check out this fantastic &lt;A href="https://databricks.com/blog/2021/04/15/how-not-to-tune-your-model-with-hyperopt.html" alt="https://databricks.com/blog/2021/04/15/how-not-to-tune-your-model-with-hyperopt.html" target="_blank"&gt;blog&lt;/A&gt; on best practices and tips on setting parallelism for SparkTrials.&lt;/P&gt;&lt;P&gt;For inference, you can always create a Spark UDF (super easy if you use MLflow, e.g. &lt;A href="https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#mlflow.pyfunc.spark_udf" alt="https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#mlflow.pyfunc.spark_udf" target="_blank"&gt;mlflow.pyfunc.spark_udf&lt;/A&gt;) to trivially do inference in parallel for batch/streaming use cases.&lt;/P&gt;</description>
      <pubDate>Thu, 10 Jun 2021 18:35:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-scale-my-neural-network-with-spark-i-m-building-a/m-p/25264#M17553</guid>
      <dc:creator>User16788317454</dc:creator>
      <dc:date>2021-06-10T18:35:04Z</dc:date>
    </item>
  </channel>
</rss>

