With Spark, there are a few ways you can scale your model:
- Training
- Hyperparameter tuning
- Inference
If you're looking to train one model across multiple workers, you can leverage Horovod. It's an open source project designed to simplify distributed deep learning, and it supports Keras, TensorFlow, PyTorch, and MXNet. See the docs for HorovodRunner, and the sketch below for what a run looks like.
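Here's a minimal sketch of that pattern, assuming a Databricks runtime where `sparkdl.HorovodRunner` is available; the `train_hvd` body is a placeholder for your real model and training loop:

```python
from sparkdl import HorovodRunner

def train_hvd():
    import torch
    import horovod.torch as hvd

    hvd.init()                               # initialize Horovod on each worker
    model = torch.nn.Linear(10, 1)           # stand-in for your real network
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())
    # Wrap the optimizer so gradients are averaged across all workers
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters())
    # Make sure every worker starts from the same initial weights
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    # ... your training loop goes here ...

# np = number of parallel worker processes to train on
hr = HorovodRunner(np=2)
hr.run(train_hvd)
```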
If you're looking to train many candidate models in parallel, you can use Hyperopt with SparkTrials (see the sketch below). Check out this fantastic blog on best practices and tips for setting parallelism with SparkTrials.
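A minimal sketch of Hyperopt + SparkTrials; the scikit-learn model and search space here are just illustrative placeholders:

```python
from hyperopt import fmin, tpe, hp, SparkTrials

def objective(params):
    # Train a model with these hyperparameters and return the loss to minimize
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)
    clf = RandomForestClassifier(
        n_estimators=int(params["n_estimators"]),
        max_depth=int(params["max_depth"]),
    )
    # Hyperopt minimizes, so return the negative mean CV accuracy
    return -cross_val_score(clf, X, y, cv=3).mean()

search_space = {
    "n_estimators": hp.quniform("n_estimators", 20, 200, 10),
    "max_depth": hp.quniform("max_depth", 2, 10, 1),
}

# parallelism = how many trials run concurrently across the cluster
spark_trials = SparkTrials(parallelism=4)
best = fmin(fn=objective, space=search_space, algo=tpe.suggest,
            max_evals=32, trials=spark_trials)
```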
You can always create a Spark UDF (super easy if you use MLflow, e.g. mlflow.pyfunc.spark_udf) to run inference in parallel for batch or streaming use cases.
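For example, a short sketch of scoring a logged MLflow model with a Spark UDF; the run ID and `input_df` DataFrame are placeholders:

```python
import mlflow.pyfunc
from pyspark.sql.functions import struct

# Load a logged model as a Spark UDF ("runs:/<run_id>/model" is a placeholder URI)
predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri="runs:/<run_id>/model")

# Batch scoring: apply the UDF to the feature columns of a Spark DataFrame
scored_df = input_df.withColumn(
    "prediction", predict_udf(struct(*input_df.columns)))

# The same UDF also works on a streaming DataFrame for streaming inference
# stream_scored = stream_df.withColumn(
#     "prediction", predict_udf(struct(*stream_df.columns)))
```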