Machine Learning

Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

by kng88 • New Contributor II

11-28-2022 11:52:11 AM

5790 Views
6 replies
7 kudos

How to save a model produced by distributed training?

I am trying to save the model after distributed training via the following code:

    import sys
    from spark_tensorflow_distributor import MirroredStrategyRunner
    import mlflow.keras
    mlflow.keras.autolog()
    mlflow.log_param("learning_rate", 0.001)
    import...

Latest Reply

Xiaowei
New Contributor III

03-21-2024 6:50:55 AM

7 kudos

I think I finally worked this out. Here is the extra code to save out the model only once and from the 1st node:

    context = pyspark.BarrierTaskContext.get()
    if context.partitionId() == 0:
        mlflow.keras.log_model(model, "mymodel")
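For anyone who wants to try the pattern from the reply in isolation, here is a minimal sketch of the same partition-0 guard as a small helper. The helper name `log_model_from_first_partition`, its parameters, and the `train` wrapper with its `build_and_fit_model` argument are hypothetical and not from the thread. The sketch assumes the training function runs inside a Spark barrier stage, so that `BarrierTaskContext.get()` is available.

```python
from typing import Any, Callable


def log_model_from_first_partition(
    model: Any,
    partition_id: int,
    log_fn: Callable[[Any, str], Any],
    artifact_path: str = "mymodel",
) -> bool:
    """Call log_fn(model, artifact_path) only on partition 0.

    Returns True if this worker logged the model, False otherwise.
    """
    if partition_id != 0:
        # Every other worker skips logging, so the model is saved only once.
        return False
    log_fn(model, artifact_path)
    return True


def train(build_and_fit_model: Callable[[], Any]) -> None:
    # Hypothetical training function, meant to run on each worker.
    # build_and_fit_model stands in for the model code the thread does not show.
    # Imports stay inside so the helper above can be used without Spark installed.
    import mlflow.keras
    from pyspark import BarrierTaskContext

    model = build_and_fit_model()
    context = BarrierTaskContext.get()
    log_model_from_first_partition(
        model, context.partitionId(), mlflow.keras.log_model
    )
```

Passing the logging function in as an argument keeps the guard easy to check locally with a stand-in callable before running it on a cluster.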
