When doing hyperparameter tuning with Hyperopt, when should I use SparkTrials? Does it work with both single-machine ML (like sklearn) and distributed ML (like Apache Spark ML)?
06-09-2021 05:51 PM
I want to know how to use Hyperopt in different situations:
- Tuning a single-machine algorithm from scikit-learn or single-node TensorFlow
- Tuning a distributed algorithm from Spark ML or distributed TensorFlow / Horovod
06-09-2021 05:56 PM
The right question to ask is indeed: Is the algorithm you want to tune single-machine or distributed?
If it's a single-machine algorithm like any from scikit-learn, then you can use SparkTrials with Hyperopt to distribute hyperparameter tuning.
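Here's a minimal sketch of that pattern, assuming a cluster with Hyperopt and scikit-learn installed; the iris dataset, the search space, and the `parallelism` value are just illustrative:

```python
from hyperopt import fmin, tpe, hp, SparkTrials, STATUS_OK
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

def objective(params):
    # Each trial trains one single-machine sklearn model on a Spark worker.
    clf = RandomForestClassifier(
        n_estimators=int(params["n_estimators"]),
        max_depth=int(params["max_depth"]),
    )
    score = cross_val_score(clf, X, y, cv=3).mean()
    # Hyperopt minimizes the loss, so return the negated accuracy.
    return {"loss": -score, "status": STATUS_OK}

search_space = {
    "n_estimators": hp.quniform("n_estimators", 10, 200, 10),
    "max_depth": hp.quniform("max_depth", 2, 10, 1),
}

# SparkTrials runs trials in parallel as Spark tasks; `parallelism`
# caps how many trials run concurrently.
spark_trials = SparkTrials(parallelism=4)

best = fmin(
    fn=objective,
    space=search_space,
    algo=tpe.suggest,
    max_evals=40,
    trials=spark_trials,
)
print(best)
```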
If it's a distributed algorithm like any from Spark ML, then you should not use SparkTrials. Instead, run Hyperopt without a `trials` argument (it defaults to the regular in-memory `Trials` class). Tuning then runs sequentially on the cluster driver, leaving the full cluster available for each trial of your distributed algorithm.
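And a minimal sketch of the distributed case. This assumes an active Spark session and two illustrative DataFrames, `train_df` and `val_df`, each with `features` and `label` columns; the algorithm and search space are just examples:

```python
from hyperopt import fmin, tpe, hp, STATUS_OK
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator

evaluator = BinaryClassificationEvaluator()  # areaUnderROC by default

def objective(params):
    # Each trial fits one distributed Spark ML model across the cluster.
    lr = LogisticRegression(
        regParam=params["regParam"],
        elasticNetParam=params["elasticNetParam"],
    )
    model = lr.fit(train_df)
    auc = evaluator.evaluate(model.transform(val_df))
    # Hyperopt minimizes the loss, so return the negated AUC.
    return {"loss": -auc, "status": STATUS_OK}

search_space = {
    "regParam": hp.loguniform("regParam", -5, 0),
    "elasticNetParam": hp.uniform("elasticNetParam", 0.0, 1.0),
}

# No `trials` argument: fmin uses the default in-memory Trials and runs
# trials one at a time on the driver, so each trial gets the whole cluster.
best = fmin(
    fn=objective,
    space=search_space,
    algo=tpe.suggest,
    max_evals=20,
)
print(best)
```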
You can find more info on these in the docs (AWS, Azure, GCP).

