Machine Learning

by Joseph_B • Databricks Employee

12-20-2021 8:43:47 AM

1816 Views
1 reply
0 kudos

For tuning hyperparameters with Apache Spark ML / MLlib, when should I use Spark ML's built-in tuning algorithms vs. Hyperopt?

When should I use Spark ML's CrossValidator or TrainValidationSplit, vs. a separate tuning tool such as Hyperopt?


Latest Reply

Joseph_B
Databricks Employee

12-20-2021 8:51:13 AM

0 kudos

Both are valid choices. By default, I'd recommend using Hyperopt nowadays. Here's the rationale, as pros & cons of each.

Spark ML's built-in tools

Pros: These fit the Spark ML Pipeline framework, so you can keep using the same type of APIs.

Cons: Thes...
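For readers new to the built-in tools: CrossValidator scores every combination in a user-supplied parameter grid with k-fold cross-validation, whereas Hyperopt proposes candidates as the search goes. Here is a minimal plain-Python sketch of the exhaustive grid-plus-k-fold idea. It is not Spark's API; the helper names (`k_fold_indices`, `fit_ridge_1d`, `grid_search_cv`), the toy one-weight model, and the data are all assumptions made up for illustration.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size


def fit_ridge_1d(xs, ys, reg):
    """Toy model (hypothetical): y ~ w * x with an L2 penalty, closed form."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + reg)


def mse(w, xs, ys):
    """Mean squared error of the one-weight model on (xs, ys)."""
    return mean((w * x - y) ** 2 for x, y in zip(xs, ys))


def grid_search_cv(xs, ys, grid, k=3):
    """Evaluate every grid combination; return (best mean val MSE, params)."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for train, val in k_fold_indices(len(xs), k):
            w = fit_ridge_1d([xs[i] for i in train],
                             [ys[i] for i in train],
                             params["reg"])
            scores.append(mse(w, [xs[i] for i in val], [ys[i] for i in val]))
        results.append((mean(scores), params))
    return min(results, key=lambda r: r[0])


# Toy data where y = 2x exactly, so no regularization should win.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
best_score, best_params = grid_search_cv(xs, ys, {"reg": [0.0, 1.0, 10.0]})
```

The number of model fits here is (grid size) x (number of folds), and it multiplies with every hyperparameter added to the grid.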


by User16826992666 • Valued Contributor

06-17-2021 8:05:21 AM

2617 Views
1 reply
0 kudos

Resolved! In Spark MLlib, what is the difference between an estimator and a transformer?


Latest Reply

sean_owen
Databricks Employee

06-17-2021 11:21:49 AM

0 kudos

These terms are borrowed from scikit-learn, and the idea is the same. A transformer is just a component of a pipeline that transforms the data in some way. An estimator is also a transformer, but one that additionally needs to be 'fit' on data before ...
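A toy plain-Python sketch of the distinction, not Spark's actual classes: the names `MeanCenter`, `MeanCenterModel`, and `Scale` are hypothetical. It follows the shape of Spark ML's API, where calling `fit()` on an Estimator returns a Model, and that Model is the Transformer you then apply to data.

```python
class Scale:
    """Transformer: needs no fitting, it just maps input to output."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, values):
        return [v * self.factor for v in values]


class MeanCenterModel:
    """Transformer produced by fitting: carries the learned mean."""

    def __init__(self, mean):
        self.mean = mean

    def transform(self, values):
        return [v - self.mean for v in values]


class MeanCenter:
    """Estimator: must see data first; fit() returns a transformer."""

    def fit(self, values):
        return MeanCenterModel(sum(values) / len(values))


train = [1.0, 2.0, 3.0]

# Fit learns state (the mean of the training data) and yields a model.
model = MeanCenter().fit(train)

# Both the fitted model and the stateless Scale are applied via transform().
centered = Scale(2.0).transform(model.transform([4.0, 5.0]))
```

The fitted model reuses the mean learned from `train` when transforming new values, which is the part a plain transformer like `Scale` never needs.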


Databricks Community
