
In Spark MLlib, what is the difference between an estimator and a transformer?

User16826992666
Valued Contributor
 
Accepted Solutions

sean_owen
Honored Contributor II

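These terms are borrowed from scikit-learn, and the idea is the same. A transformer is just a component of a pipeline that transforms the data in some way: its transform() method takes a DataFrame and returns a new one, usually with extra columns. An estimator is a component that needs to be 'fit' on data before anything can be transformed. In Spark MLlib the estimator is not itself a transformer; calling fit() on it returns a fitted model, and that model is the transformer.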

For example, a Tokenizer is just a transformer, because it does not need to see any data to know how to tokenize strings. A machine learning algorithm like LogisticRegression is an estimator, because it has to be fit on data before it can predict anything. Calling fit() on it returns a LogisticRegressionModel, and that model is a transformer, because it transforms data by adding a prediction column.
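To make the fit/transform contract concrete, here is a minimal plain-Python sketch. It does not use Spark's API: the class names (WhitespaceTokenizer, MeanThresholdEstimator, MeanThresholdModel), the list-of-dicts "rows", and the mean-threshold "learning" rule are all made up for illustration. It only mirrors the shape of the pattern: a stateless transformer, and an estimator whose fit() returns a separate model that does the transforming.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical stand-in for a DataFrame: a list of dict rows.
Row = Dict[str, object]


class WhitespaceTokenizer:
    """Transformer: stateless, so transform() works without seeing data first."""

    def transform(self, rows: List[Row]) -> List[Row]:
        # Return new rows with a "words" column added; inputs are not mutated.
        return [{**r, "words": str(r["text"]).lower().split()} for r in rows]


@dataclass
class MeanThresholdModel:
    """Transformer produced by fitting: it carries the learned threshold."""

    threshold: float

    def transform(self, rows: List[Row]) -> List[Row]:
        # Add a "prediction" column based on the learned threshold.
        return [
            {**r, "prediction": 1.0 if float(r["score"]) > self.threshold else 0.0}
            for r in rows
        ]


class MeanThresholdEstimator:
    """Estimator: has fit() but no transform(); fit() returns the model."""

    def fit(self, rows: List[Row]) -> MeanThresholdModel:
        # "Learning" here is just computing the mean score (illustrative only).
        scores = [float(r["score"]) for r in rows]
        return MeanThresholdModel(threshold=sum(scores) / len(scores))


train: List[Row] = [
    {"text": "Spark ML pipelines", "score": 1.0},
    {"text": "fit then transform", "score": 3.0},
]

tokenized = WhitespaceTokenizer().transform(train)  # no fitting needed
model = MeanThresholdEstimator().fit(train)         # estimator -> fitted model
predictions = model.transform(tokenized)            # the model is the transformer
```

In Spark itself the same shape shows up as Tokenizer(...).transform(df) on one side, and LogisticRegression(...).fit(df) returning a LogisticRegressionModel that you then call transform(df) on.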
