Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Alternatives for XGBoost

Dk_1802
New Contributor III

Are there any alternatives to XGBoost that work well with PySpark?

1 ACCEPTED SOLUTION

Accepted Solutions

spark_ds
New Contributor III

XGBoost is now an option in PySpark pipelines (link here), but PySpark ML also supports a number of tree-based alternatives, including:

  1. Gradient-Boosted Trees (GBTs): Another boosting ensemble technique, in which each new tree is trained to correct the errors of the trees built in previous iterations. For this, use the GBTClassifier from pyspark.ml.classification (it currently handles binary labels only).
  2. Random Forest: A bagging ensemble method that trains many decision trees on random samples of the data and combines their predictions. For this, use the RandomForestClassifier from pyspark.ml.classification.
  3. Decision Tree Classifier: A single decision tree is a simple yet effective machine learning algorithm. For this, use the DecisionTreeClassifier from pyspark.ml.classification.

 


2 REPLIES


Vinay_M_R
Databricks Employee

Spark MLlib GBT: Spark MLlib, the machine learning library included with Apache Spark, provides its own implementation of gradient-boosted trees (GBT). It offers functionality similar to XGBoost and can be used directly within PySpark ML pipelines without installing any external libraries.
