Alternatives for XGBoost

Dk_1802
New Contributor III

Are there any alternatives to XGBoost that work well with PySpark?

spark_ds
New Contributor III

XGBoost is now an option in PySpark pipelines (link here), but PySpark ML also supports a number of alternatives, including:

  1. Gradient-Boosted Trees (GBTs): Gradient-boosted trees are another boosting ensemble technique, in which each new tree is fit to correct the errors of the trees built before it. For this, one can use the GBTClassifier from pyspark.ml.classification.
  2. Random Forest: Random Forest is a bagging ensemble learning method that trains many trees independently and combines their votes. For this, one can use the RandomForestClassifier from pyspark.ml.classification.
  3. Decision Tree Classifier: The decision tree is a simple yet effective machine learning algorithm, and a single tree is the building block of the two ensembles above. For this, one can use the DecisionTreeClassifier from pyspark.ml.classification.

 


Vinay_M_R
Databricks Employee

Spark MLlib GBT: Spark MLlib, the machine learning library included with Apache Spark, provides its own implementation of gradient-boosted trees (GBT). It covers the same core use case as XGBoost (boosted tree ensembles for classification and regression) and can be used directly within PySpark ML pipelines without requiring external libraries.