06-29-2023 03:51 PM
Are there any alternatives to XGBoost that work well with PySpark?
06-29-2023 04:02 PM
XGBoost is now an option in PySpark pipelines (link here), but PySpark ML also supports a number of alternatives, including:
- Gradient-Boosted Trees (GBTs): A boosting ensemble technique in which each new tree is trained to correct the errors of the trees before it. Use GBTClassifier from pyspark.ml.classification.
- Random Forest: A bagging ensemble method that trains many decision trees on random subsets of the data and combines their predictions. Use RandomForestClassifier from pyspark.ml.classification.
- Decision Tree Classifier: A single decision tree, which is a simple yet effective machine learning algorithm. Use DecisionTreeClassifier from pyspark.ml.classification.
06-30-2023 03:32 AM
Spark MLlib GBT: Spark MLlib, the machine learning library included with Apache Spark, provides its own implementation of gradient-boosted trees (GBTs). It offers functionality similar to XGBoost and can be used directly within PySpark ML pipelines without requiring external libraries.