01-16-2024 07:43 AM
@mohaimen_syed - There are several possible reasons why at most 2 nodes are used:
1. The sklearn implementation of RandomForestClassifier is not distributed, so training runs on a single machine no matter how many nodes the cluster has. Please use the pyspark.ml implementation instead.
2. Your DataFrame may be small enough that the work does not need more nodes.
Always start with a small number of nodes and adjust the number based on your workload.