09-02-2021 03:39 PM
I have set numBuckets and numBucketsArray for a group of columns to bin each of them into 5 buckets.
Unfortunately, the requested number of buckets is not respected for every column, even though the values within those columns vary.
I have tried setting relativeError to 0, but that did not help.
Any idea why this happens, and how can I force the number of buckets I specified?
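
One possible explanation, sketched here in plain Python rather than Spark: the parameter names suggest a quantile-based discretizer (assumed to be pyspark.ml.feature.QuantileDiscretizer, which the post does not name), and quantile cut points can coincide when a column has many repeated values. Coinciding cut points are collapsed, so fewer buckets come out than were requested, even with an exact quantile computation. The helpers quantile_splits and effective_buckets and both sample columns are hypothetical, and the code does not reproduce Spark's approximate quantile algorithm.

```python
def quantile_splits(values, num_buckets):
    """Return the distinct interior cut points for equal-frequency binning.

    Hypothetical helper for illustration only; it uses exact quantiles.
    """
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for k in range(1, num_buckets):
        # Index of the smallest value with at least k/num_buckets of the
        # data at or below it (integer ceiling division, then 0-based).
        idx = max(0, (k * n + num_buckets - 1) // num_buckets - 1)
        cuts.append(ordered[idx])
    # Identical cut points would define empty buckets, so keep each once.
    return sorted(set(cuts))


def effective_buckets(values, num_buckets):
    """Number of buckets that remain after duplicate cut points are merged."""
    return len(quantile_splits(values, num_buckets)) + 1


# Hypothetical columns: both have variation, but one is dominated by a
# single repeated value.
evenly_spread = list(range(100))
mostly_zero = [0] * 90 + list(range(1, 11))

print(effective_buckets(evenly_spread, 5))  # 5 buckets, as requested
print(effective_buckets(mostly_zero, 5))    # 2 buckets: all cut points are 0
```

If this is what is happening, lowering relativeError would not change the result, because the cut points are identical even when the quantiles are exact. Counting the distinct values in each affected column would be a quick way to check.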
Labels:
- Pyspark