Scaling issue for inference with a spark.mllib model
Hello, I'm writing this because I have tried a lot of different directions to get a simple model inference working, with no success. Here is the outline of the job:
# 1 - Load the base data (~1 billion lines of ~6 columns)
interaction = build_initial_df()
...
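The post preview is truncated, so the later steps are not visible; as context, a minimal sketch of what "inference with a spark.mllib model" typically looks like is shown below. It assumes an RDD-based model (LogisticRegressionModel is used only as an example), a hypothetical `build_initial_df()` helper, and made-up column names and model path.

```python
# Sketch only: model type, path, and column names are assumptions, not from the post.
from pyspark.sql import SparkSession
from pyspark.mllib.linalg import Vectors
from pyspark.mllib.classification import LogisticRegressionModel

spark = SparkSession.builder.getOrCreate()

# 1 - Load the base data (~1 billion rows of ~6 columns)
interaction = build_initial_df()  # hypothetical helper from the post

# 2 - Load a previously trained spark.mllib (RDD-based) model
model = LogisticRegressionModel.load(spark.sparkContext, "/models/my_model")

# 3 - spark.mllib models predict over RDDs of mllib Vectors, not DataFrames,
#     so the feature columns have to be converted before calling predict()
features_rdd = interaction.rdd.map(
    lambda row: Vectors.dense([row["f1"], row["f2"]])  # column names are illustrative
)
predictions = model.predict(features_rdd)
```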
Latest Reply
It is hard to analyze without the Spark UI and more detailed information, but here are a few tips anyway: look for data skew, where some partitions are very large and others very small because of incorrect partitioning. You can use the Spark UI to spot that, but also debug your code a bit...
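One way to check the data-skew tip in PySpark is to count rows per partition with `spark_partition_id()`. The sketch below reuses the `interaction` DataFrame from the question; the partition count passed to `repartition()` is only an example value.

```python
# Count rows per partition to see whether a few partitions dominate the data.
from pyspark.sql.functions import spark_partition_id

per_partition = (
    interaction
    .withColumn("pid", spark_partition_id())
    .groupBy("pid")
    .count()
    .orderBy("count", ascending=False)
)
per_partition.show(20)  # a handful of partitions far larger than the rest indicates skew

# If the skew comes from an unlucky partitioning key, repartitioning (by number or
# by a higher-cardinality column) can even things out before the expensive stage:
interaction = interaction.repartition(2000)  # partition count is an assumption
```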