I highly recommend running your job with the default values first. That gives you a good reference point if you later want to optimize further. Check your cluster utilization and the Spark UI; this will help you better understand what is happening while your job is running.
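One way to keep that reference point is to record the settings of the baseline run and the address of its Spark UI. This is a minimal PySpark sketch, not a definitive recipe. The helper names (`snapshot_conf`, `changed_settings`, `run_baseline`) and the app name `baseline-run` are made up for illustration, and it assumes `pyspark` is installed wherever you call `run_baseline()`.

```python
def snapshot_conf(spark):
    """Return the settings present on the job's SparkConf as a dict.

    Only values that were actually set (by you, spark-submit, or Spark at
    startup) show up here; built-in defaults that were never set do not.
    """
    return dict(spark.sparkContext.getConf().getAll())


def changed_settings(baseline, tuned):
    """Map each key whose value differs to a (baseline, tuned) pair."""
    keys = sorted(set(baseline) | set(tuned))
    return {
        k: (baseline.get(k), tuned.get(k))
        for k in keys
        if baseline.get(k) != tuned.get(k)
    }


def run_baseline():
    """Start a session without tuning flags and print the reference info."""
    # Imported here so the helpers above work without pyspark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("baseline-run").getOrCreate()
    baseline = snapshot_conf(spark)

    # Open this address in a browser while the job is running.
    print("Spark UI:", spark.sparkContext.uiWebUrl)
    for key, value in sorted(baseline.items()):
        print(key, "=", value)

    spark.stop()
    return baseline
```

After a tuned run, `changed_settings(baseline, snapshot_conf(spark))` lists what differs from the baseline, so you can relate any change in run time or utilization to a specific setting.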