Most of the optimisation comes down to choosing the number of partitions to create for the data: too many cause a large shuffle on wide-dependency operations, and too few limit parallelism. To minimise the tim...
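The partition-count trade-off described above can be sketched as a small sizing heuristic. This is only an illustration, not a rule from the text: the helper name `suggest_partitions`, the 128 MiB target partition size, and the factor of 2 tasks per core are all assumptions chosen for the example, and the right values depend on the workload and cluster.

```python
def suggest_partitions(data_bytes, total_cores,
                       target_partition_bytes=128 * 1024**2,
                       tasks_per_core=2):
    """Hypothetical helper: pick a partition count that balances both limits.

    The defaults (128 MiB per partition, 2 tasks per core) are assumed
    starting points for this sketch, not values taken from the text.
    """
    if data_bytes < 0:
        raise ValueError("data_bytes must be non-negative")
    if total_cores < 1 or target_partition_bytes < 1 or tasks_per_core < 1:
        raise ValueError("cores, partition size and tasks per core must be >= 1")

    # Enough partitions that none is larger than the target size
    # (integer ceiling division keeps the result exact).
    by_size = (data_bytes + target_partition_bytes - 1) // target_partition_bytes

    # Enough partitions to keep every core busy (guards against too few).
    by_cores = total_cores * tasks_per_core

    # Take the larger requirement; by_cores is always at least 1.
    return max(by_size, by_cores)


if __name__ == "__main__":
    gib = 1024**3
    print(suggest_partitions(10 * gib, total_cores=16))  # size-bound case
    print(suggest_partitions(1 * gib, total_cores=16))   # core-bound case
```

A number obtained this way could then be passed to `repartition` or used for `spark.sql.shuffle.partitions`, and adjusted after looking at actual task durations and shuffle sizes.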