Performance Tuning of Databricks Notebook
Hi everyone, I am trying to run a Databricks notebook in parallel using ThreadPoolExecutor. Can anyone suggest how to reduce the time taken, based on the findings below so far? Current performance: time taken - 25 minutes, ThreadPoolExecutor max_workers ...
- 6722 Views
- 3 replies
- 4 kudos
Latest Reply
ThreadPoolExecutor will not help, as Databricks/Spark will process job by job. So please analyze in the Spark UI what is consuming the most time. There are a lot of tips on how to optimize; they depend on the dataset (size, transformations, etc.). Look for data ...
- 4 kudos
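
For anyone following along, the setup described in the question can be roughly sketched as below, with per-task timing added so that the slowest run stands out before digging into the Spark UI. This is only a minimal sketch under assumptions: `run_notebook` is a hypothetical placeholder that just sleeps (on Databricks it would typically wrap `dbutils.notebook.run(path, timeout_seconds, arguments)`), and the paths and `max_workers` value are made up for illustration, not taken from the original post.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_notebook(path, timeout_seconds=3600):
    """Hypothetical stand-in for a notebook run.

    Assumption: on Databricks this body would call something like
    dbutils.notebook.run(path, timeout_seconds, {}). Here it only sleeps
    so the sketch can run anywhere.
    """
    time.sleep(0.05)
    return f"done:{path}"


def run_all(paths, max_workers=4):
    """Run each path in a thread pool and record wall-clock time per task."""
    results = {}
    timings = {}

    def timed(path):
        # Measure the elapsed time of a single run.
        start = time.perf_counter()
        output = run_notebook(path)
        return path, output, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(timed, p) for p in paths]
        # Collect results in completion order, not submission order.
        for future in as_completed(futures):
            path, output, elapsed = future.result()
            results[path] = output
            timings[path] = elapsed
    return results, timings


if __name__ == "__main__":
    # Illustrative paths only.
    results, timings = run_all(["/jobs/a", "/jobs/b", "/jobs/c"], max_workers=3)
    # Print the slowest runs first.
    for path, elapsed in sorted(timings.items(), key=lambda kv: -kv[1]):
        print(f"{path}: {elapsed:.2f}s")
```

The timings only show which run is slow, not why. If the total barely changes as `max_workers` grows, that would be consistent with the reply's point that the bottleneck sits in the Spark jobs themselves, which is what the Spark UI is for.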