Hubert-Dudek
Databricks MVP

Not sure what this code does, but spark executes job by job, so ThreadPoolExecutor doesn't make much sense. If you want to execute notebooks in parallel, please run them as separate jobs with a fair scheduler (so you reserve resources for each notebook - in first line sc.setLocalProperty("spark.scheduler.pool", "somename") when somename is unique for your parallel notebook execution)


My blog: https://databrickster.medium.com/

View solution in original post