Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Performance issue: Running 50 notebooks from ADF

alesventus
Contributor

I have a process in Data Factory that loads CDC changes from SQL Server and then triggers a notebook that merges them into the bronze and silver zones. A single notebook takes about 1 minute to run, but when all 50 notebooks are fired at once, the whole process takes 25 minutes.

There are not many changes in the SQL tables. When the notebooks run, the cluster has to scale up, and the whole run takes much longer to finish.

Is it really a big deal for a cluster to run 50 notebooks in parallel?

Cluster config: 12.2 LTS, access mode: shared

Photon: enabled

Workers: 2-8, Standard DS3 v2

Driver: Standard DS3 v2
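
To put the numbers above in perspective, here is a rough back-of-envelope sketch in plain Python. It only does arithmetic on the figures from this post and does not call any Databricks or Data Factory API. It assumes each notebook needs roughly one minute of work regardless of concurrency, and that a Standard DS3 v2 node has 4 vCPUs; the function names are made up for illustration.

```python
# Back-of-envelope numbers from the post; nothing here talks to Databricks.
NOTEBOOKS = 50
MINUTES_PER_NOTEBOOK = 1.0      # observed when a single notebook runs alone
OBSERVED_WALL_MINUTES = 25.0    # observed when all 50 are fired at once

# Assumption: a Standard DS3 v2 node has 4 vCPUs.
VCPUS_PER_WORKER = 4
MIN_WORKERS, MAX_WORKERS = 2, 8  # autoscaling range from the cluster config


def effective_parallelism(n_jobs, minutes_each, wall_minutes):
    """Total job-minutes of work divided by observed wall-clock minutes."""
    return (n_jobs * minutes_each) / wall_minutes


def worker_cores(workers, vcpus_per_worker=VCPUS_PER_WORKER):
    """Total worker vCPUs for a given number of workers."""
    return workers * vcpus_per_worker


parallelism = effective_parallelism(
    NOTEBOOKS, MINUTES_PER_NOTEBOOK, OBSERVED_WALL_MINUTES
)
print(f"effective parallelism: {parallelism:.1f}")
print(f"worker cores: {worker_cores(MIN_WORKERS)} to {worker_cores(MAX_WORKERS)}")
```

Under those assumptions, the 50 concurrent runs finish only about twice as fast as running them one after another would. The calculation does not show where the time goes (autoscaling, driver load, or something else); it only quantifies the gap.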

Here is a screenshot from Ganglia; the load starts at 06:00.

