I have a process in Azure Data Factory that loads CDC changes from SQL Server and then triggers a notebook that merges the data into the bronze and silver zones. A single notebook takes about 1 minute to run, but when all 50 notebooks are fired at once the whole process takes 25 minutes.
There are not many changes in the SQL tables. When the notebooks run, the cluster has to scale up, and that adds a lot of time before everything finishes.
Is it really a big deal for a cluster to run 50 notebooks in parallel?
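For context, one workaround I considered is replacing the 50 simultaneous ADF triggers with a single driver notebook that fans out to the merge notebooks through a bounded thread pool, so the cluster never sees the full burst at once. A minimal sketch, assuming each child would be launched with `dbutils.notebook.run`; `run_merge` and the table list here are stand-ins, not my actual code:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the real child call, which on Databricks would be roughly:
#   dbutils.notebook.run(f"/cdc_merges/{table}", timeout_seconds=600)
def run_merge(table: str) -> str:
    return f"merged {table}"

# Placeholder list of the 50 tables handled by the 50 notebooks.
tables = [f"table_{i:02d}" for i in range(50)]

# Cap concurrency at 8 instead of firing all 50 notebooks at once,
# so the cluster does not have to scale up to absorb the burst.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_merge, tables))

print(len(results))
```

The idea is that with `max_workers` tuned to what 2 workers can handle, the jobs queue briefly instead of forcing an autoscale event, but I'm not sure if that beats just letting the cluster scale.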
cluster config: 12.2 LTS, shared access mode
Photon enabled
workers: 2-8 x Standard_DS3_v2
driver: Standard_DS3_v2
Here is a screenshot from Ganglia; the load starts at 06:00.