Performance issue: Running 50 notebooks from ADF
10-03-2023 06:28 AM
I have a process in Data Factory that loads CDC changes from SQL Server and then triggers a notebook that merges them into the bronze and silver zones. A single notebook takes about 1 minute to run, but when all 50 notebooks are fired at once the whole process takes 25 minutes.
There are not many changes in the SQL tables. When the notebooks run, the cluster has to scale up, and everything takes much longer to finish.
Is it really a big deal for a cluster to run 50 notebooks in parallel?
Cluster config:
- Runtime: 12.2 LTS, access mode: shared
- Photon enabled
- Workers: 2-8 Standard DS3 v2
- Driver: Standard DS3 v2
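To put the numbers above side by side, here is a rough back-of-the-envelope sketch. Nothing in it is measured: the notebook count and timings come from this post, and the 4 vCPUs per node is an assumption based on the Standard_DS3_v2 VM size. It only compares the 50 notebooks against the worker cores available at the minimum and maximum autoscale size.

```python
# Back-of-the-envelope numbers only; nothing here is measured on the cluster.
NOTEBOOKS = 50
SINGLE_RUN_MIN = 1.0        # one notebook run on its own (from the post)
OBSERVED_TOTAL_MIN = 25.0   # all 50 fired at once (from the post)
VCPUS_PER_NODE = 4          # assumption: Standard_DS3_v2 VM size
MIN_WORKERS, MAX_WORKERS = 2, 8


def worker_cores(workers: int) -> int:
    """Total worker vCPUs for a given number of worker nodes."""
    return workers * VCPUS_PER_NODE


# What a strictly one-after-another run would cost, for comparison.
serial_total_min = NOTEBOOKS * SINGLE_RUN_MIN
speedup_vs_serial = serial_total_min / OBSERVED_TOTAL_MIN

# How many notebooks compete for each worker core before and after scale-up.
notebooks_per_core_min = NOTEBOOKS / worker_cores(MIN_WORKERS)
notebooks_per_core_max = NOTEBOOKS / worker_cores(MAX_WORKERS)

print(f"serial estimate: {serial_total_min:.0f} min, observed: {OBSERVED_TOTAL_MIN:.0f} min")
print(f"speedup vs serial: {speedup_vs_serial:.1f}x")
print(f"worker cores: {worker_cores(MIN_WORKERS)} (min) to {worker_cores(MAX_WORKERS)} (max)")
print(f"notebooks per core: {notebooks_per_core_min:.2f} (min) to {notebooks_per_core_max:.2f} (max)")
```

So under these assumptions, 50 parallel notebooks start on 8 worker cores and have at most 32 after scale-up, and the observed 25 minutes is only about twice as fast as running them one by one.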
Here is a screenshot from Ganglia; the load starts at 06:00.

