Hello
Our team has a workflow for the usual monthly tasks, which is run on the first working day of the month.
Each of the ~20 users runs a clone of this workflow, most likely all at around the same time but with different options. Because we don't have access to Job-Compute, it runs on a few All-Purpose Compute clusters shared across users.
The first step of this workflow consists of downloading data through a SOAP API (wrapped in an R package). Over the past two months, we have observed a significant degradation in the performance of this task: it went from ~5 min to ~10 min, when it finishes at all.
It feels like the network can no longer handle the potentially concurrent calls to the API. Restarting a cluster and organizing the users in a queue solves the issue, but this is far from optimal.
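To make the contention scenario concrete, here is a minimal sketch of one possible alternative to a manual queue: retrying the download with jittered exponential backoff, so that concurrent runs spread out on their own. It is written in Python for illustration only (the actual call goes through an R package), and `call_with_backoff`, `download_fn`, and all default values are hypothetical names and numbers, not part of our workflow. It also assumes the failures are transient and that retrying the download is safe.

```python
import random
import time


def call_with_backoff(download_fn, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep, rng=random.random):
    """Call download_fn, retrying with jittered exponential backoff.

    download_fn is a hypothetical zero-argument callable standing in for
    the SOAP download step. sleep and rng are injectable for testing.
    """
    for attempt in range(max_attempts):
        try:
            return download_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential cap (base * 2^attempt, at most max_delay),
            # scaled by a random factor so concurrent runs do not
            # all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** attempt) * rng()
            sleep(delay)
```

Each user's run would wrap its download call in such a helper instead of waiting for a turn. Whether this helps depends on where the bottleneck actually is (the API, the shared cluster, or the network), which is what I am trying to find out.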
Any recommendations for improvements here?
Thanks