08-30-2022 09:25 PM
What is the maximum number of concurrent streaming jobs a cluster can run? How can I work out the right number of concurrent streaming jobs for different cluster configurations?
Should I use multiple clusters for different jobs, or combine them into one big cluster that handles all the jobs?
09-05-2022 01:39 AM
I understand, but you can weigh the risk involved in using a single cluster for all your streaming jobs. Let's say you are running 4 streaming jobs on one cluster, and 1 job causes the cluster to hang or something else goes wrong on the cluster; then all 4 jobs will be affected. However, if you use a separate cluster for each streaming job, a problem like that affects only one job and the others keep running properly. This is my thought. You need to consider all the factors and plan the clusters accordingly. You can also calculate the pricing for one cluster versus multiple clusters.
Let's say that for 4 streaming jobs I use a single cluster of i3.4xlarge instances with 10 workers of the same type; that uses 44 DBU/hr.
If I instead use 1 cluster per job, I can use 4 smaller clusters of i3.xlarge instances with 10 workers each, which also costs 44 DBU/hr in total (11 DBU/hr per cluster).
This way you can calculate the workload and the pricing and decide on the cluster sizing.
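The arithmetic above can be sketched in a few lines of Python. This is only a rough illustration: the per-node DBU rates (1 DBU/hr for i3.xlarge, 4 DBU/hr for i3.4xlarge) and the one driver node per cluster are assumptions chosen because they reproduce the 44 and 11 DBU/hr figures, and the function name `cluster_dbu_per_hour` is made up for this example. Check the current Databricks pricing for your workload type before relying on the numbers.

```python
# Assumed DBU rates per node-hour (not taken from the thread; verify against
# the current Databricks pricing page for your plan and workload type).
DBU_PER_NODE_HOUR = {"i3.xlarge": 1.0, "i3.4xlarge": 4.0}


def cluster_dbu_per_hour(instance_type, workers, driver_nodes=1):
    """Hypothetical helper: DBU/hr for a cluster whose driver and workers
    all use the same instance type."""
    nodes = workers + driver_nodes
    return DBU_PER_NODE_HOUR[instance_type] * nodes


# Option 1: one shared cluster for all 4 streaming jobs.
shared = cluster_dbu_per_hour("i3.4xlarge", workers=10)

# Option 2: one smaller dedicated cluster per streaming job.
per_job = cluster_dbu_per_hour("i3.xlarge", workers=10)
dedicated_total = 4 * per_job

print(f"shared cluster:     {shared} DBU/hr")
print(f"dedicated clusters: {dedicated_total} DBU/hr ({per_job} DBU/hr each)")
```

Under these assumptions both layouts come out at the same DBU/hr, so the choice comes down to isolation between jobs rather than cost. The DBU figure does not include the cloud provider's charge for the instances themselves.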
09-01-2022 05:22 AM
Hi @John William, it would be better to use a different cluster for each streaming job.
09-04-2022 09:20 PM
I am worried about the cost of this approach. Spinning up a new cluster for every streaming job and running them all non-stop requires a lot of resources.
09-03-2022 01:48 PM
Hi @John William, we haven't heard from you since the last response from @Prabakar, and I was checking back to see if his suggestions helped you.
Otherwise, if you have a solution, please share it with the community, as it can be helpful to others.
Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.