08-30-2022 09:25 PM
What is the maximum number of concurrent streaming jobs for a cluster? How can I determine the right number of concurrent streaming jobs for different cluster configurations?
Should I use multiple clusters for different jobs, or combine them into one big cluster that handles all the jobs?
09-01-2022 05:22 AM
Hi @John William, it would be better to use a separate cluster for each streaming job.
09-04-2022 09:20 PM
I'm worried about the cost of this approach. Spinning up a new cluster for every streaming job, each running non-stop, requires a lot of resources.
09-05-2022 01:39 AM
I understand, but you should weigh the risk of using a single cluster for all your streaming jobs. Say you are running 4 streaming jobs on one cluster and 1 of them puts the cluster into a hung state, or something else goes wrong on the cluster: all 4 jobs will be affected. If you use a separate cluster for each streaming job, a problem like that affects only one job and the others keep running. That is my thinking. You need to weigh all the factors and plan the clusters accordingly. You can also compare the pricing of one cluster against multiple clusters.
Say that for 4 streaming jobs I use a single cluster with an i3.4xlarge driver and 10 workers of the same type. That costs 44 DBU/hr.
If I instead use 1 cluster per job, I can run 4 smaller clusters, each with an i3.xlarge driver and 10 i3.xlarge workers. That also costs 44 DBU/hr in total (11 DBU/hr per cluster).
This way you can estimate the workload and the pricing, and decide on the cluster sizing.
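If it helps, the comparison above can be sketched in a few lines of Python. This is only a rough sketch: the per-node rates (1 DBU/hr for i3.xlarge, 4 DBU/hr for i3.4xlarge) are assumptions inferred from the totals quoted above, and the function names are made up for illustration. Check the current Databricks pricing for your workload type before relying on the numbers.

```python
# Assumed DBU per node-hour, inferred from the totals in the post
# (44 DBU/hr and 11 DBU/hr for clusters of 11 nodes). Not official pricing.
ASSUMED_DBU_PER_NODE_HOUR = {
    "i3.xlarge": 1.0,
    "i3.4xlarge": 4.0,
}


def cluster_dbu_per_hour(instance_type: str, workers: int) -> float:
    """DBU/hr for one cluster: 1 driver plus `workers` nodes of the same type."""
    nodes = workers + 1  # the driver is billed like any other node
    return ASSUMED_DBU_PER_NODE_HOUR[instance_type] * nodes


def total_dbu_per_hour(instance_type: str, workers: int, clusters: int = 1) -> float:
    """DBU/hr for `clusters` identical clusters."""
    return clusters * cluster_dbu_per_hour(instance_type, workers)


# Option A: one shared cluster for all 4 streaming jobs.
shared = total_dbu_per_hour("i3.4xlarge", workers=10)

# Option B: one smaller cluster per streaming job.
dedicated = total_dbu_per_hour("i3.xlarge", workers=10, clusters=4)

print(f"shared cluster:     {shared} DBU/hr")
print(f"dedicated clusters: {dedicated} DBU/hr")
```

Under these assumptions both options come out the same, so the choice comes down to isolation between jobs rather than DBU cost.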