Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Mr__D
by New Contributor II
  • 6360 Views
  • 1 replies
  • 0 kudos

Databricks Cluster Autoscaling

Hello all, could anyone please explain the impact of autoscaling on cluster cost? Suppose I have a cluster where the minimum is 2 workers and the maximum is 10, but most of the time only 3 workers are active. Will the cluster be billed for only 3 workers or for 10 worker(...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Deepak Bhatt: Autoscaling in Databricks can have a significant impact on cluster cost, as it allows the cluster to dynamically add or remove workers based on the workload. In the scenario you described, if the active worker count is consistently at ...
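
To make the billing question concrete, here is a rough sketch of the arithmetic, assuming charges accrue per running worker per hour. The `Interval` type, the 20 h / 4 h split, and the rate of 1.0 per worker-hour are all made-up illustration values, not Databricks pricing, and the driver node is left out for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """A stretch of time during which the worker count was constant (hypothetical type)."""
    hours: float
    active_workers: int


def worker_hours(timeline):
    """Sum of hours multiplied by the number of workers running in each interval."""
    return sum(i.hours * i.active_workers for i in timeline)


def estimate_cost(timeline, rate_per_worker_hour):
    """Cost under the assumption that only running workers are charged."""
    return worker_hours(timeline) * rate_per_worker_hour


# Assumed day: 3 workers for 20 hours, then a 4-hour burst at the max of 10.
timeline = [Interval(20.0, 3), Interval(4.0, 10)]
autoscaled = estimate_cost(timeline, rate_per_worker_hour=1.0)

# The same day if the cluster were pinned at 10 workers the whole time.
fixed_max = estimate_cost([Interval(24.0, 10)], rate_per_worker_hour=1.0)

print(autoscaled, fixed_max)
```

Under these assumptions the autoscaled day costs 100 worker-hours against 240 for a cluster fixed at the maximum, so the cost tracks the workers that are actually up rather than the configured ceiling.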

KellenO
by New Contributor II
  • 2077 Views
  • 2 replies
  • 8 kudos

Resolved! How can I use cluster autoscaling with intensive subprocess calls?

I have a custom application/executable that I upload to DBFS and transfer to my cluster's local storage for execution. I want to call multiple instances of this application in parallel, which I've only been able to successfully do with Python's subpr...

Latest Reply
Anonymous
Not applicable
  • 8 kudos

Autoscaling works for Spark jobs only. It works by monitoring the job queue, which Python code won't go into. If it's just Python code, try single node. https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling
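
For the single-node route, a minimal sketch of running several instances of an executable in parallel from plain Python might look like the following. The helper names `run_instance` and `run_parallel` are hypothetical, and the Python interpreter stands in for the custom executable, whose path and arguments are not given in the thread.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_instance(args):
    """Run one process to completion and return its trimmed stdout."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


def run_parallel(commands, max_workers=4):
    """Run each command in its own subprocess, at most max_workers at a time.

    Threads are enough here because each thread just waits on a child process.
    Results come back in the same order as the input commands.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_instance, commands))


# Stand-in workload: each "instance" squares a number and prints it.
commands = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]
results = run_parallel(commands)
print(results)
```

All of these processes run on the one machine, so none of this work enters the Spark job queue and it would not trigger autoscaling; `max_workers` would need to be sized to the cores and memory of that node.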

1 More Replies
dataslicer
by Contributor
  • 5030 Views
  • 7 replies
  • 2 kudos

Resolved! Exploring additional cost saving options for structured streaming 24x7x365 uptime workloads

I currently have multiple jobs (each running its own job cluster) for my Spark Structured Streaming pipelines, which are long-running 24x7x365 on DBR 9.x/10.x LTS. My SLAs are 24x7x365 with 1 minute latency. I have already accomplished the following co...

Latest Reply
Anonymous
Not applicable
  • 2 kudos


6 More Replies
User16826992666
by Valued Contributor
  • 2118 Views
  • 1 replies
  • 0 kudos

Resolved! How does cluster autoscaling work?

What determines when the cluster autoscaling activates to add and remove workers? Also, can it be adjusted?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

> What determines when the cluster autoscaling activates to add and remove workers

During scale-down, the service removes a worker only if it is idle and does not contain any shuffle data. This allows aggressive resizing without killing tasks or recom...
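
The scale-down rule quoted above can be pictured with a toy model. This is only an illustration of the stated condition (idle and holding no shuffle data), not the service's actual implementation; the `Worker` type, its fields, and the `min_workers` floor are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical view of a worker's state, for illustration only."""
    name: str
    running_tasks: int
    has_shuffle_data: bool


def removable(worker):
    """A worker qualifies only if it is idle and holds no shuffle data."""
    return worker.running_tasks == 0 and not worker.has_shuffle_data


def pick_scale_down(workers, min_workers):
    """Names of workers that could be removed without going below min_workers."""
    max_removals = max(len(workers) - min_workers, 0)
    candidates = [w.name for w in workers if removable(w)]
    return candidates[:max_removals]


workers = [
    Worker("w1", running_tasks=2, has_shuffle_data=True),   # busy: kept
    Worker("w2", running_tasks=0, has_shuffle_data=True),   # idle but has shuffle data: kept
    Worker("w3", running_tasks=0, has_shuffle_data=False),  # idle and clean: removable
    Worker("w4", running_tasks=0, has_shuffle_data=False),  # idle and clean: removable
]
print(pick_scale_down(workers, min_workers=2))
```

In this model `w2` stays even though it is idle, because dropping it would discard shuffle output that later stages may still need.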
