Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How can I use cluster autoscaling with intensive subprocess calls?

KellenO
New Contributor II

I have a custom application/executable that I upload to DBFS and transfer to my cluster's local storage for execution. I want to call multiple instances of this application in parallel, which I've only been able to successfully do with Python's subprocess.Popen(). However, doing it this way doesn't take advantage of autoscaling.

As a quick code example of what I'm trying to do:

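import subprocess

# Hundreds of custom configurations here
ListOfCustomArguments = ["/path/to/config1.txt", "/path/to/config2.txt"]

# Start one process per configuration; they all run concurrently on the driver node
processes = []
for arg in ListOfCustomArguments:
    # An argument list (instead of shell=True) avoids shell quoting problems
    processes.append(subprocess.Popen(["/path/to/executable", arg]))

# Block until every process has exited
for p in processes:
    p.wait()

print("Done!")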

As is, this will not auto-scale. Any ideas?

1 ACCEPTED SOLUTION


Anonymous
Not applicable

Autoscaling works for Spark jobs only. It works by monitoring the Spark job queue, which plain Python code never enters, so subprocesses launched from the driver will not trigger a scale-up. If it's just Python code, try a single-node cluster.

https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling
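To make the two options in this answer concrete, here is a minimal sketch, not a tested Databricks recipe. `run_one`, `run_on_single_node`, and `run_as_spark_job` are hypothetical helper names, and `/path/to/executable` is the placeholder from the question. The Spark variant assumes the executable and the config files are readable at the same path on every worker (for example through an init script or a shared mount), and that the pending tasks it creates are what autoscaling reacts to, as described above.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Placeholder path from the question; assumed to exist on every node that runs it.
EXECUTABLE = "/path/to/executable"


def run_one(config_path, executable=EXECUTABLE):
    """Run the executable on one config file and return (config_path, exit_code)."""
    result = subprocess.run([executable, config_path], capture_output=True)
    return config_path, result.returncode


def run_on_single_node(config_paths, max_workers=None):
    """Single-node option: cap concurrency at the CPU count instead of
    starting hundreds of processes at once."""
    max_workers = max_workers or os.cpu_count() or 1
    # Threads are enough here: each one just blocks while its subprocess runs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, config_paths))


def run_as_spark_job(spark, config_paths):
    """Spark option: one partition per config, so each subprocess call
    becomes a Spark task that runs on a worker rather than on the driver."""
    sc = spark.sparkContext
    rdd = sc.parallelize(config_paths, numSlices=max(1, len(config_paths)))
    return rdd.map(run_one).collect()
```

In a Databricks notebook, where `spark` is already defined, this could be called as `results = run_as_spark_job(spark, ListOfCustomArguments)`, and failed runs found with `[path for path, code in results if code != 0]`. Whether the cluster actually scales up will still depend on the cluster's autoscaling settings and on how long the tasks stay queued.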


2 REPLIES

Nice response @Joseph Kambourakis
