Data Engineering

Specifying a cluster when running a job

Tjadi
New Contributor III

Hi,

Let's say that I am starting jobs with different parameters at a certain time each day in the following manner:

import requests

# DOMAIN, TOKEN, job_id, and country_id are defined elsewhere
response = requests.post(
    "https://%s/api/2.0/jobs/run-now" % DOMAIN,
    headers={"Authorization": "Bearer %s" % TOKEN},
    json={
        "job_id": job_id,
        "notebook_params": {
            "country_name": str(country_id),
        },
    },
)

I was wondering how to specify a particular cluster size for a single run of a workflow. And how do you specify that the cluster should be shared among the tasks in the workflow? This would be useful when one country_id needs a bigger cluster than all the other countries, and in similar use cases.
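For illustration, here is a minimal sketch (not part of the original post) of one way to do this: jobs/run-now only lets you override parameters, not the cluster, but the Jobs 2.1 runs/submit endpoint launches a one-time run with whatever cluster spec you pass, so the cluster can differ per submission. The notebook path, Spark version, and node type below are placeholder assumptions.

import requests

# Sketch: submit a one-time run with an explicit cluster via Jobs 2.1
# "runs/submit". DOMAIN, TOKEN, and country_id as in the snippet above.
response = requests.post(
    "https://%s/api/2.1/jobs/runs/submit" % DOMAIN,
    headers={"Authorization": "Bearer %s" % TOKEN},
    json={
        "run_name": "adhoc-run-%s" % country_id,
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {
                    "notebook_path": "/Repos/etl/main",  # placeholder path
                    "base_parameters": {"country_name": str(country_id)},
                },
                # The cluster is chosen at submission time, so a heavy
                # country_id can get a bigger spec than the others.
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",  # placeholder version
                    "node_type_id": "m5d.4xlarge",
                    "num_workers": 8,
                },
            }
        ],
    },
)

As for sharing a cluster among tasks: in a Jobs 2.1 job definition (jobs/create or jobs/reset), you can declare a job_clusters list and have multiple tasks reference the same job_cluster_key; those tasks then run on one shared job cluster.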

Thanks in advance.

2 REPLIES

karthik_p
Esteemed Contributor

@Tjadi Peeters You can select the Autoscaling/Enhanced Autoscaling option in Workflows, which will scale the cluster based on workload.
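For reference, a small sketch (not from the original reply) of what autoscaling looks like in a cluster spec; the Spark version and node type are placeholders:

# Sketch: an autoscaling cluster spec lets Databricks vary the worker count
# between the bounds, but every worker uses the same node_type_id.
autoscaling_cluster = {
    "spark_version": "13.3.x-scala2.12",  # placeholder version
    "node_type_id": "m5d.2xlarge",
    "autoscale": {"min_workers": 2, "max_workers": 8},
}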

Tjadi
New Contributor III

Thanks for your reply. The autoscaling functionality I am aware of only scales the number of workers - or is there another one? I am looking to start jobs with different types of workers (e.g. one job starts with an m5d.2xlarge while another uses an m5d.4xlarge).
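One possible pattern for this, sketched here as an illustration (the mapping, default node type, and sizes are assumptions, not from the thread): pick the worker type per country, then plug it into a one-time run submission like the runs/submit example earlier in the thread. Autoscaling cannot do this, since it only changes the worker count, never the instance type.

# Sketch: choose the worker type per country_id, then use it as the
# "new_cluster" in a Jobs 2.1 runs/submit payload (see example above).
NODE_TYPE_BY_COUNTRY = {
    "BR": "m5d.4xlarge",  # hypothetical heavy workload -> bigger workers
}
node_type = NODE_TYPE_BY_COUNTRY.get(str(country_id), "m5d.2xlarge")

new_cluster = {
    "spark_version": "13.3.x-scala2.12",  # placeholder version
    "node_type_id": node_type,  # differs per run, unlike autoscaling
    "num_workers": 4,
}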
