Can we parameterize the compute in job cluster

NarenderKumar — Mon, 01 Jul 2024 10:32:05 GMT

I have created a workflow job in Databricks with job parameters.

I want to run the same job with different workloads and data volumes.

So I want the job's compute cluster to be parameterized, so that I can pass the compute requirements (driver size, executor size, and number of nodes) dynamically when I run the job.

Is this possible in Databricks?

Re: Can we parameterize the compute in job cluster

raphaelblg — Mon, 01 Jul 2024 17:26:45 GMT

Hi @NarenderKumar, if you want to change the compute of an existing job, you would have to update the job settings before triggering a new run. Feel free to open a feature request with your idea through the Databricks Ideas Portal.
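
For illustration, here is a rough sketch of the "update the job settings, then trigger a run" approach in Python. It assumes the Jobs API 2.1 endpoints `/api/2.1/jobs/update` and `/api/2.1/jobs/run-now` and a job that defines a shared job cluster under a `job_cluster_key`. The function names, the host and token handling, and all example values are hypothetical, so please check the payload shape against the API reference for your workspace before relying on it.

```python
import json
import urllib.request


def build_update_payload(job_id, job_cluster_key, spark_version,
                         node_type_id, driver_node_type_id, num_workers):
    """Build a partial-update body that replaces one job cluster's spec.

    Assumption: the job defines a shared job cluster identified by
    job_cluster_key, and the update endpoint accepts a new_settings object.
    """
    return {
        "job_id": job_id,
        "new_settings": {
            "job_clusters": [
                {
                    "job_cluster_key": job_cluster_key,
                    "new_cluster": {
                        "spark_version": spark_version,
                        "node_type_id": node_type_id,                # executor size
                        "driver_node_type_id": driver_node_type_id,  # driver size
                        "num_workers": num_workers,                  # number of nodes
                    },
                }
            ]
        },
    }


def post(host, token, path, payload):
    """POST a JSON payload to the workspace and return the decoded response."""
    request = urllib.request.Request(
        url=f"{host}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = response.read().decode("utf-8")
        # Some endpoints answer with an empty body; treat that as {}.
        return json.loads(body) if body else {}


def resize_and_run(host, token, job_id, job_cluster_key, **compute):
    """Update the job cluster spec first, then trigger a new run of the job."""
    payload = build_update_payload(job_id, job_cluster_key, **compute)
    post(host, token, "/api/2.1/jobs/update", payload)
    return post(host, token, "/api/2.1/jobs/run-now", {"job_id": job_id})
```

Keep in mind that this changes the job definition itself, so the new compute applies to every later run until the settings are updated again. That matters if runs with different sizes can overlap.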

Re: Can we parameterize the compute in job cluster

brockb — Mon, 01 Jul 2024 17:30:22 GMT

Hi @NarenderKumar,

Have you considered leveraging autoscaling for the existing cluster?

If this does not meet your needs, are the different data volumes and workloads known in advance? If so, could separate compute be provisioned for each using Infrastructure as Code, based on the characteristics of each workload? Here's a doc on using Terraform with Databricks: https://docs.databricks.com/en/dev-tools/terraform/index.html
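
To make the idea concrete, here is a small Python sketch of workload-specific compute with autoscaling: each known workload size maps to a cluster spec with an `autoscale` range instead of a fixed worker count. The profile names, node types, worker ranges, and runtime version are made-up placeholders, and `cluster_spec_for` is a hypothetical helper. The same mapping could just as well live in Terraform variables.

```python
# Hypothetical profiles for workloads whose size is known in advance.
# Node types and worker ranges are placeholders, not recommendations.
COMPUTE_PROFILES = {
    "small": {"node_type_id": "i3.xlarge", "driver_node_type_id": "i3.xlarge",
              "min_workers": 1, "max_workers": 2},
    "medium": {"node_type_id": "i3.2xlarge", "driver_node_type_id": "i3.2xlarge",
               "min_workers": 2, "max_workers": 8},
    "large": {"node_type_id": "i3.4xlarge", "driver_node_type_id": "i3.4xlarge",
              "min_workers": 4, "max_workers": 16},
}


def cluster_spec_for(workload, spark_version="14.3.x-scala2.12"):
    """Return a cluster spec with an autoscale range for a named workload."""
    if workload not in COMPUTE_PROFILES:
        raise ValueError(f"unknown workload profile: {workload!r}")
    profile = COMPUTE_PROFILES[workload]
    return {
        "spark_version": spark_version,
        "node_type_id": profile["node_type_id"],
        "driver_node_type_id": profile["driver_node_type_id"],
        # An autoscale range is used instead of a fixed num_workers, so the
        # cluster can grow and shrink between the two bounds.
        "autoscale": {
            "min_workers": profile["min_workers"],
            "max_workers": profile["max_workers"],
        },
    }
```

For example, `cluster_spec_for("medium")` gives a spec that scales between 2 and 8 workers, which could back one job definition per workload size.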

Thank you.