05-02-2022 02:08 AM
I have a notebook that functions as a pipeline, chaining multiple notebooks together.
The issue I'm facing is that some of the notebooks are Spark-optimized and others aren't, so I want to use one cluster for the former and another for the latter. However, this would mean changing clusters halfway through the pipeline notebook. Is that possible? And if so, how?
05-02-2022 02:13 PM
Yes, you can achieve this by setting up two different job clusters. In the screenshot, you can see I have used two job clusters, PipelineTest and pipelinetest2. You can refer to the doc: https://docs.databricks.com/jobs.html#cluster-config-tips
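For anyone who prefers to define this outside the UI, here is a rough sketch of what a two-cluster job definition could look like as a Jobs API 2.1 style payload. It only builds and prints the JSON; it does not call the API. The cluster keys match the ones mentioned above, but the notebook paths, Spark version, node type, and worker counts are placeholders I made up for illustration, so swap in your own values.

```python
import json

# Two job clusters, keyed by the names used in the answer above.
# spark_version, node_type_id and num_workers are assumed placeholder values.
JOB_CLUSTERS = [
    {
        "job_cluster_key": "PipelineTest",
        "new_cluster": {
            "spark_version": "10.4.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 4,
        },
    },
    {
        "job_cluster_key": "pipelinetest2",
        "new_cluster": {
            "spark_version": "10.4.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 1,
        },
    },
]


def build_job_spec(name, steps):
    """Chain notebooks into sequential tasks, each pinned to a job cluster.

    steps is a list of (notebook_path, job_cluster_key) pairs in run order.
    """
    tasks = []
    for index, (notebook_path, cluster_key) in enumerate(steps, start=1):
        task = {
            "task_key": f"step_{index}",
            "job_cluster_key": cluster_key,
            "notebook_task": {"notebook_path": notebook_path},
        }
        # Every task after the first waits for the one before it.
        if tasks:
            task["depends_on"] = [{"task_key": tasks[-1]["task_key"]}]
        tasks.append(task)
    return {"name": name, "job_clusters": JOB_CLUSTERS, "tasks": tasks}


# Hypothetical notebook paths: Spark-heavy steps go to PipelineTest,
# the non-Spark step goes to pipelinetest2.
spec = build_job_spec(
    "notebook-pipeline",
    [
        ("/Pipelines/ingest_spark", "PipelineTest"),
        ("/Pipelines/score_pandas", "pipelinetest2"),
        ("/Pipelines/aggregate_spark", "PipelineTest"),
    ],
)

print(json.dumps(spec, indent=2))
```

The point of the layout is that the cluster choice lives on each task rather than on the pipeline as a whole, so the "switch" between clusters happens at task boundaries instead of inside a single driver notebook.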
05-02-2022 02:11 AM
In such a case, orchestrating those jobs using Azure Data Factory is highly recommended.
05-12-2022 05:15 AM
Hi @Niels Ota, just a friendly follow-up. Do you still need help, or did the responses from @Hubert Dudek and @Prabakar Ammeappin help you find the solution? Please let us know.
06-14-2022 08:40 AM
Hi @Niels Ota, we haven’t heard from you since the last response from @Prabakar Ammeappin, and I was checking back to see if you have a resolution yet. If you have a solution, please share it with the community, as it can be helpful to others. Otherwise, we will respond with more details and try to help.
07-26-2022 01:28 AM
Hi Kaniz, sorry for the incredibly late reply. My notifications for responses ended up in my spam folder!
I ended up using ADF, but I also tried @Prabakar Ammeappin's solution and that worked too!