Automate Cluster creation
06-13-2023 03:41 AM
I am new to Databricks, and my lead told me that we currently create the clusters for running our notebooks manually. I need a Python script that automates this, i.e. creates the clusters automatically.
Can anyone help me write this script using PySpark in Databricks? I have to use Azure cloud services for this.
06-13-2023 05:59 AM
I see two easy ways:
- Use Databricks Workflows. This creates job clusters for you.
- Use Data Factory with the Databricks Notebook activity. This does the same.
Basically, you tell Databricks to create short-lived clusters that stay alive only while a Spark program is executing. When the program finishes, the cluster is terminated.
There are also cluster pools, which keep instances warm.
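To make the job-cluster idea concrete, here is a minimal sketch of what a Workflows job definition could look like when built from Python and sent to the Databricks Jobs REST API (`POST /api/2.1/jobs/create`). The `new_cluster` block is what makes Databricks spin up a short-lived cluster for the run. The workspace URL, token, job name, notebook path, runtime version, and node type below are all placeholders I made up for illustration, so swap in values from your own Azure workspace. The script only builds the request; the line that would actually send it is left commented out.

```python
import json
import urllib.request


def build_job_cluster_payload(job_name, notebook_path, spark_version,
                              node_type_id, num_workers):
    """Build the JSON body for a job whose task runs on a new job cluster."""
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "run_notebook",
                "notebook_task": {"notebook_path": notebook_path},
                # new_cluster asks Databricks to create a cluster for this
                # run and terminate it when the run is finished.
                "new_cluster": {
                    "spark_version": spark_version,
                    "node_type_id": node_type_id,
                    "num_workers": num_workers,
                },
            }
        ],
    }


def build_create_job_request(workspace_url, token, payload):
    """Wrap the payload in an authenticated POST request (not sent here)."""
    return urllib.request.Request(
        url=workspace_url.rstrip("/") + "/api/2.1/jobs/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are placeholders (assumptions), not real resources.
payload = build_job_cluster_payload(
    job_name="example-notebook-job",
    notebook_path="/Shared/example_notebook",
    spark_version="13.3.x-scala2.12",
    node_type_id="Standard_DS3_v2",
    num_workers=2,
)
request = build_create_job_request(
    workspace_url="https://adb-0000000000000000.0.azuredatabricks.net",
    token="<personal-access-token>",
    payload=payload,
)

# Uncomment to actually create the job in your workspace:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the payload builder separate from the HTTP call makes it easy to inspect or version the job definition before anything is created.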
06-14-2023 01:29 PM
Actually, we already have job clusters created. What we need now is to automate the existing, manually created clusters, which multiple users use to run multiple notebooks. Please suggest an approach.
06-15-2023 12:14 AM
So you created interactive clusters that you want to use to run notebooks on a schedule?
If so: you can schedule notebooks on an existing cluster (the Schedule button), or in Data Factory use an existing cluster to run a notebook. So my answer stays pretty much the same.
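If you would rather script the schedule than click the button, a rough sketch of the equivalent job definition for the Jobs REST API (`POST /api/2.1/jobs/create`) is below. Instead of a `new_cluster` block, the task points at the interactive cluster through `existing_cluster_id`, and the `schedule` block carries a Quartz cron expression. The cluster ID, notebook path, job name, and cron expression are made-up placeholders, and the snippet only builds and prints the payload; it assumes you would send it with your own workspace URL and token.

```python
import json


def build_scheduled_job_payload(job_name, notebook_path, existing_cluster_id,
                                quartz_cron, timezone_id="UTC"):
    """Build the JSON body for a scheduled job on an existing cluster."""
    return {
        "name": job_name,
        "schedule": {
            # Quartz fields: second minute hour day-of-month month day-of-week
            "quartz_cron_expression": quartz_cron,
            "timezone_id": timezone_id,
            "pause_status": "UNPAUSED",
        },
        "tasks": [
            {
                "task_key": "run_notebook",
                # Reuse the manually created interactive cluster.
                "existing_cluster_id": existing_cluster_id,
                "notebook_task": {"notebook_path": notebook_path},
            }
        ],
    }


# Placeholder values (assumptions): replace with your own cluster and notebook.
scheduled_payload = build_scheduled_job_payload(
    job_name="nightly-notebook-run",
    notebook_path="/Shared/example_notebook",
    existing_cluster_id="0000-000000-example0",
    quartz_cron="0 0 2 * * ?",  # every day at 02:00 in the given timezone
)

print(json.dumps(scheduled_payload, indent=2))
```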
If that is not what you are looking for, please explain your use case.