Automate Cluster creation

Vidisha
New Contributor II

I am new to Databricks. My lead told me that we currently create the clusters for running notebooks manually, and asked me to write a Python script that automates this, i.e. creates the clusters automatically.

Can anyone help me write this script using PySpark in Databricks? I have to use Azure cloud services for this.

3 REPLIES

-werners-
Esteemed Contributor III

I see two easy ways:

  1. Use Databricks Workflows. This creates job clusters for you.
  2. Use Azure Data Factory with the Databricks Notebook activity. This does the same.

Basically, you tell Databricks to create short-lived clusters that stay alive only while a Spark program is running. When the program finishes, the cluster is terminated.
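To make the job-cluster idea concrete, here is a minimal sketch of a job definition sent to the Databricks Jobs REST API (`/api/2.1/jobs/create`). The `new_cluster` block is what asks Databricks to create a cluster for the run and terminate it afterwards. The job name, notebook path, runtime version, and VM size are assumptions for illustration, not values from this thread, and `build_job_spec` and `create_job` are hypothetical helper names.

```python
import json
import urllib.request


def build_job_spec(notebook_path: str) -> dict:
    """Build a job definition whose single task runs on a fresh job cluster."""
    return {
        "name": "nightly-notebook-run",  # hypothetical job name
        "tasks": [
            {
                "task_key": "run_notebook",
                "notebook_task": {"notebook_path": notebook_path},
                # new_cluster = create a cluster for this run, terminate it after
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",  # assumed runtime version
                    "node_type_id": "Standard_DS3_v2",  # assumed Azure VM size
                    "num_workers": 2,
                },
            }
        ],
    }


def create_job(host: str, token: str, spec: dict) -> dict:
    """POST the job definition to the workspace; returns the parsed JSON reply."""
    request = urllib.request.Request(
        f"{host}/api/2.1/jobs/create",
        data=json.dumps(spec).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Only builds and prints the payload; nothing is sent until create_job is called.
spec = build_job_spec("/Shared/example_notebook")  # hypothetical notebook path
print(json.dumps(spec, indent=2))
```

To actually submit it, you would call `create_job` with your workspace URL and a personal access token. Both are left out here on purpose.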

There are also cluster pools, which keep instances warm.
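As a rough illustration of pools, the sketch below builds the request body for the Instance Pools API (`/api/2.0/instance-pools/create`) and a cluster spec that draws its nodes from that pool through `instance_pool_id`. The pool name, VM size, runtime version, and idle settings are assumed values, the helper names are hypothetical, and the pool ID shown is a placeholder for whatever ID the create call returns.

```python
import json


def build_pool_spec(name: str, node_type_id: str, min_idle: int = 1) -> dict:
    """Request body for creating an instance pool that keeps VMs warm."""
    return {
        "instance_pool_name": name,
        "node_type_id": node_type_id,
        "min_idle_instances": min_idle,  # instances kept ready at all times
        "idle_instance_autotermination_minutes": 30,  # assumed idle timeout
    }


def build_pooled_cluster_spec(pool_id: str) -> dict:
    """Cluster spec that takes its nodes from a pool instead of a VM size."""
    return {
        "spark_version": "13.3.x-scala2.12",  # assumed runtime version
        "instance_pool_id": pool_id,  # replaces node_type_id
        "num_workers": 2,
    }


pool_spec = build_pool_spec("warm-pool", "Standard_DS3_v2")  # assumed values
cluster_spec = build_pooled_cluster_spec("<your-pool-id>")  # placeholder ID
print(json.dumps(pool_spec, indent=2))
print(json.dumps(cluster_spec, indent=2))
```

The cluster spec can be used as the `new_cluster` block of a job, so job clusters start faster because the instances are already running.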

Vidisha
New Contributor II

Actually, we already have job clusters created. What we need now is to automate the existing, manually created clusters that multiple users use to run multiple notebooks. Please suggest an approach.

-werners-
Esteemed Contributor III

So you created interactive clusters that you now want to use to run notebooks on a schedule?

If so, you can schedule notebooks on an existing cluster (the Schedule button), or use an existing cluster to run a notebook from Data Factory. So my answer stays pretty much the same.
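If you prefer to script the scheduling instead of using the button, the same thing can be expressed as a Jobs API job definition that points at an existing cluster. This is only a sketch: the cluster ID is a placeholder, and the job name, notebook path, cron expression, and time zone are assumptions. The body would be sent to `/api/2.1/jobs/create` with a bearer token, which is not shown here.

```python
import json


def build_scheduled_job(notebook_path: str, cluster_id: str, cron: str) -> dict:
    """Job definition that runs a notebook on an existing cluster on a schedule."""
    return {
        "name": "scheduled-notebook-run",  # hypothetical job name
        "tasks": [
            {
                "task_key": "run_notebook",
                "notebook_task": {"notebook_path": notebook_path},
                # Reuse an already created interactive cluster
                "existing_cluster_id": cluster_id,
            }
        ],
        "schedule": {
            "quartz_cron_expression": cron,  # Quartz syntax, seconds field first
            "timezone_id": "UTC",  # assumed time zone
            "pause_status": "UNPAUSED",
        },
    }


job = build_scheduled_job(
    "/Shared/example_notebook",  # hypothetical notebook path
    "<your-cluster-id>",  # placeholder, copy the real ID from the cluster page
    "0 0 6 * * ?",  # every day at 06:00
)
print(json.dumps(job, indent=2))
```

Keep in mind that a job on an interactive cluster shares that cluster with whoever else is using it at the time.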

If that is not what you are looking for, please explain your use case.
