Data Engineering

Any recommendations to reduce the cluster spin up time?

Anonymous
Not applicable
 
Accepted Solutions

sajith_appukutt
Honored Contributor II

There are a few approaches to consider here.

There is an interesting session from the 2021 Data & AI Summit on Nephos, which implements the Lakehouse without the infrastructure management overhead. You could watch it here. Clusters on this platform would spin up significantly faster, since Databricks would be managing the underlying VMs.

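For compute in the customer's own environment, there is a feature called cluster pools, which maintains a set of idle, ready-to-use VM instances. Clusters attached to a pool start faster because they take instances from the pool instead of waiting for new VMs to be provisioned.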
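To make the pool idea concrete, here is a minimal sketch of the two request bodies involved: one to create a pool and one to create a cluster that draws from it. It only builds the JSON and makes no network call. The field names follow the Instance Pools and Clusters REST APIs as I understand them; the helper names, the pool and cluster names, the node type, the runtime version, and the pool ID are all made up for illustration, so check them against your workspace.

```python
import json


def build_pool_request(name, node_type_id, min_idle=2, max_capacity=10, idle_minutes=30):
    """Build a request body for creating an instance pool (hypothetical helper)."""
    if min_idle > max_capacity:
        raise ValueError("min_idle cannot exceed max_capacity")
    return {
        "instance_pool_name": name,
        "node_type_id": node_type_id,
        # Instances kept warm so clusters can start without waiting for new VMs.
        "min_idle_instances": min_idle,
        "max_capacity": max_capacity,
        "idle_instance_autotermination_minutes": idle_minutes,
    }


def build_pooled_cluster_request(cluster_name, spark_version, pool_id, num_workers):
    """Build a cluster spec that takes its nodes from a pool (hypothetical helper)."""
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        # The node type comes from the pool, so node_type_id is not set here.
        "instance_pool_id": pool_id,
        "num_workers": num_workers,
    }


# All values below are placeholders, not recommendations.
pool_body = build_pool_request("etl-warm-pool", "i3.xlarge")
cluster_body = build_pooled_cluster_request(
    "nightly-etl", "13.3.x-scala2.12", "pool-id-placeholder", num_workers=4
)
print(json.dumps(pool_body, indent=2))
print(json.dumps(cluster_body, indent=2))
```

Keeping `min_idle_instances` above zero is what buys the faster start, at the cost of paying the cloud provider for the idle VMs.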

Optionally, if the cluster spin-up time is caused by a large number of libraries being installed at cluster startup, take a look at Databricks Container Services, which lets you start the cluster from a Docker image with those libraries already installed.
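For the container route, a rough sketch of what the cluster spec might look like is below. As far as I know, the Clusters API accepts a `docker_image` object with a `url` field; the helper name, image URL, cluster name, runtime version, and node type here are placeholders I invented, and a private registry would also need credentials that are not shown.

```python
import json


def build_container_cluster_request(cluster_name, spark_version, node_type_id,
                                    num_workers, image_url):
    """Build a cluster spec that boots from a custom Docker image (hypothetical helper)."""
    if not image_url:
        raise ValueError("image_url is required")
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # Libraries are baked into the image, so they are not installed at startup.
        "docker_image": {"url": image_url},
    }


# Placeholder values for illustration only.
container_body = build_container_cluster_request(
    "etl-with-baked-libs",
    "13.3.x-scala2.12",
    "i3.xlarge",
    2,
    "myregistry.example.com/team/etl-runtime:1.0",
)
print(json.dumps(container_body, indent=2))
```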

2 REPLIES

Srikanth_Gupta_
Databricks Employee

The Databricks pool concept may help reduce cluster startup time. More details are here.