Databricks Community

Anonymous · ‎06-17-2021

sajith_appukutt · ‎06-17-2021

There are a few approaches to consider here.

There is an interesting session from the 2021 Data & AI Summit on Nephos, which implements the Lakehouse without the infrastructure management overhead. You could watch it here. Clusters on this platform would spin up significantly faster, since Databricks would be managing the underlying VMs.

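For compute in the customer's environment, there is a feature called cluster pools, which maintain a set of idle, ready-to-use VM instances. Clusters attached to a pool take their nodes from those idle instances instead of waiting for the cloud provider to provision new ones.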
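As a rough sketch of what that looks like, here are request bodies for the Instance Pools API (`POST /api/2.0/instance-pools/create`) and for a cluster that draws from the pool. The helper names, the pool name, the node type, the runtime version and the pool ID are all placeholders I made up for illustration; nothing here calls the workspace, it only builds the JSON.

```python
import json


def build_pool_request(name, node_type_id, min_idle_instances, max_capacity,
                       idle_autotermination_minutes=30):
    """Build a body for POST /api/2.0/instance-pools/create (hypothetical helper)."""
    if min_idle_instances > max_capacity:
        raise ValueError("min_idle_instances cannot exceed max_capacity")
    return {
        "instance_pool_name": name,
        "node_type_id": node_type_id,
        # Instances kept warm even when no cluster is using them.
        "min_idle_instances": min_idle_instances,
        "max_capacity": max_capacity,
        # Idle instances above the minimum are released after this many minutes.
        "idle_instance_autotermination_minutes": idle_autotermination_minutes,
    }


def build_pooled_cluster_spec(cluster_name, spark_version, instance_pool_id,
                              num_workers):
    """Build a cluster spec that takes its nodes from a pool (hypothetical helper)."""
    # No node_type_id here: the node type is inherited from the pool.
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "instance_pool_id": instance_pool_id,
        "num_workers": num_workers,
    }


if __name__ == "__main__":
    # Placeholder values; substitute ones valid for your cloud and workspace.
    pool = build_pool_request("warm-pool", "i3.xlarge", 2, 10)
    cluster = build_pooled_cluster_spec("etl", "9.1.x-scala2.12", "pool-123", 4)
    print(json.dumps(pool, indent=2))
    print(json.dumps(cluster, indent=2))
```

Keep in mind that idle instances in a pool still incur cloud provider charges, so `min_idle_instances` is a trade-off between startup time and cost.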

Optionally, if the cluster spin-up time is caused by a large number of libraries being installed during cluster startup, take a look at Databricks Container Services, which lets you launch clusters from a Docker image that already has those libraries baked in.
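For the container route, the cluster spec carries a `docker_image` block pointing at your image. A minimal sketch, assuming Container Services is enabled on the workspace; the helper name, registry URL and credentials below are hypothetical placeholders:

```python
import copy


def with_docker_image(cluster_spec, image_url, username=None, password=None):
    """Return a copy of cluster_spec with a docker_image block (hypothetical helper)."""
    spec = copy.deepcopy(cluster_spec)  # leave the caller's dict untouched
    docker_image = {"url": image_url}
    # basic_auth is only needed for private registries.
    if username is not None and password is not None:
        docker_image["basic_auth"] = {"username": username, "password": password}
    spec["docker_image"] = docker_image
    return spec


if __name__ == "__main__":
    # Placeholder spec and image; the image would already contain your libraries.
    base = {"cluster_name": "etl", "spark_version": "9.1.x-scala2.12", "num_workers": 4}
    print(with_docker_image(base, "my-registry.example.com/team/spark-libs:1.0"))
```

The libraries then ship inside the image rather than being installed on every cluster start.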
