I have been hearing about efforts to improve cluster start time at conferences for two years now, so it is something of a never-ending story. Pools and serverless pools should bring further improvements. The recommended instance types also usually start faster, as Databricks is working with vendors on that. Additionally, I have heard that GCC is now the fastest at starting VMs/clusters. For me, the big improvement in deployment time would be pools with preinstalled libraries, instead of having to set libraries at the cluster level.
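To illustrate the point about libraries, here is a minimal sketch of the two request bodies involved today, assuming the Clusters and Libraries REST APIs: the cluster picks up warm VMs from a pool via `instance_pool_id`, but the libraries still have to be attached per cluster afterwards. The helper names, the pool and cluster IDs, the runtime version, and the package are all placeholders I made up for illustration, and the code only builds the JSON; it does not call any endpoint.

```python
import json


def cluster_from_pool_payload(pool_id, spark_version, num_workers=2):
    """Build a cluster spec that draws its nodes from an instance pool.

    Hypothetical helper: with instance_pool_id set, the node type comes
    from the pool, so no node_type_id is given here.
    """
    return {
        "cluster_name": "pool-backed-cluster",  # placeholder name
        "spark_version": spark_version,
        "instance_pool_id": pool_id,
        "num_workers": num_workers,
    }


def library_install_payload(cluster_id, pypi_packages):
    """Build a library install request for one cluster.

    Hypothetical helper: libraries are attached per cluster,
    not per pool, which is the extra step discussed above.
    """
    return {
        "cluster_id": cluster_id,
        "libraries": [{"pypi": {"package": pkg}} for pkg in pypi_packages],
    }


if __name__ == "__main__":
    # All IDs and versions below are placeholders, not real resources.
    cluster = cluster_from_pool_payload("my-pool-id", "13.3.x-scala2.12")
    libs = library_install_payload("my-cluster-id", ["pandas"])
    print(json.dumps(cluster, indent=2))
    print(json.dumps(libs, indent=2))
```

The pool shortens VM acquisition, but the library installation still runs on every new cluster, which is why preinstalled libraries at the pool level would help.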


My blog: https://databrickster.medium.com/
