
Clusters are really slow

valjas
New Contributor III

We have two Azure Databricks environments, Dev and Prod. We created and tested clusters in Dev, then exported them to Prod through the APIs. The clusters in Dev perform as expected, whereas the clusters in Prod take a long time for simple select queries. I verified that the clusters have the same configuration in both Dev and Prod. What could be causing the slow performance in Prod?

3 REPLIES

Kaniz_Fatma
Community Manager

Hi @valjas, Prod likely deals with a larger volume of data. Check the size and distribution of the data in Prod. It might expose bottlenecks in storage or processing that weren't apparent in Dev.
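
As a rough illustration of checking table size, the sketch below reads the file count and total size that `DESCRIBE DETAIL` reports for a Delta table. It assumes the table is a Delta table and that `spark` is the session Databricks notebooks provide; `table_size_summary` and the table name in the usage comment are hypothetical names chosen for this example.

```python
def table_size_summary(spark, table_name: str) -> dict:
    """Return file count and size for a Delta table (assumes DESCRIBE DETAIL is supported)."""
    # DESCRIBE DETAIL returns a single row of table metadata for Delta tables.
    row = spark.sql(f"DESCRIBE DETAIL {table_name}").collect()[0]
    num_files = row["numFiles"]
    size_bytes = row["sizeInBytes"]
    # Guard against empty tables so the average does not divide by zero.
    avg_file_bytes = size_bytes / num_files if num_files else 0.0
    return {
        "table": table_name,
        "num_files": num_files,
        "size_bytes": size_bytes,
        "avg_file_bytes": avg_file_bytes,
    }

# Usage in a notebook (table name is illustrative only):
#   table_size_summary(spark, "main.sales.orders")
```

Running this in both workspaces against the same table gives a quick comparison. Many very small files, or very different totals between environments, would be worth investigating.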

valjas
New Contributor III

Both Prod and Dev are connected to Unity Catalog, and I am working with the same table in both environments. Can something done during the creation of the workspace itself affect the performance of clusters? Do clusters update to the latest Databricks Runtime version automatically? I ask because when I created the clusters I used version 13.2, but the runtime version is now 14.0. I switched it back to 13.2 and performance seems a bit better.

Kaniz_Fatma
Community Manager

Hi @valjas,

Workspace Creation and Cluster Performance:

  • Actions taken during the creation of a workspace can indeed impact cluster performance. When setting up a workspace, consider the following factors:
    • Configuration Settings: Ensure that the workspace configuration aligns with your requirements. Properly configure the cluster size, instance types, and other relevant settings.
    • Library Dependencies: If your workspace relies on specific libraries or packages, ensure they are correctly installed. Incorrect or missing dependencies can affect cluster performance.
    • Workspace Resources: The workspace itself consumes resources. If it’s resource-intensive, it might compete with clusters for available resources.
    • Security and Access Controls: Properly set up security policies and access controls to prevent unauthorized access or resource overutilization.
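
One way to double-check that the Dev and Prod cluster settings really match is to diff the two cluster definitions field by field. The sketch below is a minimal, hypothetical helper (`diff_cluster_configs` is not a Databricks API). It assumes each cluster's JSON definition has already been fetched, for example from the Clusters API, and loaded as a Python dict. The sample values are made up for illustration.

```python
from typing import Any, Dict, Tuple

def _flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys so nested settings compare cleanly."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(_flatten(value, path))
        else:
            flat[path] = value
    return flat

def diff_cluster_configs(dev: Dict[str, Any], prod: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {setting: (dev_value, prod_value)} for every setting that differs."""
    dev_flat, prod_flat = _flatten(dev), _flatten(prod)
    diffs: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(dev_flat) | set(prod_flat)):
        dev_value = dev_flat.get(key, "<missing>")
        prod_value = prod_flat.get(key, "<missing>")
        if dev_value != prod_value:
            diffs[key] = (dev_value, prod_value)
    return diffs

# Illustrative cluster definitions (values are assumptions, not real output).
dev_cluster = {
    "spark_version": "13.2.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    "spark_conf": {"spark.sql.shuffle.partitions": "200"},
}
prod_cluster = {
    "spark_version": "14.0.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    "spark_conf": {},
}

for setting, (dev_value, prod_value) in diff_cluster_configs(dev_cluster, prod_cluster).items():
    print(f"{setting}: dev={dev_value!r} prod={prod_value!r}")
```

A field-by-field diff like this can surface differences that are easy to miss when comparing cluster pages by eye, such as a changed runtime version or a dropped Spark setting.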

Databricks Runtime Version Updates:

  • Databricks Runtime versions receive maintenance updates periodically. These updates include bug fixes, performance enhancements, and security patches.
  • Automatic Updates: By default, clusters do not automatically update to the latest Databricks Runtime version. However, you can enable automatic cluster updates in your workspace settings.
  • Manual Updates: To update an existing cluster, edit its configuration to select a newer Databricks Runtime version, then restart it.
  • Version Compatibility: Ensure that your code and libraries are compatible with the chosen Databricks Runtime version. Sometimes, performance improvements in newer versions can positively impact your workload.
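
To confirm whether two clusters are pinned to the same runtime, you can compare the `spark_version` field of their definitions. The sketch below assumes version strings of the form `13.2.x-scala2.12`. `parse_runtime_version` and `same_runtime` are hypothetical helpers written for this example, and the sample strings are illustrative.

```python
import re
from typing import Tuple

def parse_runtime_version(spark_version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as '13.2.x-scala2.12'."""
    match = re.match(r"^(\d+)\.(\d+)", spark_version)
    if match is None:
        raise ValueError(f"Unrecognized runtime version string: {spark_version!r}")
    return int(match.group(1)), int(match.group(2))

def same_runtime(version_a: str, version_b: str) -> bool:
    """True when both strings refer to the same major.minor runtime."""
    return parse_runtime_version(version_a) == parse_runtime_version(version_b)

# Illustrative values matching the versions mentioned in this thread.
dev_version = "13.2.x-scala2.12"
prod_version = "14.0.x-scala2.12"

if not same_runtime(dev_version, prod_version):
    print(f"Runtime mismatch: dev={dev_version} prod={prod_version}")
```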

Your Experience with Version 13.2 vs. 14.0:

  • It’s interesting that you noticed better performance after switching back to version 13.2. This could be due to various factors:
    • Stability: Version 13.2 might be more stable for your specific use case.
    • Optimization: Some workloads perform better on specific runtime versions due to optimizations or changes.
    • Regression: Occasionally, newer versions introduce regressions that affect performance.
  • Experimentation: If performance is critical, consider experimenting with different runtime versions to find the optimal one for your workload.
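
A simple way to experiment is to time the same query several times on each runtime and compare the medians. The sketch below is a minimal timing harness, not a Databricks feature. `time_runs` and `summarize` are hypothetical helpers, and `run_query` is any zero-argument callable you supply.

```python
import statistics
import time
from typing import Callable, Dict, List

def time_runs(run_query: Callable[[], object], repeats: int = 5, warmup: int = 1) -> List[float]:
    """Run the callable `warmup` times untimed, then `repeats` times timed (seconds)."""
    for _ in range(warmup):
        run_query()  # warm-up runs are discarded so one-time costs skew results less
    durations: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations

def summarize(durations: List[float]) -> Dict[str, float]:
    """Summarize timings; the median is less sensitive to outliers than the mean."""
    return {
        "median_s": statistics.median(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }

# Stand-in workload so the sketch runs anywhere; replace with a real query.
timings = time_runs(lambda: sum(range(10_000)), repeats=3)
print(summarize(timings))
```

In a notebook, `run_query` could be something like `lambda: spark.sql("SELECT ...").collect()`, assuming the notebook-provided `spark` session; calling `.collect()` forces the query to actually execute. Run the same harness on a 13.2 cluster and a 14.0 cluster with otherwise identical settings to make the comparison fair.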

Remember that performance can be influenced by various factors beyond just the runtime version, such as query complexity, data volume, and cluster configuration. Regular monitoring and tuning are essential for maintaining optimal performance. 🚀
