Fastest Azure VM for Databricks Big Data workload

alvaro_databric
New Contributor III

Hi All,

It is well known that Azure provides a wide variety of VMs for Databricks, some of which support powerful features such as Photon and Delta Caching. I would like to ask the community which cluster you think is the fastest for Big Data operations (on the order of TB) among all the available options. In my opinion the Lasv3 family seems to be the best, but I would like additional opinions.

Thanks in advance.

1 ACCEPTED SOLUTION

Anonymous
Not applicable

@Alvaro Moure:

The performance of a Databricks cluster for big data operations depends on many factors, such as the amount and structure of the data, the nature of the operations being performed, the configuration of the cluster, and the specific resources available in each VM family.

That said, the Lasv3 family is among the higher-performing options for big data operations in Azure Databricks. These are storage-optimized VMs with local NVMe disks, which helps disk caching and shuffle-heavy workloads, alongside a solid amount of memory and CPU per node. They can also cost more than general-purpose options. Different use cases have different requirements, and a smaller or less powerful cluster may be sufficient for certain tasks.

Ultimately, the best way to determine the fastest cluster for your specific use case is to benchmark and compare performance across different VM families and configurations.
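As a rough illustration of that kind of benchmark, here is a minimal Python sketch you could run in a notebook attached to each candidate cluster. The helper names (`time_workload`, `cost_per_run`) and the hourly rate are made up for this example and are not part of any Databricks API. It only measures wall-clock time for a callable you supply, so treat it as a starting point rather than a rigorous methodology.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_workload(
    workload: Callable[[], object], runs: int = 3, warmup: int = 1
) -> Dict[str, float]:
    """Run `workload` several times and return wall-clock stats in seconds."""
    for _ in range(warmup):
        workload()  # untimed runs so caches and JIT are warm before measuring
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


def cost_per_run(median_s: float, hourly_rate: float) -> float:
    """Approximate cost of one run.

    `hourly_rate` is the total hourly price of the whole cluster
    (VMs plus DBUs), which you look up yourself for your region.
    """
    return median_s / 3600.0 * hourly_rate


if __name__ == "__main__":
    # Stand-in workload so the sketch runs anywhere. On a cluster you would
    # pass your real job instead, for example (hypothetical path and column):
    #   lambda: (spark.read.format("delta").load("/path/to/table")
    #            .groupBy("key").count()
    #            .write.format("noop").mode("overwrite").save())
    stats = time_workload(lambda: sum(range(1_000_000)), runs=3, warmup=1)
    print(stats)
    # 10.0 is a placeholder rate, not a real price.
    print("approx. cost per run:", cost_per_run(stats["median_s"], hourly_rate=10.0))
```

Run the same workload on the same data for each VM family and compare both the median time and the cost per run, since the fastest cluster is not always the cheapest way to finish the job.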
