Optimal Azure VM type for EventHub streaming

Anonymous
Not applicable

Hello,

Our Spark jobs stream messages from Event Hub, transform them, and finally persist them in storage. We plan to try out different cluster configurations for these jobs in order to find the optimal one and procure Azure reservations. Furthermore, it is important to optimize the operation of these jobs in order to make balanced use of the DBUs.


1/ Do we need Multi-Node or Single Node configuration?

2/ What VM type is recommended for such workloads? Do we need a memory-optimized, compute-optimized, or balanced VM?

3/ If Multi-Node is recommended, what type of VM should we use for the Driver and what type for the Workers?

4/ Does it make sense to enable AutoScaling?

5/ Does it make sense to enable Photon?

6/ Shall we run the job continuously or trigger it frequently?

Thanks

0 REPLIES
