How to optimize job performance

RajeshRK
Contributor

Hi Team,

We have a complex ETL job running in Databricks for 6 hours. The cluster has the following configuration:

Min workers: 16

Max workers: 24

Worker and driver node type: Standard_DS14_v2 (16 cores, 128 GB RAM)

I monitored the job's progress in the Spark UI for an hour, and my observations are below:

- The jobs keep progressing and are not stuck for long periods.

- The worker nodes scaled up to 24 (the configured max workers).

- Shuffle reads/writes involve a large amount of data. (I ran this job with spark.sql.shuffle.partitions set to 4000.)

We expect the job to complete within 4 hours. Any suggestions to optimize the performance of the job?
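
For reference, the shuffle setting above is applied at the start of the run, roughly as in this minimal sketch (a Databricks Scala notebook cell, where `spark` is the notebook's SparkSession):

```scala
// Current run: a fixed shuffle partition count, set before any of the
// ETL's shuffles execute.
spark.conf.set("spark.sql.shuffle.partitions", "4000")
```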

Regards,

Rajesh.

7 REPLIES

Lakshay
Esteemed Contributor

Hi @Rajesh Kannan R​, can you check the Spark UI to see where the job is spending most of its time? Also, look for any failed Spark jobs in the Spark UI.

Hi Lakshay,

Thank you for replying. One thing I noticed in the job descriptions in the Spark UI: each job with the description below takes an average of 15 minutes.

"save at StoreTransform.scala"

Not sure whether this is custom code or Databricks code.

Regards,

Rajesh.

Lakshay
Esteemed Contributor

Hi @Rajesh Kannan R​, it looks like custom code. Could you please share a task-level screenshot of one of these stages?
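
For context, a job described as "save at StoreTransform.scala" in the Spark UI normally corresponds to a DataFrame write triggered from a source file of yours named StoreTransform.scala, which is why it points to custom code rather than Databricks internals. A minimal sketch of that kind of write (the tables and transformation below are hypothetical):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

// Hedged sketch only -- the table names and aggregation are made up. The point
// is that the write action below, if triggered from StoreTransform.scala,
// appears in the Spark UI as a job described "save at StoreTransform.scala:<line>".
val spark = SparkSession.builder().getOrCreate()

val storeTotals = spark.table("silver.transactions")   // hypothetical input table
  .groupBy("store_id")
  .agg(sum("amount").as("total_amount"))

storeTotals.write
  .mode("overwrite")
  .format("delta")
  .saveAsTable("gold.store_totals")                     // this is the "save at ..." job
```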

Hi Lakshay,

Unfortunately, I haven't captured that. I will share it the next time I run the job.

Regards,

Rajesh.

Lakshay
Esteemed Contributor

Sure. You can also try the suggestions below:

  1. Use a compute-optimized node type. Currently, you are using a memory-optimized one.
  2. Run the job with spark.sql.shuffle.partitions set to auto (see the sketch after this list).
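
A minimal sketch of suggestion 2, assuming the configuration is set from a notebook cell before the ETL runs (the "auto" value is Databricks-specific and relies on Adaptive Query Execution):

```scala
// Let Databricks size shuffle partitions automatically instead of the fixed
// 4000 used so far. AQE is enabled by default on recent Databricks Runtime
// versions, but setting it explicitly does no harm.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")
spark.conf.set("spark.sql.shuffle.partitions", "auto")
```

The same key-value pairs can also go in the cluster's Spark config so that every job on the cluster picks them up.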

@Lakshay Goel​ 

Hi Lakshay,

It will take a couple of days to test these recommendations. I will rerun the job with them and update this thread.

Regards,

Rajesh.

Anonymous
Not applicable

Hi @Rajesh Kannan R​ 

Thank you for your question! To assist you better, please take a moment to review the answers and let us know if one of them fits your needs.

Please help us select the best solution by clicking on "Select As Best" if it does.

Your feedback will help us ensure that we are providing the best possible service to you.

Thank you!
