Data Engineering

How to optimize the runtime on a 10.4 cluster

databicky
Contributor II

I am loading 1 billion records from a Spark DataFrame into a target table. On the 7.3 cluster the load takes 3 hours to complete, but after migrating to the 10.4 cluster it takes 8 hours. How can I reduce the run time?

4 REPLIES

Debayan
Esteemed Contributor III

Hi, please refer to https://docs.databricks.com/clusters/cluster-config-best-practices.html for best practices on cluster configuration. Please let us know if this helps.

jose_gonzalez
Moderator

Hi @Mohammed sadamusean,

Could you provide more details on what you are doing? What types of transformations and actions are you running? What are your source and sink? Is this batch or streaming? All of that information will help.

I have data in ADLS. I load this data into multiple DataFrames in a Databricks notebook, and from the final DataFrame I load the data into the final target table through a temp view on that DataFrame. It usually takes 3 hours on the 7.3 cluster, but on the 10.4 cluster it takes around 8 hours. There are 1 billion records.

Could you check your Spark UI to identify which stage is taking the longest, and share some of that information here?
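If clicking through the Spark UI is awkward, one option is to export the stage list and rank it in a few lines of Python. The snippet below is only a rough sketch: the field names (`stageId`, `name`, `status`, `executorRunTime`) are modeled on the stage objects returned by Spark's monitoring REST API, `executorRunTime` is assumed to be in milliseconds, and the sample records and the `slowest_stages` helper are made up for illustration, not taken from this job.

```python
from typing import Dict, List


def slowest_stages(stages: List[Dict], top_n: int = 3) -> List[Dict]:
    """Return the top_n completed stages with the largest executorRunTime."""
    # Ignore stages that did not complete so failed or skipped attempts
    # do not dominate the ranking.
    completed = [s for s in stages if s.get("status") == "COMPLETE"]
    # Sort by total executor run time, largest first.
    ranked = sorted(
        completed,
        key=lambda s: s.get("executorRunTime", 0),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical stage records; real ones would come from the Spark UI
# or an export of the application's stage list.
sample_stages = [
    {"stageId": 0, "name": "load from ADLS", "status": "COMPLETE", "executorRunTime": 1_200_000},
    {"stageId": 1, "name": "join dataframes", "status": "COMPLETE", "executorRunTime": 9_500_000},
    {"stageId": 2, "name": "write to target table", "status": "COMPLETE", "executorRunTime": 21_000_000},
    {"stageId": 3, "name": "failed attempt", "status": "FAILED", "executorRunTime": 50_000_000},
]

for stage in slowest_stages(sample_stages, top_n=2):
    print(stage["stageId"], stage["name"], stage["executorRunTime"])
```

Running the same ranking against the 7.3 run and the 10.4 run should show whether the extra time sits in the read, the transformations, or the final write.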
