Data Engineering

Forum Posts

by suresh1122 (New Contributor III)
  • 7974 Views
  • 11 replies
  • 7 kudos

Dataframe takes unusually long (around 2 hours) to save as a Delta table using SQL for a very small dataset of 30k rows. Is there a solution for this problem?

I am trying to save a dataframe to a Delta table after a series of data manipulations using UDF functions. I tried this code:

df.write \
    .format('delta') \
    .mode('overwrite') \
    .option('overwriteSchema', 'true') \
    .saveAsTable('output_table')

but this...

Latest Reply
Lakshay
Esteemed Contributor
  • 7 kudos

You should also check the SQL plan to confirm that the writing phase is indeed the part that is taking time. Since Spark uses lazy evaluation, some other phase might be the actual bottleneck.

  • 7 kudos
10 More Replies
by Anonymous (Not applicable)
  • 5292 Views
  • 9 replies
  • 6 kudos

Resolved! Dataframe takes unusually long time to write for small datasets

We have configured the workspace with our own VPC. We need to extract data from DB2 and write it in Delta format. We tried it for 550k records with 230 columns, and it took 50 minutes to complete the task; 15mn records take more than 18 hours. Not sure why this takes suc...

Latest Reply
elgeo
Valued Contributor II
  • 6 kudos

Hello. We face exactly the same issue: reading is quick but writing takes a long time. To clarify, it is a table with only 700k rows. Any suggestions please? Thank you

remote_table = spark.read.format("jdbc") \
    .option("driver", "com...

  • 6 kudos
8 More Replies