Data Engineering
Saving a Spark DataFrame to a Delta table is taking too long

Neil
New Contributor

While working on a video analytics task, I need to save image bytes, previously extracted into a Spark DataFrame, to a Delta table. I want to overwrite the same Delta table over the course of the task, and the size of the input data varies. The write is taking too much time, even after several trials with compaction. I can't use streaming Delta tables, since I simply want to store the extracted image bytes in the Delta table and then complete the inference task for object detection and other transformations. I have even tried dropping the lengthy data columns, but it did not make any difference.

My current cluster configuration: 1 driver, 16 GB memory, 4 cores, g4dn.xlarge, runtime 11.3.x-gpu-ml-scala2.12.
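
For context, a minimal sketch of the kind of write being described, assuming the frames are read with Spark's built-in binaryFile source; the paths and column selection are hypothetical placeholders, not the poster's actual code:

```python
# Minimal sketch of the described write pattern, not the poster's actual code.
# "/mnt/video/frames/" and "/mnt/video/images_delta" are hypothetical paths.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Spark's binaryFile reader yields columns: path, modificationTime, length, content
frames_df = spark.read.format("binaryFile").load("/mnt/video/frames/")

(frames_df
    .select("path", "content")           # keep only what inference needs; drop other columns
    .write.format("delta")
    .mode("overwrite")                   # the same table is overwritten over the task's lifetime
    .option("overwriteSchema", "true")   # input size/shape can differ between runs
    .save("/mnt/video/images_delta"))
```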

1 REPLY

-werners-
Esteemed Contributor III

Can you check the Spark UI to see where the time is spent?

It could be a join, a UDF, ...
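
As an illustration of that advice, a hedged sketch reusing the hypothetical frames_df DataFrame from above: explain() exposes any join, UDF, or shuffle in the plan, and coalescing before the write is one way to cut small-file overhead for large binary columns.

```python
# 1) Inspect the physical plan: a join, UDF, or wide shuffle before the write
#    would show up here and in the Spark UI's SQL tab.
frames_df.explain(mode="formatted")

# 2) If the plan is trivial, the write itself may be producing many small files;
#    coalescing before the write is one thing to try
#    (8 is an illustrative partition count, tune it to the data volume).
(frames_df
    .coalesce(8)
    .write.format("delta")
    .mode("overwrite")
    .save("/mnt/video/images_delta"))
```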
