Data Engineering
Overwrite still saves numerous parquet files in storage container

Paully
New Contributor

I inherited this environment, and my question is this: we have a job that mines the data lake and creates a table that is grouped by unit number and its data points. The job runs every 10 minutes. We then connect to that table from Power BI in DirectQuery mode and raise alarms in a model we have built in the app space. We are trying to optimize it. The job uses an overwrite, but there are hundreds of Parquet files in the container from the individual job runs, totaling over 100 GB. Why? Wouldn't overwrite just recreate the same table, or do we need to do a 'DROP TABLE IF EXISTS' in the script?
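
One possible explanation, offered as an assumption because the post does not say which format the table uses: if this is a Delta table, an overwrite writes a fresh set of Parquet files and only marks the previous files as removed in the transaction log. The old files stay in the container so earlier versions remain queryable, and they are physically deleted only when VACUUM is run after the retention period (7 days, or 168 hours, by default). At one run every 10 minutes, that is 6 x 24 x 7 = 1008 retained versions per week. The sketch below builds the two statements that would confirm and address this. The helper names and the table name `alarms.unit_readings` are hypothetical; only `DESCRIBE HISTORY` and `VACUUM ... RETAIN ... HOURS [DRY RUN]` are Delta Lake SQL.

```python
# Minimal sketch, ASSUMING the target is a Delta table.
# build_maintenance_sql and run_maintenance are hypothetical helper names.

def build_maintenance_sql(table_name, retain_hours=168, dry_run=True):
    """Return SQL to inspect a Delta table's history and clean up stale files."""
    # DESCRIBE HISTORY lists one row per table version (one per overwrite).
    history = f"DESCRIBE HISTORY {table_name}"
    # VACUUM deletes files no longer referenced by versions inside the
    # retention window; DRY RUN only lists what would be deleted.
    vacuum = f"VACUUM {table_name} RETAIN {retain_hours} HOURS"
    if dry_run:
        vacuum += " DRY RUN"
    return [history, vacuum]


def run_maintenance(spark, table_name, retain_hours=168, dry_run=True):
    """Run each statement through an existing SparkSession."""
    statements = build_maintenance_sql(table_name, retain_hours, dry_run)
    return [spark.sql(stmt) for stmt in statements]
```

In a notebook this might be called as `run_maintenance(spark, "alarms.unit_readings")`, first with the default `dry_run=True` to review the file list, then with `dry_run=False`. If the table turns out to be plain Parquet rather than Delta, none of this applies and the leftover files would need a different explanation.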
