Data Engineering

Forum Posts

Aviral-Bhardwaj
by Esteemed Contributor III
  • 513 Views
  • 2 replies
  • 19 kudos

Deltalake Vs Datalake in Databricks

Delta Lake. Databricks Delta Lake is an open-source storage layer that sits on top of existing data lake storage, such as Azure Data Lake Store or Amazon S3. It provides a more robust and scalable alternative to tra...

Latest Reply
Kaniz
Community Manager
  • 19 kudos

Awesome!

1 More Replies
gbrueckl
by Contributor II
  • 3493 Views
  • 10 replies
  • 9 kudos

Slow performance of VACUUM on Azure Data Lake Store Gen2

We need to run VACUUM on one of our biggest tables to free up storage. According to our analysis using VACUUM bigtable DRY RUN, this affects 30M+ files that need to be deleted. If we run the final VACUUM, the file listing takes up to 2h (which is OK) ...

Latest Reply
Deepak_Bhutada
Contributor III
  • 9 kudos

@Gerhard Brueckl we have seen around 80k-120k file deletions per hour in Azure while running VACUUM on Delta tables; VACUUM is simply slower on Azure and S3. As you said, it may take some time to delete the files from the Delta path. ...
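
For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It assumes the 30M files from the question and the 80k-120k deletions per hour quoted in this reply, and that the rate stays constant; the function name is purely illustrative and not part of any Databricks API.

```python
def estimate_vacuum_hours(files_to_delete: int, deletions_per_hour: int) -> float:
    """Estimate how long the delete phase takes at a constant deletion rate."""
    if deletions_per_hour <= 0:
        raise ValueError("deletions_per_hour must be positive")
    return files_to_delete / deletions_per_hour


# Assumption: 30M files (the question says 30M+), so these are lower bounds.
files = 30_000_000

# Assumption: the observed range of 80k-120k deletions per hour holds throughout.
for rate in (80_000, 120_000):
    hours = estimate_vacuum_hours(files, rate)
    print(f"{rate:,} files/h -> {hours:.0f} h (~{hours / 24:.1f} days)")
```

At those rates the delete phase alone would take roughly 250 to 375 hours, which is far longer than the 2h file listing.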

9 More Replies