04-30-2025 12:28 AM
Hey @alonisser, on Azure and GCP, VACUUM performs the file deletion in parallel on the driver when you use Databricks Runtime 10.4 LTS or above. The more driver cores you have, the more the operation can be parallelised. On AWS, however, deletes happen in batches and the process is single-threaded: VACUUM uses a bulk delete API that removes files in batches of 1000, but it doesn't use parallel threads. As a result, a multi-core driver may not help on AWS.
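To get a rough feel for what "batches of 1000, single-threaded" means in practice, here is a small plain-Python sketch. It is only an illustration of the batching arithmetic, not how Databricks implements VACUUM: the only number taken from above is the batch size of 1000, while the helper names (`batches`, `sequential_request_count`) and the file names are made up for the example.

```python
from math import ceil

BATCH_SIZE = 1000  # batch size mentioned above for bulk deletes on AWS


def batches(paths, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` paths (hypothetical helper)."""
    for start in range(0, len(paths), size):
        yield paths[start:start + size]


def sequential_request_count(num_files, size=BATCH_SIZE):
    """Bulk-delete requests a single thread has to issue one after another."""
    return ceil(num_files / size)


# Hypothetical list of stale files that a VACUUM run would remove.
paths = [f"part-{i:05d}.parquet" for i in range(2500)]
chunks = list(batches(paths))

print(len(chunks), [len(c) for c in chunks])  # 3 [1000, 1000, 500]
print(sequential_request_count(2_500_000))    # 2500
```

So for a table with 2.5 million stale files, a single thread would go through about 2,500 bulk requests back to back, and adding driver cores would not shorten that sequence if the deletes are not spread across threads.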
For best practices on VACUUM, please refer to https://kb.databricks.com/en_US/delta/vacuum-best-practices-on-delta-lake