10-12-2022 12:19 PM
I want to know the best process for removing files from ADLS after OPTIMIZE and a VACUUM dry run have completed.
10-12-2022 12:49 PM
Credit to one of the community members, from whom I took the file-existence check code.
10-12-2022 12:53 PM
I'd like the community's feedback on the code below. It works for a single specified table and can be parameterized and run.
But is this the best way to manage (delete) unwanted files of Delta tables that are stored externally in ADLS? Please let me know.
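def file_exists_delete(path):
    """Delete the file at `path` if it exists; return True if it was removed."""
    try:
        # dbutils.fs.ls raises if the path does not exist
        dbutils.fs.ls(path)
        dbutils.fs.rm(path)
        print('removed the file ' + path)
        return True
    except Exception as e:
        if 'java.io.FileNotFoundException' in str(e):
            return False
        else:
            # Any other error is unexpected, so re-raise it
            raise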
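# Copy into a separate cell
spark.sql("OPTIMIZE tbl_name")
# Note: a retention below the default 168 hours may be rejected unless
# spark.databricks.delta.retentionDurationCheck.enabled is set to false
df = spark.sql("VACUUM tbl_name RETAIN 0 HOURS DRY RUN")
# Copy into a separate cell
df_collect = df.collect()
# Copy into a separate cell and execute
for row in df_collect:
    # row[0] is the file path reported by the dry run
    file_exists_delete(row[0])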
10-13-2022 01:38 AM
Do not remove files from Delta Lake tables manually. That is why VACUUM exists.
Manual deletion can lead to a corrupt table.
Why not just run VACUUM without the dry run?
10-13-2022 03:33 AM
VACUUM (without the DRY RUN option) will actually remove the unused files, depending on the retention interval.
Check this topic
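To make the retention-interval point concrete, here is a minimal sketch of how the VACUUM statement changes with and without DRY RUN. The helper name build_vacuum_sql is hypothetical (it is not part of Delta Lake or Databricks); it only assembles the SQL text, using tbl_name from the question as the example table.

```python
from typing import Optional


def build_vacuum_sql(table_name: str,
                     retain_hours: Optional[float] = None,
                     dry_run: bool = False) -> str:
    """Assemble a Delta Lake VACUUM statement (hypothetical helper)."""
    sql = f"VACUUM {table_name}"
    if retain_hours is not None:
        if retain_hours < 0:
            raise ValueError("retain_hours must not be negative")
        # Without a RETAIN clause, VACUUM uses the table's default retention
        sql += f" RETAIN {retain_hours:g} HOURS"
    if dry_run:
        # DRY RUN only lists the files that would be deleted
        sql += " DRY RUN"
    return sql


print(build_vacuum_sql("tbl_name", dry_run=True))      # VACUUM tbl_name DRY RUN
print(build_vacuum_sql("tbl_name", retain_hours=168))  # VACUUM tbl_name RETAIN 168 HOURS
```

In a notebook, the resulting string would be passed to spark.sql(...), as in the original code. Only the statement without DRY RUN actually deletes files.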
10-16-2022 11:56 AM
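If you have external Delta files, you can use the Python API to clean them up by path:
from delta.tables import DeltaTable
# pathToTable is the storage path of the Delta table
deltaTable = DeltaTable.forPath(spark, pathToTable)
# With no argument, vacuum() uses the table's default retention period
deltaTable.vacuum()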
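If you want to pass an explicit retention period, a small guard can help avoid vacuuming below the default by accident. This is only a sketch: vacuum_with_retention and DEFAULT_RETENTION_HOURS are hypothetical names, and the function assumes it is given an object with a vacuum(retentionHours) method, such as the deltaTable above.

```python
# Delta Lake's default retention threshold is 7 days
DEFAULT_RETENTION_HOURS = 168.0


def vacuum_with_retention(delta_table, retention_hours=DEFAULT_RETENTION_HOURS):
    """Call vacuum() with an explicit retention, refusing values below the default.

    Hypothetical helper: `delta_table` is assumed to expose
    vacuum(retentionHours), as delta.tables.DeltaTable does.
    """
    if retention_hours < DEFAULT_RETENTION_HOURS:
        raise ValueError(
            f"retention_hours={retention_hours} is below the default "
            f"of {DEFAULT_RETENTION_HOURS} hours"
        )
    return delta_table.vacuum(retention_hours)
```

Usage would look like vacuum_with_retention(DeltaTable.forPath(spark, pathToTable)), which keeps file removal inside VACUUM instead of deleting paths by hand.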
11-19-2022 10:41 PM
Hi @Ravikanth Narayanabhatla
Hope all is well!
Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!