Remove empty folders with PySpark

Nathant93
New Contributor III

Hi,

I am trying to search a mount point for any empty folders and remove them. Does anyone know of a way to do this? I have tried dbutils.fs.walk, but this does not seem to work.

Thanks

Unfortunately, this reports that every folder in my mount point has a size of 0. I have folders that contain subfolders, which in turn might contain a metadata file for a streaming checkpoint.

MathieuDB
Databricks Employee

Hello @Nathant93,

You could use dbutils.fs.ls and recursively iterate over all the directories it finds to accomplish this task.

Something like this:

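def find_empty_dirs(path):
    """Recursively remove empty directories below `path` (not `path` itself)."""
    directories = dbutils.fs.ls(path)
    for directory in directories:
        if directory.isDir():
            # Clean up the subtree first, so that a directory holding only
            # empty directories is itself empty by the time it is checked.
            find_empty_dirs(directory.path)
            # List again: the recursive call may have removed children.
            contents = dbutils.fs.ls(directory.path)
            if len(contents) == 0:
                dbutils.fs.rm(directory.path, recurse=True)
                print(f"Removed empty directory: {directory.path}")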

find_empty_dirs("dbfs:/mnt/your_mount_point")
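Since this deletes data, it may be worth listing the candidates before removing anything. Below is a minimal sketch of the same recursive idea with a dry-run switch. The name prune_empty_dirs and the fs and dry_run parameters are hypothetical, made up for illustration. The sketch only assumes that fs.ls(path) returns entries with a .path attribute and an .isDir() method, as dbutils.fs.ls does, and that fs.rm accepts recurse=True. It decides emptiness from the listing itself rather than from the reported size, which is 0 for every directory.

```python
def prune_empty_dirs(fs, path, dry_run=True):
    """Return the directories below `path` that contain no files at any depth.

    `fs` is any object exposing ls(path) and rm(path, recurse=...),
    for example dbutils.fs in a Databricks notebook (assumed interface).
    With dry_run=True nothing is deleted; the candidates are only reported.
    The root `path` itself is never removed.
    """
    removed = []

    def visit(current):
        # A directory counts as empty when it holds no files and every
        # subdirectory it holds is itself empty.
        is_empty = True
        for entry in fs.ls(current):
            if entry.isDir() and visit(entry.path):
                removed.append(entry.path)
                if not dry_run:
                    fs.rm(entry.path, recurse=True)
            else:
                # Either a file, or a subdirectory that still has content.
                is_empty = False
        return is_empty

    visit(path)
    return removed
```

In a notebook, prune_empty_dirs(dbutils.fs, "dbfs:/mnt/your_mount_point") would only return the list of candidates. Passing dry_run=False performs the deletion. A folder that contains only a checkpoint metadata file is treated as non-empty and is left alone, along with all of its parents.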