@Retired_mod Thank you for taking the time to address this issue.
While running DESCRIBE HISTORY, we have observed that some Parquet files listed in the '_delta_log' JSON files are not physically present on S3. We need to identify which files are actually present on S3 and confirm that they match the entries in the '_delta_log'.
Our current goal is to clean up unnecessary files from S3 for both Delta and non-Delta tables. We want to remove files that are no longer relevant or used by any process, as they occupy significant space on S3. We have not applied any retention policy over the past 2-3 years.
Could you provide guidance on how to identify which files exist on S3 but are missing from the '_delta_log', and vice versa? Additionally, any advice on safely deleting these redundant files would be greatly appreciated.
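To make the question concrete, here is a rough sketch of the comparison we have in mind. It is only an illustration, not something we have run against our tables: the function names and sample file names are hypothetical, it assumes every JSON commit file is still present in '_delta_log' (it ignores Parquet checkpoints), and it assumes the S3 listing has already been fetched as keys relative to the table root.

```python
import json
from urllib.parse import unquote


def active_files_from_commits(commits):
    """Replay commit files in version order and return the data file paths
    referenced by the latest table version.

    `commits` is a list of commit files, each given as a list of JSON lines.
    """
    active = set()
    for lines in commits:
        for line in lines:
            line = line.strip()
            if not line:
                continue
            action = json.loads(line)
            # Paths in the log are URL-encoded and relative to the table root.
            if "add" in action:
                active.add(unquote(action["add"]["path"]))
            elif "remove" in action:
                active.discard(unquote(action["remove"]["path"]))
            # Other actions (commitInfo, metaData, protocol) are ignored.
    return active


def compare(active, listed_keys):
    """Return (in log but missing on S3, on S3 but not in the latest version)."""
    listed_data = {k for k in listed_keys if not k.startswith("_delta_log/")}
    missing_on_s3 = active - listed_data
    # Caution: unreferenced files may still be needed for time travel or by
    # in-flight writes, so this set is not automatically safe to delete.
    unreferenced_on_s3 = listed_data - active
    return missing_on_s3, unreferenced_on_s3


# Hypothetical sample data standing in for real commit files and an S3 listing.
commits = [
    ['{"add": {"path": "part-0001.parquet"}}',
     '{"add": {"path": "part-0002.parquet"}}'],
    ['{"remove": {"path": "part-0001.parquet"}}',
     '{"add": {"path": "date=2024-01-01/part-0003.parquet"}}'],
]
listed_keys = [
    "_delta_log/00000000000000000000.json",
    "part-0001.parquet",
    "part-0002.parquet",
    "part-0009.parquet",
]

missing, unreferenced = compare(active_files_from_commits(commits), listed_keys)
print("In log, missing on S3:", sorted(missing))
print("On S3, not in latest version:", sorted(unreferenced))
```

Is a comparison along these lines reasonable, or is there a supported way to get the same two lists?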