Ideally, 7 days is recommended, but discuss with your data stakeholders to identify what's suitable: 7, 14, or 28 days.
Before using VACUUM, first run some analytics on the behaviour of your data.
Identify the percentage of operations that perform updates and deletes versus inserts. This tells you how many files are no longer referenced and whether a cleanup is needed.
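For example, you could inspect the Delta transaction log with DESCRIBE HISTORY to see the mix of operations. A minimal sketch, assuming a table named my_database.my_table (the name is a placeholder):

%python
# inspect the Delta transaction log (my_database.my_table is a placeholder)
history = spark.sql("DESCRIBE HISTORY my_database.my_table")

# DELETE/UPDATE/MERGE rewrite files and leave the old ones unreferenced;
# plain appends (WRITE) generally don't
history.groupBy("operation").count().show()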
To understand the efficiency of VACUUM, run queries before and after the VACUUM operation. Does the execution time improve? You could also try running VACUUM at different intervals: 7, 14, or 28 days.
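One simple way to compare is to time a representative query around the VACUUM call. A rough sketch, where the table and query are placeholders you'd swap for your own workload:

%python
import time

def time_query(sql_text):
    # run a representative query and return its wall-clock duration in seconds
    start = time.time()
    spark.sql(sql_text).collect()
    return time.time() - start

query = "SELECT count(*) FROM my_database.my_table"  # placeholder query and table
before = time_query(query)
spark.sql("VACUUM my_database.my_table")
after = time_query(query)
print(f"before: {before:.2f}s, after: {after:.2f}s")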
And definitely use VACUUM if there's a requirement to permanently wipe data from the system for particular scenarios.
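For that kind of hard delete, you can vacuum with a zero retention window after deleting the rows. A hedged sketch (table name and predicate are placeholders; disabling the retention check can break concurrent readers and time travel, so use it with care):

%python
# allow a retention window below the default safety threshold
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")

# delete the rows, then remove the underlying files immediately
spark.sql("DELETE FROM my_database.my_table WHERE user_id = 42")  # placeholder predicate
spark.sql("VACUUM my_database.my_table RETAIN 0 HOURS")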
There's no ready-made script as such, but for vacuuming every table in a database you can try this:
%python
# get the list of all tables in the database (<database name> is a placeholder)
table_list = spark.sql("SHOW TABLES FROM <database name>").select("tableName").rdd.map(lambda x: x[0]).collect()

# vacuum each table with the default retention threshold
for table in table_list:
    spark.sql(f"VACUUM <database name>.{table}")