How to clear all cache without restarting the cluster?

shan_chandra
Databricks Employee
 

shan_chandra
Databricks Employee
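%scala
// Clears Spark and Delta caches on the cluster. Pass a table path to also
// invalidate the cache for that specific Delta table.
def clearAllCaching(tableName: Option[String] = None): Unit = {
  // foreach rather than map: the call is made only for its side effect
  tableName.foreach { path =>
    com.databricks.sql.transaction.tahoe.DeltaValidation.invalidateCache(spark, path)
  }
  // Disable Databricks caching features for the current session
  spark.conf.set("com.databricks.sql.io.caching.bucketedRead.enabled", "false")
  spark.conf.set("spark.databricks.delta.smallTable.cache.enabled", "false")
  spark.conf.set("spark.databricks.delta.stats.localCache.maxNumFiles", "1")
  spark.conf.set("spark.databricks.delta.fastQueryPath.dataskipping.checkpointCache.enabled", "false")
  spark.conf.set("spark.databricks.io.cache.enabled", "false")
  // Drop the cached Delta log state
  com.databricks.sql.transaction.tahoe.DeltaLog.clearCache()
  // Remove all cached tables and DataFrames from the Spark in-memory cache
  spark.sql("CLEAR CACHE")
  sqlContext.clearCache()
}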

The custom method above clears all caches on the cluster without a restart. Note that it also turns off several caching settings for the current session. Invoke it as shown below.

%scala
clearAllCaching()
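Because the method takes an optional argument, it can also be called for a single Delta table. The sketch below assumes the `clearAllCaching` definition above has already been run in the same notebook; the path `/mnt/delta/events` is a hypothetical placeholder, not something from this thread.

```scala
// Hypothetical Delta table path, used only for illustration
val tablePath = "/mnt/delta/events"

// Invalidates the cache for this table first, then clears everything else
clearAllCaching(Some(tablePath))
```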

You can verify that the cache has been cleared in the Spark UI, under the Storage tab for the cluster.
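If you prefer to check from the notebook rather than the UI, one possible approach is to list the RDDs that Spark still marks as persisted. This is only a sketch, assuming a notebook cell where `spark` is predefined; it covers the Spark in-memory cache and does not inspect the Databricks disk cache.

```scala
// RDDs that are still marked as persisted in this Spark context
val persisted = spark.sparkContext.getPersistentRDDs

if (persisted.isEmpty) {
  println("No persisted RDDs remain.")
} else {
  // Print the id, name (may be null), and storage level of each leftover entry
  persisted.foreach { case (id, rdd) =>
    println(s"RDD $id (${rdd.name}) is still persisted: ${rdd.getStorageLevel.description}")
  }
}
```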


Great. You can add spark.catalog.clearCache() as well.
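As a rough illustration of what that call does, the sketch below caches a small throwaway DataFrame, clears the cache, and prints the DataFrame's storage level before and after. It assumes a notebook where `spark` is predefined; the DataFrame exists only for the demonstration.

```scala
// Throwaway DataFrame used only to demonstrate the effect
val df = spark.range(1000).cache()
df.count() // run an action so the cache is actually materialized

println(s"Before: ${df.storageLevel.description}")

// Removes all cached tables and DataFrames from the in-memory cache
spark.catalog.clearCache()

println(s"After: ${df.storageLevel.description}")
```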


My blog: https://databrickster.medium.com/