10-05-2024 10:54 PM
Hi,
I'm trying to test SQL performance. I run the following first:
spark.conf.set('spark.databricks.io.cache.enabled', False)
However, the second run of the same query is still much faster than the first. Is there a way to make the query start from a clean slate, without any cache?
Thanks
- Labels: Delta Lake, Spark
Accepted Solutions
11-04-2024 05:22 AM - edited 11-04-2024 05:23 AM
Hi @Brad ,
It is not clear which cache layer is making your query run faster, so the most straightforward way is to restart the SparkContext. Alternatively, these are the three cache-clearing approaches I can think of off the top of my head:
// Clear all persisted RDDs from memory; you can verify the effect by monitoring the Storage tab in the Spark UI
spark.sparkContext.getPersistentRDDs.values.foreach(_.unpersist())
// Disable the Databricks IO cache, as you are already doing
spark.conf.set("spark.databricks.io.cache.enabled", false)
// Clear any cached tables or views, in case that is what is helping
spark.catalog.clearCache()
11-18-2024 01:32 PM
Thanks @VZLA. How do I run
spark.sparkContext.getPersistentRDDs.values.foreach(_.unpersist())
from a Databricks notebook?

